Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition
Solution Type: Troubleshooting Sure

Solution 1009921.1: Troubleshooting the DSCP Service on Sun SPARC(R) Enterprise M3000/M4000/M5000/M8000/M9000 Servers
PreviouslyPublishedAs 213600

Applies to:
Sun SPARC Enterprise M5000 Server - Version Not Applicable and later
Sun SPARC Enterprise M4000 Server - Version Not Applicable and later
Sun SPARC Enterprise M8000 Server - Version Not Applicable and later
Sun SPARC Enterprise M3000 Server - Version Not Applicable and later
Sun SPARC Enterprise M9000-32 Server - Version Not Applicable and later
All Platforms

Purpose

This resolution path addresses problems where the OPL Mx000 DSCP service is not started, which prevents data from flowing between the XSCF and the domain. The issue can manifest itself in several different ways, listed below:

    XSCF> showdevices -d 01
    Can't get device information from DomainID 1.    <=====

    XSCF> deleteboard -c unassign 09-0
    Start unconfiguring XSB from domain.
    XSB#09-0 will be unassigned from domain immediately. Continue [y|n] :y
    DR failed. Domain (DomainID 1) cannot communicate via DSCP path.    <======

The snapshot file @scf@cli@usr@bin@showdevices_-v_-d_?.err will contain:

    SNAPSHOT MSG: Command Timeout: Exit Signal: 15

DSCP is used by FMA to transfer data between fmd on the domain and the XSCFU. The FMA logs and /var/adm/messages on the domain will log failures if DSCP is not properly configured. The fmd logs will consume large amounts of space recording ereport.fm.fmd.module errors, and no FMA messages are propagated to the XSCFU.

Troubleshooting Steps

Steps to Follow
1. Verify on the problematic Solaris domain that the required packages are installed:

- pkginfo | grep SUNWdscp indicates that the SUNWdscpr and SUNWdscpu packages are installed on the domain.
- pkginfo | grep SUNWppp indicates that the SUNWpppd, SUNWpppdr, SUNWpppdu, and SUNWpppdt packages are installed.
- pkginfo | grep SUNWsckm indicates that the SUNWsckmr and SUNWsckmu packages are installed.
- pkginfo | grep SUNWdcs indicates that the SUNWdcsr and SUNWdcsu packages are installed.
- A reboot may be required following the installation of any missing packages.

Reference: <Document: 1011376.1> - Sun SPARC[R] Enterprise Mx000 Servers (OPL) packages for minimal Solaris[TM] Installation

2. Verify on the problematic Solaris domain that ifconfig -a displays a sppp0 interface and that the interface flags show the interface as RUNNING. If sppp0 exists but is not RUNNING, then create an escalation to assist in the resolution.

    ....
    sppp0: flags=10010008d1<UP,POINTOPOINT,RUNNING,NOARP,MULTICAST,IPv4,FIXEDMTU> mtu 1500 index
            inet 192.168.224.3 --> 192.168.224.1 netmask ffffff00
            ether 0

3. Verify on the problematic Solaris domain that svcs -l dscp displays enabled true and state online. To enable the dscp service, type svcadm enable dscp.

Example:

    # svcs -l dscp
    fmri         svc:/platform/sun4u/dscp:default
    name         DSCP Service
    enabled      true
    state        online
    next_state   none
    ....

Note: The absence of the appropriate SUNWppp* packages, the absence of the sppp0 interface, or a disabled DSCP service, as described above, will be reported on the domain via ereport.fm.fmd.module in the 'fmdump -e' output. Further investigation using 'fmdump -V' will report the following signatures:

    msg = xport - dscpBind on accept socket failed for dev:///sp0 : rv = 5

and

    msg = xport - dscpAddr on client socket failed for dev:///sp0 : rv = 5

4. Verify on the XSCF that DSCP is configured.

Example: Display DSCP information.

    XSCF> showdscp

If DSCP is not configured, the showdscp output will state this:

    XSCF> showdscp
    ERROR: DSCP is not configured. Please use setdscp.
Then DSCP should be configured:

*Note, all domains should be powered down when DSCP is configured on the XSCF*

From the setdscp man page:

Example:

    XSCF> setdscp
    XSCF> rebootxscf

5. Verify on the problematic Solaris domain that svcs -l sckmd shows enabled true and state online. To enable the sckmd service, type svcadm enable sckmd from the Solaris domain.

Example:

    # svcs -l sckmd
    fmri         svc:/platform/sun4u/sckmd:default
    name         key management daemon
    enabled      true
    state        online
    next_state   none

Note: The absence of the appropriate packages for sckmd, or a disabled sckmd service, will be reported on the domain via ereport.fm.fmd.module in the 'fmdump -e' output. Further investigation using 'fmdump -V' will report the following signature:

    msg = Failed to write C_HELLO to dev:///sp0: Transport endpoint is not connected

6. Verify on the problematic Solaris domain that the ipseckey dump command returns security key information in addition to "Dump succeeded for SA type 0." If ipseckey dump does not return any keys, check that the domain OS is S10U8 or greater, or that <Sunpatch: 140589-01> or greater is installed.

Example:

    # ipseckey dump
    Base message (version 2) type DUMP, SA type AH.
    Message length 136 bytes, seq=1, pid=766.
    SA: SADB_ASSOC spi=0xff01, replay=0, state=MATURE
    SA: Authentication algorithm = hmac-md5
    SA: flags=0x80000000 < X_USED >
    SRC: Source address (proto=0/<unspecified>)
    SRC: AF_INET: port 0, 192.168.224.3 <unknown>.
    .....
    Dump succeeded for SA type 0.

Reboot (via shutdown, init 6, reboot, or some other orderly process) the problematic Solaris domain that has been unable to use the DSCP interface.

7. Verify on the problematic domain that svcs -l dcs shows enabled true and state online.
Example:

    # svcadm enable dcs
    # svcs -l dcs
    fmri         svc:/platform/sun4u/dcs:default
    name         domain configuration server
    enabled      true
    state        online
    next_state   none

8. Verify on the problematic domain that a ping from the domain to the XSCF via the DSCP interface is working (ping shows alive):

- ifconfig -a will show information about the sppp0 interface. On the Solaris domain, locate the IP address following the "-->". This is the point-to-point (ppp) interface for the XSCF from this domain, and it is the address to ping. In the example below, you would ping 192.168.224.1.
- As this is a ppp-style interface, the two IP interfaces should be on the same subnet (i.e. 192.168.224.Z) and have the same netmask, which is controlled by the XSCF setdscp command.

IP Address Example:

    ....
    sppp0: flags=10010008d1<UP,POINTOPOINT,RUNNING,NOARP,MULTICAST,IPv4,FIXEDMTU> mtu 1500 index
            inet 192.168.224.3 --> 192.168.224.1 netmask ffffff00
            ether 0

Ping Example:

    # ping 192.168.224.1
    192.168.224.1 is alive

The XSCF command showdevices -d <rebooted domain> against the problematic Solaris domain should now be successful.

9. Verify that the pppd service is not failing due to stale lock detection.

<SunBug 7001864>: logic for stale lock detection in pppd is not sufficient

    Aug 23 05:06:27 cores2-da-sparc-2-d pppd[638]: [ID 860527 daemon.notice] pppd 2.4.0b1 (Sun Microsystems, Inc.) started by root, uid 0
    Aug 23 05:06:27 cores2-da-sparc-2-d pppd[638]: [ID 702911 daemon.notice] Device /dev/dm2s0 is locked by pid ???

Removing the specific lock file containing the pid from /var/spool/locks will allow the service to be restarted. Boot single user and remove:

    /var/spool/locks/LK*

At this point, if you have validated that each troubleshooting step above is true for your environment and the issue still exists, further troubleshooting is required. For additional support, contact Oracle Support.

Internal Comments

Reference: 6821108 DR and "showdevices" don't work after XSCF reboot

The DSCP service is used by the Solaris[TM] domain and the XSCF on an Mx000 system to communicate with each other. While these processes should start automatically on boot, there may be instances where a customer has disabled the functionality inappropriately or unintentionally. When this occurs, commands such as DR and showdevices will not function correctly.

DSCP also depends on another component, sckmd, to perform header authentication of packets going back and forth between the domain and the XSCF. It is possible to start DSCP without sckmd (and vice versa), so it is important to check for the proper operation of both.

Attachments

This solution has no attachment
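The domain-side checks in steps 2, 3, and 8 compare a few fields in command output, so they can be scripted against captured text. The following is a minimal Python 3 sketch under stated assumptions, not a supported tool: the function names (parse_sppp0, same_subnet, service_ok) are hypothetical, the sample text is copied from the examples in this document, and the sketch only parses text that has already been collected with ifconfig -a and svcs -l on the domain. It does not run those commands itself.

```python
import ipaddress
import re

# Sample text copied from the examples in this document (assumed layout).
SAMPLE_IFCONFIG = """\
sppp0: flags=10010008d1<UP,POINTOPOINT,RUNNING,NOARP,MULTICAST,IPv4,FIXEDMTU> mtu 1500 index
        inet 192.168.224.3 --> 192.168.224.1 netmask ffffff00
        ether 0
"""

SAMPLE_SVCS = """\
fmri         svc:/platform/sun4u/dscp:default
name         DSCP Service
enabled      true
state        online
next_state   none
"""


def parse_sppp0(ifconfig_text):
    """Return a dict describing sppp0, or None if the interface is absent."""
    lines = ifconfig_text.splitlines()
    for i, line in enumerate(lines):
        if not line.startswith("sppp0:"):
            continue
        # Join the header line with its indented continuation lines.
        block = [line]
        for nxt in lines[i + 1:]:
            if not nxt[:1].isspace():
                break
            block.append(nxt.strip())
        text = " ".join(block)
        flags = re.search(r"<([^>]*)>", text)
        addr = re.search(r"inet (\S+) --> (\S+) netmask ([0-9a-fA-F]+)", text)
        return {
            "running": bool(flags) and "RUNNING" in flags.group(1).split(","),
            "local": addr.group(1) if addr else None,
            "peer": addr.group(2) if addr else None,
            "netmask": addr.group(3) if addr else None,
        }
    return None


def same_subnet(local_ip, peer_ip, netmask_hex):
    """Step 8: both ends of the link should share one subnet and netmask."""
    mask = int(netmask_hex, 16)
    local = int(ipaddress.IPv4Address(local_ip))
    peer = int(ipaddress.IPv4Address(peer_ip))
    return (local & mask) == (peer & mask)


def service_ok(svcs_l_text):
    """Steps 3, 5, 7: 'svcs -l' output should show enabled true, state online."""
    fields = {}
    for line in svcs_l_text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            # Keep the first occurrence of each key (some keys repeat).
            fields.setdefault(parts[0], parts[1].strip())
    return fields.get("enabled") == "true" and fields.get("state") == "online"
```

The "peer" value returned by parse_sppp0 is the address following "-->", which is the address to ping in step 8. A None result corresponds to the missing sppp0 interface case in step 2, and a result with "running" set to False corresponds to the case where an escalation is called for.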