Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition
Solution Type: Troubleshooting Sure
Solution 1009921.1 : Troubleshooting the DSCP service on Sun SPARC(R) Enterprise Mx000 (OPL) servers
Previously Published As: 213600
Applies to:
Sun SPARC Enterprise M4000 Server - Version: Not Applicable
Sun SPARC Enterprise M5000 Server - Version: Not Applicable and later [Release: N/A and later]
Sun SPARC Enterprise M8000 Server - Version: Not Applicable and later [Release: N/A and later]
Sun SPARC Enterprise M3000 Server - Version: Not Applicable and later [Release: N/A and later]
Sun SPARC Enterprise M9000-32 Server - Version: Not Applicable and later [Release: N/A and later]
All Platforms

Purpose
This resolution path addresses problems where the DSCP service on an OPL Mx000 server is not started, which prevents data from flowing between the XSCF and the domain.

Symptoms
This issue can manifest itself in several different ways:
- XSCF commands that depend on DSCP do not work correctly, for example:
    XSCF> showdevices -d 01
    XSCF> deleteboard -c unassign 09-0
- DSCP is used by FMA to transfer data between fmd on the domain and the XSCFU. If DSCP is not properly configured, failures are logged in the FMA logs and in /var/adm/messages on the domain. The fmd logs consume large amounts of space recording ereport.fm.fmd.module errors, and no FMA messages are propagated to the XSCFU.
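To gauge the FMA symptom described above, the number of fmd module ereports logged on the domain can be counted. The following is a minimal sketch, assuming the error log is listed with fmdump -e and that each event line carries its class name; the function name count_fmd_module_ereports is hypothetical and not part of Solaris.

```shell
#!/bin/sh
# Count ereport.fm.fmd.module events in `fmdump -e` output read from stdin.
# count_fmd_module_ereports is a hypothetical helper name, not a Solaris tool.
count_fmd_module_ereports() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps the helper's exit status clean.
    grep -c 'ereport\.fm\.fmd\.module' || true
}

# Usage on the affected domain:
#   fmdump -e | count_fmd_module_ereports
```

A count that keeps growing between runs suggests that fmd is still unable to reach the XSCF over DSCP.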
Last Review Date: April 6, 2011

Instructions for the Reader
A Troubleshooting Guide is provided to assist in debugging a specific issue. When possible, diagnostic tools are included in the document to assist in troubleshooting.
Troubleshooting Details

Steps to Follow
Validate that each troubleshooting step below is true for your environment. Each step provides instructions, or a link to a document, for validating the step and taking corrective action as necessary. The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution. Do not skip a step.

Note: This procedure requires that the Solaris[TM] domain that cannot be reached via the DSCP service be rebooted in order to complete the resolution steps.

1. Verify via Solaris that the required packages for the DSCP service are installed on the Solaris domain:
- pkginfo | grep SUNWdscp indicates that the SUNWdscpr and SUNWdscpu packages are installed on the domain.
- pkginfo | grep SUNWppp indicates that the SUNWpppd, SUNWpppdr, SUNWpppdu, and SUNWpppdt packages are installed.
- pkginfo | grep SUNWsckm indicates that the SUNWsckmr and SUNWsckmu packages are installed.
- pkginfo | grep SUNWdcs indicates that the SUNWdcsr and SUNWdcsu packages are installed.
- A reboot may be required following the installation of any missing packages.

Reference: <Document: 1011376.1> - Sun SPARC[R] Enterprise Mx000 Servers (OPL) packages for minimal Solaris[TM] Installation

2. Verify on the problematic Solaris domain that ifconfig -a displays an sppp0 interface and that the interface flags show the interface as RUNNING. If sppp0 exists but is not RUNNING, create an escalation to assist in the resolution.

    ....
    sppp0: flags=10010008d1<UP,POINTOPOINT,RUNNING,NOARP,MULTICAST,IPv4,FIXEDMTU> mtu 1500 index
            inet 192.168.224.3 --> 192.168.224.1 netmask ffffff00
            ether 0

3. Verify on the problematic Solaris domain that svcs -l dscp displays "enabled true" and "state online". To enable the dscp service, type svcadm enable dscp.

Example:
    # svcs -l dscp
    fmri         svc:/platform/sun4u/dscp:default
    name         DSCP Service
    enabled      true
    state        online
    next_state   none
    ....

Note: The absence of the appropriate SUNWppp* packages, the absence of the sppp0 interface, or a disabled DSCP service, as described above, is reported on the domain as ereport.fm.fmd.module in the 'fmdump -e' output. Further investigation using 'fmdump -V' shows the following signatures:

    msg = xport - dscpBind on accept socket failed for dev:///sp0 : rv = 5
and
    msg = xport - dscpAddr on client socket failed for dev:///sp0 : rv = 5
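The package check in step 1 and the interface check in step 2 can be combined into two small helpers. This is a sketch only: the function names missing_dscp_pkgs and sppp0_running are hypothetical, the package list is copied from step 1, and the code assumes that the second column of pkginfo output is the package name and that the sppp0 flags appear on the line that begins with "sppp0:".

```shell
#!/bin/sh
# Required packages, taken from step 1 of this document.
REQUIRED_PKGS="SUNWdscpr SUNWdscpu SUNWpppd SUNWpppdr SUNWpppdu SUNWpppdt SUNWsckmr SUNWsckmu SUNWdcsr SUNWdcsu"

# Print each required package that is absent from `pkginfo` output on stdin.
# missing_dscp_pkgs is a hypothetical helper name.
missing_dscp_pkgs() {
    # Assumption: column 2 of each pkginfo line is the package name.
    installed=$(awk '{ print $2 }')
    for pkg in $REQUIRED_PKGS; do
        # -x forces a whole-line match so SUNWpppd does not match SUNWpppdr.
        echo "$installed" | grep -x "$pkg" >/dev/null || echo "$pkg"
    done
}

# Succeed only if `ifconfig -a` output on stdin contains an sppp0 line
# whose flag list includes RUNNING. sppp0_running is a hypothetical name.
sppp0_running() {
    grep '^sppp0:' | grep '[<,]RUNNING[,>]' >/dev/null
}

# Usage on the domain:
#   pkginfo | missing_dscp_pkgs
#   ifconfig -a | sppp0_running || echo "sppp0 is missing or not RUNNING"
```

An empty result from missing_dscp_pkgs means that every package listed in step 1 was found.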
4. Verify on the XSCF that DSCP is configured.

Example: Display DSCP information.
    XSCF> showdscp

If DSCP is not configured, the showdscp output states this:
    XSCF> showdscp
    ERROR: DSCP is not configured. Please use setdscp.

In that case, DSCP should be configured.

*Note: all domains should be powered down when DSCP is configured on the XSCF.*

From the setdscp man page:
"setdscp is intended for initial configuration only. Domains should not be powered on when running this command.
Note - You are required to reboot the Service Processor after modifying the DSCP IP address assignment using this command, and before the IP addresses you specified are used."

Example:
    XSCF> setdscp
    XSCF> rebootxscf

5. Verify on the problematic Solaris domain that svcs -l sckmd shows "enabled true" and "state online". To enable the sckmd service, type svcadm enable sckmd on the Solaris domain.

Example:
    # svcs -l sckmd
    fmri         svc:/platform/sun4u/sckmd:default
    name         key management daemon
    enabled      true
    state        online
    next_state   none

Note: The absence of the appropriate packages for sckmd, or a disabled sckmd service, is reported on the domain as ereport.fm.fmd.module in the 'fmdump -e' output. Further investigation using 'fmdump -V' shows the following signature:

    msg = Failed to write C_HELLO to dev:///sp0: Transport endpoint is not connected

6. Verify on the problematic Solaris domain that the ipseckey dump command returns security key information in addition to "Dump succeeded for SA type 0." If ipseckey dump does not return any keys, check that the domain OS is S10U8 or later, or that patch 140589-01 or later is installed.

    # ipseckey dump
    Base message (version 2) type DUMP, SA type AH.
    Message length 136 bytes, seq=1, pid=766.
    SA: SADB_ASSOC spi=0xff01, replay=0, state=MATURE
    SA: Authentication algorithm = hmac-md5
    SA: flags=0x80000000 < X_USED >
    SRC: Source address (proto=0/<unspecified>)
    SRC: AF_INET: port 0, 192.168.224.3 <unknown>.
    .....
    Dump succeeded for SA type 0.
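The Notes in steps 3 and 5 tie specific 'fmdump -V' message signatures to the checks that should be repeated. The sketch below turns that mapping into a filter. The function name suggest_dscp_steps and the wording of its output are hypothetical; only the three message signatures come from this document.

```shell
#!/bin/sh
# Map the fmdump -V message signatures quoted in the Notes of steps 3 and 5
# to the steps that should be rechecked. Reads fmdump -V text on stdin.
# suggest_dscp_steps is a hypothetical helper name.
suggest_dscp_steps() {
    awk '
        /dscpBind on accept socket failed/ { dscp = 1 }
        /dscpAddr on client socket failed/ { dscp = 1 }
        /Failed to write C_HELLO/          { sckmd = 1 }
        END {
            if (dscp)  print "recheck steps 1-3: SUNWppp* packages, sppp0 interface, dscp service"
            if (sckmd) print "recheck steps 1 and 5: sckmd packages, sckmd service"
            if (!dscp && !sckmd) print "no known DSCP signature found"
        }
    '
}

# Usage on the domain:
#   fmdump -V | suggest_dscp_steps
```

The filter only narrows down which steps to revisit; it does not replace working through the steps in order.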
Reboot the problematic Solaris domain that has been unable to use the DSCP interface, using shutdown, init 6, reboot, or some other orderly process.

7. Verify on the problematic domain that svcs -l dcs shows "enabled true" and "state online".
Example:
    # svcadm enable dcs
    # svcs -l dcs
    fmri         svc:/platform/sun4u/dcs:default
    name         domain configuration server
    enabled      true
    state        online
    next_state   none

8. Verify on the problematic domain that a ping from the domain to the XSCF via the DSCP interface works (ping shows alive):
- ifconfig -a shows information about the sppp0 interface. On the Solaris domain, locate the IP address following the "-->". This is the XSCF end of the point-to-point (ppp) link from this domain, and it is the address to ping. In the example below, you would ping 192.168.224.1.
- As this is a ppp-style interface, the two IP addresses should be on the same subnet (i.e. 192.168.224.Z) and have the same netmask, which is controlled by the XSCF setdscp command.

IP Address Example:
    ....
    sppp0: flags=10010008d1<UP,POINTOPOINT,RUNNING,NOARP,MULTICAST,IPv4,FIXEDMTU> mtu 1500 index
            inet 192.168.224.3 --> 192.168.224.1 netmask ffffff00
            ether 0

Ping Example:
    # ping 192.168.224.1
    192.168.224.1 is alive

The XSCF command showdevices -d <rebooted domain> against the problematic Solaris domain should now be successful.

At this point, if you have validated that each troubleshooting step above is true for your environment and the issue still exists, further troubleshooting is required. For additional support, contact Oracle Support.

Internal Comments
Reference: 6821108 DR and "showdevices" don't work after XSCF reboot

The DSCP service is used by the Solaris[TM] domain and the XSCF on an Mx000 system to communicate with each other. While these processes should start automatically on boot, there may be instances where a customer has disabled the functionality inappropriately or unintentionally. When this occurs, commands such as DR and showdevices will not function correctly. DSCP also depends on another component, sckmd, to perform header authentication of packets going back and forth between the domain and the XSCF. It is possible to start DSCP without sckmd (and vice versa), so it is important to check for the proper operation of both.

Attachments
This solution has no attachment