Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition
Solution Type: Problem Resolution Sure

Solution 1008805.1: Sun Fire[TM] 12K/15K/E20K/E25K: Remote Dynamic Reconfiguration (DR) generates "DCA/DCS Communication Error" and showdevices is "Unable to get device information from domain".

Previously Published As 212092
***Checked for relevance on 09-May-2011***
Applies to:
Sun Fire E20K Server
Sun Fire E25K Server
Sun Fire 15K Server
Sun Fire 12K Server
All Platforms

Symptoms
The rcfgadm or showdevices commands generate errors when run on the system controller (SC). The error message might be "DCA/DCS Communication Error" when executing these commands. The command showdevices might generate the following error (where x is the domain ID):
This showdevices error can also be seen in Explorer data from the main SC. The file showdevices_-v_-d_x.out, in the /explorer/sf15k/ directory of the Explorer output, shows the same "Unable to get device information from domain x" error message. The following messages might also be logged in the platform log file on the SC ($SMSVAR/adm/platform/messages):

Cause
These errors can be caused by configuration errors on the SC or on the domain.

Solution
To resolve the problem:
scman0 and dman0
The dca <> dcs handshaking takes place over the I1 network. This means that scman0 on the SC and dman0 on the domain must be configured and running properly. This is often overlooked, so be sure to verify this information with the following command:
On SC:
# ifconfig -a
On domain:
Note that the IP addresses and netmasks on the dman0 and scman0
interfaces should match the information stored in the /etc/SUNWMSMS/config/MAN.cf file on the
SC.
This should be further confirmed by running the following command on the domain:
Domain Configuration Agent (DCA)
The Domain Configuration Agent (DCA) daemon runs on the SC, one per domain. Similar to a netcon session on a Sun Enterprise[TM] 10000 server, the DCA provides communication between the SC and the Domain Configuration Server (DCS) on the specified domain. If DCA is not running, the showdevices and rcfgadm commands fail. To verify that DCA is running, issue the following command on the SC:
Domain Configuration Server (DCS)
DCS is a domain daemon process that supports remote dynamic reconfiguration. DCS must also be running on the domain in order for the showdevices or rcfgadm commands to work on the domain. If either command fails, check the domain for the following lines in the /etc/inetd.conf file:
sun-dr stream tcp wait root /usr/lib/dcs dcs
sun-dr stream tcp6 wait root /usr/lib/dcs dcs
These lines must be in the /etc/inetd.conf file for the rcfgadm and showdevices commands to work properly. If the lines are not in the file, and showdevices fails from the SC, add the lines indicated above and restart the inetd process as follows:
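The check for the two sun-dr entries in /etc/inetd.conf described above can be scripted. The following is a minimal sketch only, assuming a POSIX shell and a grep that accepts -E (on Solaris this may mean /usr/xpg4/bin/grep); check_sun_dr_inetd is a hypothetical helper name, not a Solaris command.

```shell
#!/bin/sh
# Sketch: report whether an inetd.conf-style file carries the two sun-dr
# entries (tcp and tcp6) quoted in this article.
# check_sun_dr_inetd is a hypothetical helper, not part of Solaris.
check_sun_dr_inetd() {
    conf="${1:-/etc/inetd.conf}"
    missing=0
    for proto in tcp tcp6; do
        # Match uncommented lines of the form:
        #   sun-dr stream <proto> wait root /usr/lib/dcs dcs
        if grep -E "^[[:space:]]*sun-dr[[:space:]]+stream[[:space:]]+${proto}[[:space:]]+wait[[:space:]]+root[[:space:]]+/usr/lib/dcs[[:space:]]+dcs" "$conf" >/dev/null 2>&1; then
            echo "sun-dr ${proto}: present"
        else
            echo "sun-dr ${proto}: MISSING"
            missing=1
        fi
    done
    # Exit status 0 only when both entries were found.
    return "$missing"
}
```

On a domain it could be called as check_sun_dr_inetd /etc/inetd.conf. It only reads the file and does not add lines or restart inetd.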
For additional information, refer to the dcs man page.

Note for domains running Solaris[TM] 10 (without patch 120253-02):
The /etc/inetd.conf file is no longer directly used to configure inetd. inetd is now configured in the Service Management Facility (SMF). You can get the list of all the SMF services installed.
The /platform/sun4u/dcs service must be enabled/online. You can get more information about the svc:/platform/sun4u/dcs service and list its properties with the svccfg command:
If any dcs processes are running, their PIDs are reported in inetd_state/start_pids.

Note that, on domains running Solaris 10 without patch 120253-02, the dcs process is not running unless the SC has recently communicated with the domain. It is forked by inetd on request (a remote DR request started from the SC). Hence, the PPID of dcs is the inetd PID. Example:

Note for domains running Solaris[TM] 10 Update 2 (with patch 120253-02):
Due to the fixes for:
Bug ID 4792021 per-socket level IPsec policy for dynamic reconfiguration
Bug ID 6380945 Changes required for PSARC 2006/038
introduced in patch 120253-02, dcs no longer belongs to inetd. Since inetd does not support per-socket IPsec, dcs was changed to run standalone. Both dcs and cvcd are controlled by SMF and use SMF properties to define command line options. Hence, running:
will not return information about dcs any longer. Use the following command to get the status from the dcs service:
Note that, on domains running Solaris 10 Update 2 or with patch 120253-02, the dcs process starts at boot time. Due to the new implementation, dcs now runs with different options and accepts command line arguments ("-a", "-e", and "-u") that allow the administrator to configure the IPsec encryption and authentication options. Where:
See the dcs(1M) man page for more details. Example:
To check whether any process is actually listening on the sun-dr port (port 665), run:
e25ka-dom-c# netstat -an | grep 665
*.665 *.* 0 0 49152 0 LISTEN
*.665 *.* 0 0 49152 0 LISTEN
This verifies that a process is listening on the sun-dr port, 665. If nothing is listening on port 665, the showdevices and addboard/deleteboard commands on the SC cannot work.

The /etc/services File
The /etc/services file must also have the following entry on the domain for remote Dynamic Reconfiguration (DR):
$ niscat services.org_dir | grep sun-dr
sun-dr sun-dr tcp 665 Remote Dynamic Reconfiguration
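The two checks in this part of the procedure (a sun-dr entry in the services file and a listener on port 665) can be combined in a small script. This is a sketch under stated assumptions: a POSIX shell and POSIX awk (nawk on Solaris), a services entry in the standard "name port/protocol" layout, and netstat -an lines shaped like the sample shown earlier. The function names has_sun_dr_service and port_665_listening are hypothetical.

```shell
#!/bin/sh
# Sketch: helpers for the /etc/services and port 665 checks.
# Both function names are hypothetical, chosen for illustration only.

# Succeed if the given services file has an uncommented "sun-dr 665/tcp" entry.
has_sun_dr_service() {
    services="${1:-/etc/services}"
    awk '$1 == "sun-dr" && $2 == "665/tcp" { found = 1 }
         END { exit (found ? 0 : 1) }' "$services"
}

# Read `netstat -an` output on stdin; succeed if some line is in LISTEN
# state with a local address ending in port 665 (for example "*.665").
port_665_listening() {
    awk '$NF == "LISTEN" && $1 ~ /[.:]665$/ { found = 1 }
         END { exit (found ? 0 : 1) }'
}
```

For example, netstat -an | port_665_listening returns success only when a listener is present. Domains that resolve services through NIS+ rather than the local file are not covered by has_sun_dr_service.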
/etc/inet/ipsecinit.conf File on the Domain
When running Solaris 9 or earlier, the /etc/inet/ipsecinit.conf file should contain the following entries:
If the entries do not exist, add them and then issue:
If the domain is running Solaris 10 with patch 120253 then the service is managed by SMF and will not need the ipsecinit.conf file.
Domain X Server (DXS)
The console command uses DXS. It is similar to the netcon_server on the Sun Enterprise[TM] 10000 server. DXS runs on the SC, one per domain. To verify that DXS is running, issue the following command on the SC:
Console traffic travels over the console bus, but it can be toggled between the console bus and the I1 network using the ~= command. When the domain is rebooting, a message similar to "dxs disconnecting." appears on the SC. The reboot of a domain causes an hpost -Q, which is a quick POST from the SC.

Sun Fire[TM] 12K/15K/E20K/E25K key management daemon (sckmd)
The sckmd server process resides on a Sun Fire[TM] 12K/15K/E20K/E25K domain. The sckmd daemon maintains the Internet Protocol Security (IPsec) Security Associations (SAs) needed to secure the communication between the SC and the cvcd and dcs daemons running on the domains. The sckmd daemon must be running on the domain in order for the "showdevices" or "rcfgadm" commands to work on the domain. To verify that the sckmd daemon is running, issue the following command on the domain:
Failure after a Solaris[TM] 10+ OS initial installation
Upon the initial installation of a Solaris 10+ domain, showdevices/rcfgadm will not work successfully. Running the commands will generate domain-side console messages such as:
To fix this, on the domain, issue the command:
# ipsecalgs -s
For a more detailed explanation of this issue, see Bug ID 6233334.

Failure after a Solaris[TM] 10 Update 2 installation or after installing 120253-02 on Solaris[TM] 10
After an upgrade to Solaris[TM] 10 Update 2 or after the patch installation, the dcs service may fail to go online and stay in maintenance mode, with the dcs process not running:
To find out why dcs never came online, check the /etc/svc/volatile/platform-sun4u-dcs:default.log log file.
To fix this, on the domain, restart the services :
Failure after an upgrade to Solaris[TM] 10 Update 2
After an upgrade to Solaris[TM] 10 Update 2, the dcs service may fail to go online and stay in maintenance mode, with the dcs process not running:
The new manifest /var/svc/manifest/platform/sun4u/dcs.xml provided by 120253-02 (bundled in S10U2) has not been applied properly, so inetd is still trying to start dcs. The general/restarter property for the dcs service should now be startd and no longer inetd.
# svcprop dcs
general/enabled boolean true
general/entity_stability astring Unstable
general/restarter fmri svc:/network/inetd:default
dcs/ah_auth astring md5
[...]
See CR# 6472374 for more details. To fix this problem, the new manifest must be imported using the following procedure:
Note that when no general/restarter is specified, the default restarter, startd, is used.
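The restarter check described above can be automated by reading svcprop output. The sketch below assumes a POSIX shell and POSIX awk (nawk on Solaris), and assumes each property line has the form "name type value"; dcs_restarter and restarter_is_inetd are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: inspect `svcprop dcs` output supplied on stdin.
# Both function names are hypothetical, chosen for illustration only.

# Print the value of general/restarter, or "startd" when the property is
# absent (the default restarter, as noted in this article).
dcs_restarter() {
    awk '$1 == "general/restarter" { r = $3 }
         END { if (r == "") print "startd"; else print r }'
}

# Succeed when the dcs service is still delegated to inetd, which is the
# stale state this section describes.
restarter_is_inetd() {
    case "$(dcs_restarter)" in
        svc:/network/inetd:default) return 0 ;;
        *) return 1 ;;
    esac
}
```

A possible use on the domain is svcprop dcs | restarter_is_inetd && echo "dcs is still managed by inetd". The helpers only read text and change nothing in the SMF repository.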
**Note: in certain instances this workaround is not the complete fix. On certain systems it has been found that an inetconv command has been run, creating two services called sun-dr that stop the DCS service from starting even after the above workaround has been followed.
To check for this condition:
# svcs -xv
svc:/platform/sun4u/dcs:default (domain configuration server)
# svcs -a | grep sun-dr
online - 19:14:48 - svc:/network/sun-dr/tcp6:default
To clear this condition:
1. Remove 2 sun-dr lines from /etc/inetd.conf
2. svcadm disable svc:/network/sun-dr/tcp:default
3. svcadm disable svc:/network/sun-dr/tcp6:default
4. svccfg delete -f svc:/network/sun-dr/tcp:default
5. svccfg delete -f svc:/network/sun-dr/tcp6:default
6. rm /var/svc/manifest/network/sun-dr-tcp.xml
7. rm /var/svc/manifest/network/sun-dr-tcp6.xml
8. svcadm disable svc:/platform/sun4u/dcs
9. svccfg delete -f svc:/platform/sun4u/dcs
10. svccfg -v import /var/svc/manifest/platform/sun4u/dcs.xml
11. svcadm enable svc:/platform/sun4u/dcs

Keywords: Starcat, 12k, 15k, dca, dcs, dr, DCA/DCS communication error, E20K, E25K, Unable to get device information from domain, rcfgadm, showdevices, console, ipsecinit.conf, svcs

Previously Published As 51772

Attachments
This solution has no attachment