Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition
Solution Type: Technical Instruction Sure
Solution 1002033.1: Sun Fire[TM] v1280, E2900, 3800, 4800, 4810, 6800, E4900, E6900, and Netra 1280, 1290 Server: How to Recover from a Hung System Controller
Previously Published As: 202844

Applies to:
Sun Netra 1280 Server - Version Not Applicable and later
Sun Fire 3800 Server - Version Not Applicable and later
Sun Fire 4800 Server - Version Not Applicable and later
Sun Fire 4810 Server - Version Not Applicable and later
Sun Fire 6800 Server - Version Not Applicable and later
All Platforms

Goal
When a system controller (SC) is hung, try a few steps before pressing the Reset button on the SC.

Fix
Try the following steps:

2) If the "reboot" command does not work, or you cannot enter anything, log in to the spare SC and try to force a failover by using the "setfailover force" command.
If this step works, it reboots the hung SC and makes the spare SC the primary SC.

3) If the failover does not complete, the LAST RESORT is to use the Reset button on the SC. This step is not available on Sun Fire[TM] v1280, E2900, and Netra 1280, 1290 servers, where a platform power cycle is needed instead (the Solaris OS must be shut down before the power-off). BEFORE YOU PRESS THIS BUTTON, you must bring down the domains. Bringing down the domains is critical because a domain may crash if the Reset button is pressed while the domains are up and running. See Document 1004364.1 for details.

NOTE: Make sure that the connection settings on the SC are correct. Use a tip session on the serial port of the SC:

    6800a-sc0:SC> showplatform -p network

    The system controller is configured to be on a network.
    Network settings: static
    Hostname: 6800a-sc0
    IP Address: 129.156.xx.xx
    Netmask: 255.255.255.0
    Gateway: 129.156.xx.1
    DNS Domain: UK.Sun.COM
    Primary DNS Server: 129.156.xx.xx
    Secondary DNS Server: 129.156.xx.xx
    Connection type: none <----- No remote access enabled
    Idle connection timeout : No timeout
    Sun Fire Link Enabled: no

This output shows that remote access via telnet or ssh is not enabled. Running the command below changes the connection type:

    6800a-sc0:SC> setupplatform -p network

    Network Configuration
    Is the system controller on a network? [yes]:
    Use DHCP or static network settings? [static]:
    Hostname [6800a-sc0]:
    IP Address [129.156.xx.xx]:
    Netmask [255.255.255.0]:
    Gateway [129.156.xx.1]:
    DNS Domain [UK.Sun.COM]:
    Primary DNS Server [129.156.xx.xx]:
    Secondary DNS Server [129.156.xx.xx]:
    To enable remote access to the system controller, select "ssh" or "telnet".
    Connection type (ssh, telnet, none) [telnet]:
    Idle connection timeout (in minutes; 0 means no timeout) [0]:
    Enable Sun Fire Link? [no]:

To enable remote access to the system controller, select either:
* ssh
* telnet

The SC must be rebooted for changes to the above network settings to take effect.
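
The connection-type check above can also be done on a saved copy of the console output. The following is only a minimal sketch, assuming the "showplatform -p network" output has already been captured as text (for example from a logged tip session); the function names connection_type and remote_access_enabled are hypothetical helpers for illustration and are not part of any Sun tool.

```python
import re
from typing import Optional


def connection_type(showplatform_output: str) -> Optional[str]:
    """Return the 'Connection type' value from captured output, or None if absent."""
    # Tolerates optional spaces around the colon and any leading markers on the line.
    match = re.search(r"Connection type\s*:\s*(\S+)", showplatform_output, re.IGNORECASE)
    return match.group(1).lower() if match else None


def remote_access_enabled(showplatform_output: str) -> bool:
    """True only when the SC reports ssh or telnet as its connection type."""
    return connection_type(showplatform_output) in {"ssh", "telnet"}


# Sample text copied from the showplatform output shown in this document.
SAMPLE = """\
The system controller is configured to be on a network.
Network settings: static
Hostname: 6800a-sc0
Connection type: none
Idle connection timeout : No timeout
Sun Fire Link Enabled: no
"""

if __name__ == "__main__":
    print("Connection type:", connection_type(SAMPLE))
    print("Remote access enabled:", remote_access_enabled(SAMPLE))
```

If the check reports that remote access is not enabled, the connection type can be changed with "setupplatform -p network" as shown above.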
Note: As of firmware 5.19.0, the "override" option of setfailover is available only in service or engineering mode (see Bug ID 4703904). The override option ignores whatever the status of the system controller is supposed to be and tells the spare to become primary. It pays no attention to the fact that the other SC could still be primary.

Warning: Use this procedure with caution and only as a last resort, because it could crash running domains.

Example (firmware prior to 5.19.0):

    kremlin-sc1:sc> setfailover override
    SC: SSC1
    Spare System Controller
    SC Failover: disabled
    This will abruptly interrupt operations on the other System Controller.
    This System Controller will become the main System Controller.
    Do you want to continue? [no] yes
    SC Failover did not complete. The system controllers may not be synchronized.
    Failover can be done forcefully but may crash domain(s).
    Do you want to force failover to continue? [no] yes
    kremlin-sc1:sc>

Example (firmware 5.19.0):

    fort-sc0:sc> setfailover override
    override: is not a valid argument
    Usage: setfailover [-y|-n] off|on|force
           setfailover -h
    fort-sc0:sc>
    fort-sc0:sc> engineering
    fort-sc0:sc[engineering]> setfailover override
    Spare System Controller
    SC Failover: disabled
    Clock failover disabled.
    This will abruptly interrupt operations on this System Controller.
    This System Controller will become the spare System Controller.
    Do you want to continue? [no]
    fort-sc0:sc[engineering]>

Keywords: SunFire, 3800, 4800, 4810, 6800, reset, system controller, failover

References
<BUG:4703904> - ADSR12BASE: LE2062 DISPLAYES A ERROR MESSAGE AFTER LAUNCHING THE FORMS APPL.
<NOTE:1004364.1> - Sun Fire[TM] Midrange Server: Safari Port Error may be caused by a resetting SC

Attachments
This solution has no attachment