Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition
Solution Type: Technical Instruction Sure
Solution 1001778.1: Sun Fire[TM] 3800, 4800/4810, 6800, E2900, E4900, E6900, V1280 or Netra[TM] 1280, 1290 server: How to Gather Data from a Hung Domain [Video]
Previously Published As: 202431

Applies to:
Sun Fire E4900 Server - Version Not Applicable and later
Sun Netra 1280 Server - Version Not Applicable and later
Sun Fire E2900 Server - Version Not Applicable and later
Sun Netra 1290 Server - Version Not Applicable and later
Sun Fire 6800 Server - Version Not Applicable and later
All Platforms

Goal
Description
A brief how-to video tutorial is available for this topic. It provides step-by-step instructions answering Sun's most frequently asked questions. View the video and/or follow the detailed instructions below.
Fix
Steps to Follow

1. Ensure that the domain is actually hung:
   - Can you ping the domain?

2. Ensure that the SC (System Controller) is not hung. If you can access the System Controller, log in to the SC and obtain a platform shell.
   A. If you get to the platform shell, run the following commands:
      SCname:SC> showlogs
   B. If the SC is hung, see Document 1002033.1 for details on how to recover from a hung system controller. Then go back to step 2A.

3. Once in the platform shell, attempt to get a domain shell:
      SCname:SC> console -d
   - If the command appears to hang, send a break signal to the domain:
     - If you are using telnet: press CTRL ]
     - If you are connected to the SC via tip: use ~#
   At this point you should have a domain shell prompt; continue with the following commands. Otherwise, continue to step 4.
   - If you get the domain shell, run the following commands:

4. If you were not able to get to the ok prompt, then the system is really hung and you will need to send an XIR (externally initiated reset) to the domain. From the domain shell, type:
      reset
   The behavior of this command depends on the setting of the OBP variable error-reset-recovery.
      {#} ok dump-sigblock
   - If you were not able to return to the ok prompt but have a domain prompt, type the following command:
      SCname:A> showresetstate

5. If none of these tactics work, you may be forced to power off the domain. In that case, do a setkeyswitch off for the domain.
Note: loghost setup for domain and platform may help in troubleshooting hang issues; please check reference section below for detailed information.
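
The steps above amount to a short decision procedure. The following Python sketch is one possible way to express it, under stated assumptions: the class name, field names, and returned strings are hypothetical and not part of any Sun tool, and treating a domain that answers ping as "possibly not hung" is an interpretation of step 1 rather than something the procedure states outright.

```python
from dataclasses import dataclass


@dataclass
class DomainState:
    # Hypothetical field names; each records what the operator observed.
    responds_to_ping: bool      # step 1: does the domain answer ping?
    sc_accessible: bool         # step 2: can you log in to the System Controller?
    domain_shell: bool          # step 3: did console (plus a break) give a domain shell?
    ok_prompt: bool             # step 4: did the domain reach the OBP ok prompt?
    reset_sent: bool = False    # step 4: has an XIR (reset) already been sent?


def next_action(state: DomainState) -> str:
    """Map the observed state to the next step of the procedure."""
    if state.responds_to_ping:
        # Assumption: a ping reply means the hang should be confirmed first.
        return "domain answers ping: confirm it is actually hung before continuing"
    if not state.sc_accessible:
        return "recover the SC (Document 1002033.1), then return to step 2A"
    if state.ok_prompt:
        return "collect data at the ok prompt"
    if state.domain_shell and not state.reset_sent:
        return "send an XIR: type reset at the domain shell"
    if state.domain_shell and state.reset_sent:
        return "run showresetstate at the domain prompt"
    return "last resort: setkeyswitch off for the domain"


if __name__ == "__main__":
    # Domain does not ping, SC is fine, domain shell obtained, no ok prompt yet.
    print(next_action(DomainState(False, True, True, False)))
```

The function only suggests what to try next; the commands themselves are still typed by hand at the SC or domain prompt as described in the steps.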
Instructions for Platforms employing Lights Out Management (LOM)

1. Ensure that the domain is actually hung:

2. Log in to the LOM prompt via telnet/ssh or tip.
   A. Once you get the lom prompt, run the following commands:
      lom>showsc -v

3. Try to connect to the domain and see what state it is in:
   A. Use the console command to connect to the domain:
      lom> console
   B. If there is no response from the console, use the escape sequence to break out. The default escape sequence is "#."
      lom>console
   C. Once the domain is confirmed to be unreachable, go to the next step.

4. Use the 'break' or 'reset' command to recover.
   A. Try to break into OBP with 'break'. If you get to OBP, do a sync to collect a corefile.
      lom>break
      This will suspend Solaris.
   B. If 'break' does not work, use 'reset' and collect 'showresetstate' as well. The behaviour of reset also depends on the settings used in OBP.
      lom>reset
      This will abruptly terminate Solaris.
      lom>showresetstate

5. If none of the procedures above work, a poweroff/poweron needs to be issued.
      power off the platform
      lom> poweroff
      power on the platform, but do not start the domain
      lom> poweron all
      power on the platform and start the domain
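
The LOM procedure is an escalation ladder: each command is tried only after the previous, less disruptive one has failed. The Python sketch below records that order as data. The command names come from the steps above; the list name, the function name, and the one-line purposes are hypothetical summaries added for illustration.

```python
# Commands are taken from the LOM steps above; the structure is hypothetical.
LOM_ESCALATION = [
    ("console", "connect to the domain and check its state"),
    ("break", "suspend Solaris, try to reach OBP, then sync for a corefile"),
    ("reset", "abruptly terminate Solaris, then collect showresetstate"),
    ("poweroff", "power off the platform as a last resort"),
]


def escalate(failed_command):
    """Return the (command, purpose) pair to try after failed_command, or None."""
    commands = [cmd for cmd, _ in LOM_ESCALATION]
    index = commands.index(failed_command)  # raises ValueError for unknown commands
    if index + 1 < len(LOM_ESCALATION):
        return LOM_ESCALATION[index + 1]
    return None  # nothing more disruptive is left to try


if __name__ == "__main__":
    print(escalate("console"))
```

Keeping the ladder as an ordered list makes the "least disruptive first" rule explicit, which matters here because 'reset' and 'poweroff' destroy state that 'break' followed by a sync could still have captured.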
Also check:
References
<NOTE:1002033.1> - Sun Fire[TM] v1280, E2900, 3800, 4800, 4810, 6800, E4900, E6900, and Netra 1280, 1290 Server: How to Recover from a Hung System Controller
<NOTE:1008702.1> - Console Logging Options to capture Fatal Reset output for Sun systems
<NOTE:1018813.1> - Sun Fire[TM] SF3800/SF4800/SF4810/SF6800 - E4900/E6900 Server: Domains running firmware 5.15.x or later with hang-policy set to "notify" may lose critical troubleshooting data
<NOTE:778.1> - Multimedia Content Reference

Attachments
This solution has no attachment