Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution

1018813.1: Sun Fire [TM] SF3800/SF4800/SF4810/SF6800 - E4900/E6900 Server: Domains running firmware 5.15.x or later with hang-policy set to "notify" may lose critical troubleshooting data

Previously Published As: 230603
Applies to:
Sun Fire 3800 Server
Sun Fire 4800 Server
Sun Fire 4810 Server
Sun Fire 6800 Server
Sun Fire E4900 Server - Version: Not Applicable and later [Release: N/A and later]
All Platforms

Symptoms

Starting with firmware level 5.15.0, ScApp detects a hung domain and, depending on the setting of the domain hang-policy variable, can attempt to reset it. Systems initially installed with 5.15.0 or later have hang-policy set to "reset" by default, so the SC attempts to reset a hung domain.

The hang-policy variable was also present in earlier firmware versions. However, systems that were initially installed with an earlier firmware version have hang-policy set to "notify" by default. When these systems are upgraded to 5.15.0 or later, the current value of hang-policy and all other existing domain and platform settings are left intact.

This causes two issues. First, the SC does not attempt to automatically reset a domain with hang-policy=notify, which negates the effect of this new feature in ScApp. Second, and possibly more importantly, the new features in 5.15.x cause the SC to log that it noticed the hung domain. It logs this notice each time it polls the domain to determine whether it is active.

The SC logs this notice both on the loghost server and in its internal log buffers, which are used to display data via the showlogs command. This internal buffer is circular: each new entry removes the oldest entry still present in the buffer. As a result, a domain hang with hang-policy set to "notify" overflows the circular buffer and eliminates any useful data from "showlogs -d x" that would indicate the initial condition that caused the hang.

An example of these messages: ...

If no working loghost is configured for the domain, the failure cannot be troubleshot.

Cause

Systems initially installed with 5.15.0 or later will have the hang-policy default to "reset", which will attempt to reset a hung domain.
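The circular-buffer behavior described under Symptoms can be illustrated with a small simulation. This is only a sketch, not the SC's implementation: the buffer capacity, the message strings, and the function name simulate_log_buffer are all hypothetical, since the source does not state the real buffer size or message format.

```python
from collections import deque


def simulate_log_buffer(capacity, initial_events, hang_polls):
    """Return buffer contents after a hang adds one notice per poll.

    All values are illustrative assumptions; the real SC buffer size
    and log message text are not given in the source document.
    """
    # A deque with maxlen drops its oldest entry when a new one is
    # appended to a full buffer, which models a circular log buffer.
    buf = deque(maxlen=capacity)
    for event in initial_events:
        buf.append(event)
    # With hang-policy=notify, one notice is logged per poll.
    for poll in range(hang_polls):
        buf.append(f"hang notice (poll {poll + 1})")
    return list(buf)


# Few polls: the entry describing the original failure is still present.
before = simulate_log_buffer(4, ["root-cause error"], 2)
# Many polls: hang notices have pushed the original entry out.
after = simulate_log_buffer(4, ["root-cause error"], 10)

print("root-cause error" in before)
print("root-cause error" in after)
```

Once the number of hang notices reaches the buffer capacity, the entry describing the original failure is gone, which is why a working loghost is the only place that data can survive.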
Solution

Use the "setupdomain" command to set hang-policy to "reset" on all platforms that are upgraded to 5.15.x or later. Always configure a working loghost for all Sun Fire 3800/4800/4810/6800 platforms and domains.

See Sun Fire Midframe & Entry-Level Servers Best Practices Update for Firmware 5.20.x for reference.

Product
Sun Fire 6800 Server
Sun Fire 4810 Server
Sun Fire 4800 Server
Sun Fire 3800 Server
Bug 4906714 has been submitted regarding hang-policy=notify behavior rolling the SC logs. An RFE may be forthcoming to have hang-policy on each domain changed to "reset" upon the first upgrade to a firmware level >= 5.15.x.

Attachments
This solution has no attachment