Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition
Solution Type: Problem Resolution Sure
Solution 1018813.1: Sun Fire[TM] 3800-6800: Domains running firmware 5.15.x or later with hang-policy set to "notify" may lose critical troubleshooting data
Previously Published As 230603
Applies to:
Sun Fire 3800 Server
Sun Fire 4800 Server
Sun Fire 4810 Server
Sun Fire 6800 Server
All Platforms

Symptoms
Starting with firmware level 5.15.0, ScApp detects a hung domain and, depending on the setting of the domain hang-policy variable, can attempt to reset it. Systems initially installed with 5.15.0 or later have hang-policy set to "reset" by default, so the SC attempts to reset a hung domain.

The hang-policy variable was also present in earlier firmware versions. However, systems that were initially installed with an earlier firmware version have hang-policy set to "notify" by default. When these systems are upgraded to 5.15.0 or later, the current value of hang-policy and all other existing domain and platform settings are left intact.

This causes two issues. First, the SC will not attempt to automatically reset a domain with hang-policy=notify, negating the effect of this new feature in ScApp. Second, and possibly more importantly, the new features in 5.15.x cause the SC to log that it noticed the hung domain, and it logs this notice each time it polls the domain to determine whether it is active. The SC logs the notice both on the loghost server and in its internal log buffers, which are used to display data via the showlogs command.

The internal buffer is circular: each new entry removes the oldest entry still present in the buffer. As a result, a domain hang with hang-policy set to "notify" overflows the circular buffer and eliminates any useful data from "showlogs -d x" that would indicate the initial condition that caused the hang. An example of these messages: ...

If there is not a working loghost configured for the domain, the failure cannot be troubleshot.

Solution
Resolution: Use the "setupdomain" command to set hang-policy to "reset" on all platforms that are upgraded to 5.15.x or later.
Always configure a working loghost for all Sun Fire 3800/4800/4810/6800 platforms and domains.

Product
Sun Fire 6800 Server
Sun Fire 4810 Server
Sun Fire 4800 Server
Sun Fire 3800 Server

Internal Comments
Authored by David Re, PTS-AMER-MSG.
Bug 4906714 has been submitted regarding hang-policy=notify behavior rolling the SC logs. An RFE may be forthcoming to have hang-policy on each domain changed to reset upon the first upgrade to a FW level >= 5.15.x.

hang-policy, showlogs, notify, reset

Previously Published As 71216

Attachments
This solution has no attachment
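
Illustration
The log-rollover effect described under Symptoms can be sketched with a small simulation. This is only a model of a generic circular buffer, not the SC's actual implementation: the buffer capacity, the function name simulate_log_buffer, and the message strings are all hypothetical, since the real buffer size and log text are not given in this document.

```python
from collections import deque


def simulate_log_buffer(capacity, events_before_hang, hang_polls):
    """Model a circular log buffer that is flooded by repeated hang notices.

    All names, sizes, and message strings here are hypothetical and are
    used only to illustrate the overwrite behavior.
    """
    # A deque with maxlen drops its oldest entry when a new one is appended
    # to a full buffer, which is the circular behavior described above.
    buffer = deque(maxlen=capacity)

    # Entries logged before the hang: the data needed for troubleshooting.
    for i in range(events_before_hang):
        buffer.append(f"event {i}: pre-hang diagnostic")

    # With hang-policy=notify, one notice is logged per poll of the hung domain.
    for i in range(hang_polls):
        buffer.append(f"poll {i}: domain appears hung")

    return list(buffer)


remaining = simulate_log_buffer(capacity=8, events_before_hang=3, hang_polls=20)
surviving = [entry for entry in remaining if entry.startswith("event")]
print(len(remaining), len(surviving))  # prints "8 0"
```

Once the number of hang notices reaches the buffer capacity, none of the pre-hang entries remain. This is why a working loghost, which keeps its own copy of the messages, is the only place the original failure data can survive.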