Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition
Solution Type: Sun Alert Sure

Solution 1020990.1: BIOS Versions Prior to 3.0.2 May Cause System Hangs on Sun Fire X4150/X4250/X4450 Systems
Previously Published As: 268668

***Checked for relevance on 22-Aug-2012***

Bug Id: <SUNBUG: 6871221>, <SUNBUG: 6873737>

Date of Resolved Release: 02-Oct-2009

Sun Fire X4150/X4250/X4450 systems may hang as a result of correctable ECC memory errors:

1. Impact

Sun Fire X4150/X4250/X4450 systems with BIOS versions 3.0.1 or earlier may hang as a result of correctable ECC memory errors not being handled properly.

2. Contributing Factors

This issue can occur on the following platforms:
Note 2: Correctable errors can occur even in healthy systems. The likelihood of a system hang due to this bug depends on whether an error occurs, when it occurs, how it is detected, and which operating system is running.

3. Symptoms

If the described issue occurs, the system will lock up/hang with no ILOM SEL entry indicating a problem. Access to the ILOM is not affected.

4. Workaround

There is no workaround for this issue. Please see the Resolution section below.

5. Resolution

This issue is addressed on the following platforms:
For Sun Fire X4150:

For Sun Fire X4250:

For Sun Fire X4450:

Note: The above releases contain BIOS 1ADQW062 for the Sun Fire X4150/X4250 and BIOS 3B62 for the X4450.

Modification History:
22-Aug-2012: Maintenance check for relevance/currency, no change in content

Product
Sun Fire X4150 Server
Sun Fire X4250 Server
Sun Fire X4450 Server

Internal Comments

Additional Information: There are 2 other known issues that are being fixed in the next (3.1.0) software release:

Issue 1: Incorrect error messaging

If a correctable ECC memory error is detected by the CPU, you will see this SEL entry as usual:

|67| IPMI | Log | minor | Fri Sep 4 17:04:57 2009 | ID = 1d : 09/04/2009 : 17:04:57 : Memory : BIOS : Correctable ECC; Channel: D, DIMM: 5 |

If the background scrubber detects the correctable ECC memory error, the SEL entry will look like this:

|118| IPMI | Log | *critical*| Tue Sep 8 18:00:47 200 | ID = 3f : 09/08/2009 : 18:00:47 : Memory : BIOS : Memory Scrub Failed; Channel: D, DIMM: 5

This incorrectly indicates the error as critical. A correctable ECC memory error found by the scrubber is not a critical error, despite the SEL entry. This will be fixed in the next software release, and both types will be reported as a minor correctable ECC.

Issue 2: DIMMs being falsely mapped out during POST due to correctable ECC memory errors

POST should not map out a DIMM due to detecting a correctable ECC memory error. If a DIMM is mapped out during POST, the system should be rebooted to determine whether the DIMM was mapped out due to a correctable ECC memory error, at which point three things could happen:
[email protected] and CC the following persons: Internal Contributor/Submitter, Internal Eng Responsible Engineer, Internal Services Knowledge Engineer

Internal Contributor/submitter: [email protected]
Internal Eng Responsible Engineer: [email protected]
Internal Services Knowledge Engineer: [email protected]

Internal Eng Business Unit Group: SVS (SPARC Volume Systems, Horizontal Systems (includes T2000/Ontario), NWS (Network Storage), Systems Group-x64 (X4100-X4600 (includes M2), V20z/V40z/V60z/V65z, Ultra20/40)

Internal Sun Alert & FAB Admin Info:
WF 02-Sep-2009, jfolla: sent for release
WF 30-Sep-2009, jfolla: sent for review
WF 29-Sep-2009, jfolla: sent to submitter with questions
WF 29-Sep-2009, jfolla: created

Attachments: This solution has no attachment
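Illustration for Issue 1 in the Internal Comments: until the 3.1.0 release reports both event types as minor, a log-review script could normalize the severity itself. The following is a minimal sketch only. The field layout is inferred solely from the two sample SEL entries quoted in this alert, and the names `SEL_PATTERN`, `CORRECTABLE_EVENTS`, and `parse_sel_entry` are hypothetical, not part of ILOM or any Sun tool.

```python
import re

# Layout assumed from the two sample SEL entries in this alert:
# |<record>| IPMI | Log | <severity> | <timestamp> | ID = .. : <date> : <time> :
#   Memory : BIOS : <event>; Channel: <letter>, DIMM: <number>
SEL_PATTERN = re.compile(
    r"\|\s*(?P<record>\d+)\s*\|\s*IPMI\s*\|\s*Log\s*\|"
    r"\s*\*?(?P<severity>\w+)\*?\s*\|"
    r".*?Memory\s*:\s*BIOS\s*:\s*(?P<event>[^;]+);"
    r"\s*Channel:\s*(?P<channel>\w+),\s*DIMM:\s*(?P<dimm>\d+)"
)

# Per this alert, both event strings describe a correctable ECC memory error.
CORRECTABLE_EVENTS = {"Correctable ECC", "Memory Scrub Failed"}


def parse_sel_entry(line):
    """Parse one memory-related SEL line; return a dict, or None if no match."""
    match = SEL_PATTERN.search(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["event"] = entry["event"].strip()
    # Treat scrubber-detected correctable errors as minor, whatever was logged.
    if entry["event"] in CORRECTABLE_EVENTS:
        entry["effective_severity"] = "minor"
    else:
        entry["effective_severity"] = entry["severity"]
    return entry


if __name__ == "__main__":
    sample = (
        "|118| IPMI | Log | *critical*| Tue Sep 8 18:00:47 200 | "
        "ID = 3f : 09/08/2009 : 18:00:47 : Memory : BIOS : "
        "Memory Scrub Failed; Channel: D, DIMM: 5"
    )
    print(parse_sel_entry(sample))
```

The logged severity is kept alongside the normalized one so that the original SEL text can still be matched against the ILOM output. Any SEL line in a different format simply returns `None`.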