Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-72-1485400.1
Update Date:2012-09-24
Keywords:

Solution Type  Problem Resolution Sure

Solution  1485400.1 :   SPARC T4-4 Amber Light Is On


Related Items
  • SPARC T4-4
Related Categories
  • PLA-Support>Sun Systems>SPARC>CMT>SN-SPARC: T3




In this Document
Symptoms
Cause
Solution
References


Created from <SR 3-6007746771>

Applies to:

SPARC T4-4 - Version Not Applicable to Not Applicable [Release N/A]
Information in this document applies to any platform.

Symptoms

Memory faults are reported and the amber Service LED is on. The fault output lists two DIMMs:
FRU         : "/SYS/PM0/CMP0/BOB1/CH1/D0" (hc://:product-id=ORCL,SPARC-T4-4:product-sn=1216BDY207:server-id=sv62810:chassis-id=1216BDY207:serial=00AD0112053430E0BA:part=07020577-Rev-01/chassis=0/cpuboard=0/dimm=6) 50%
              "/SYS/PM0/CMP0/BOB0/CH1/D0" (hc://:product-id=ORCL,SPARC-T4-4:product-sn=1216BDY207:server-id=sv62810:chassis-id=1216BDY207:serial=00AD0112053420E0C3:part=07020577-Rev-01/chassis=0/cpuboard=0/dimm=2) 50%
                 faulty

Cause

System firmware is: 8.1.5 2012/04/10 18:53

The two DIMMs marked as faulted are the result of a known issue with System Firmware 8.1.5.
Updating to System Firmware 8.1.5 may result in false DIMM faults on DIMMs 2 and 6 (CMP0/BOB0/CH1/D0 and CMP0/BOB1/CH1/D0).
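
As a rough illustration of the version check implied above, the sketch below tests whether a System Firmware version string is the 8.1.5 release. The function name is_affected_sysfw is hypothetical, and treating any 8.1.5.x sub-release as affected is an assumption, not something this document states.

```shell
# Sketch only: is_affected_sysfw is a hypothetical helper, not an ILOM or
# Solaris command. It succeeds when the version string names the 8.1.5 release.
is_affected_sysfw() {
    # Keep only the first whitespace-separated field, e.g. "8.1.5" from
    # "8.1.5 2012/04/10 18:53".
    set -- $1
    case "$1" in
        # Assumption: sub-releases such as 8.1.5.x are treated as affected too.
        8.1.5|8.1.5.*) return 0 ;;
        *)             return 1 ;;
    esac
}

# Example:
#   is_affected_sysfw "8.1.5 2012/04/10 18:53" && echo "affected release"
```

The releases named later in this document as the workaround targets (8.2.0.a and 8.1.4.e) do not match the pattern.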

Initial Resolution: See Sun Alert 1468850.1

Update 7/6/2012:
Patch 147790-01 has been identified as a requirement for the workaround in the Sun Alert.
Confirm it is installed or add it before implementing the workaround.
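
One possible way to confirm the patch is sketched below. It assumes a Solaris 10 host where `showrev -p` lists installed patches in lines beginning with "Patch: <id>-<rev>". The helper name patch_at_least is hypothetical, and accepting a later revision of the same patch as sufficient is an assumption.

```shell
# Sketch only: patch_at_least is a hypothetical helper, not a Solaris command.
# It reads `showrev -p` style output on stdin and succeeds when patch base ID $1
# is present at revision $2 or later.
patch_at_least() {
    # Assumption: a POSIX awk. On Solaris 10 set AWK=nawk or AWK=/usr/xpg4/bin/awk.
    "${AWK:-awk}" -v id="$1" -v rev="$2" '
        $1 == "Patch:" {
            split($2, p, "-")        # p[1] = base ID, p[2] = revision
            if (p[1] == id && p[2] + 0 >= rev + 0) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Intended use on the host:
#   showrev -p | patch_at_least 147790 01 && echo "147790-01 or later is installed"
```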
====================================================================================
1. On the service processor ILOM, first clear the fault log of the above records:

  -> start /SP/faultmgmt/shell
  faultmgmtsp> fmadm faulty

Find the records associated with the DIMM faults:

  faultmgmtsp> fmadm repair <>

Where <> is the event ID of the DIMM failure. It should only be necessary to clear one of the faults. Rerun the 'fmadm faulty' (fmadm(1M)) command to verify the records have been cleared.

2. Next (on Solaris) clear the fault and remove the record of the fault having occurred:

  # /usr/sbin/fmadm faulty -a
  # /usr/sbin/fmadm acquit <>
  # /usr/sbin/fmadm flush <>

where the <> in the acquit command is replaced by the event ID displayed by 'fmadm faulty', and the <> in the flush command is the memory bank reported as faulty. The memory bank should always be '/SYS/PM0/CMP0/BOB0/CH1/D0' or '/SYS/PM0/CMP0/BOB1/CH1/D0'. It should be sufficient to flush only one record.
Note: This issue was traced to an error in an earlier version of the system firmware that the firmware did not report. Revision 8.1.5 fixed the issue, but a residual error shows up on Solaris. Clearing the fault on both ILOM and the host and flushing the FMA state removes the error and resolves the issue.
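
Before clearing anything, it may be worth confirming that the only faulted FRUs are the two DIMMs named above, since any other fault would not be covered by this workaround. The sketch below assumes the fault output quotes FRU paths as shown under Symptoms. The helper name unexpected_frus is hypothetical.

```shell
# Sketch only: unexpected_frus is a hypothetical helper, not part of fmadm.
# It reads `fmadm faulty` output on stdin and prints every quoted /SYS/... FRU
# path that is NOT one of the two DIMMs named in this document.
unexpected_frus() {
    # Pull the quoted "/SYS/..." path out of each line that has one.
    sed -n 's|.*"\(/SYS/[^"]*\)".*|\1|p' |
    while IFS= read -r fru; do
        case "$fru" in
            /SYS/PM0/CMP0/BOB0/CH1/D0|/SYS/PM0/CMP0/BOB1/CH1/D0) ;;  # expected
            *) printf '%s\n' "$fru" ;;                               # anything else
        esac
    done
}

# Intended use on the host:
#   /usr/sbin/fmadm faulty -a | unexpected_frus
# Empty output suggests that only the two expected DIMMs are listed.
```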
 
This is a known issue that first appeared with 8.1.5 firmware.
The workaround was to either upgrade to 8.2.0.a or downgrade to 8.1.4.e.
 

Solution

1. Clear the faults.
2. Install FMA patch 147790-01.
3. Upgrade the firmware at the earliest convenience.
 

References

<NOTE:1468850.1> - SPARC T4-4 Falsely Reporting DIMM Failures After Upgrading Firmware to Version 8.1.5

Attachments
This solution has no attachment
  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.