Asset ID: 1-72-1485400.1
Update Date: 2012-09-24
Solution Type: Problem Resolution Sure Solution
1485400.1: SPARC T4-4 Amber Light Is On
Related Categories: PLA-Support>Sun Systems>SPARC>CMT>SN-SPARC: T3
Created from <SR 3-6007746771>
Applies to:
SPARC T4-4 - Version Not Applicable to Not Applicable [Release N/A]
Information in this document applies to any platform.
Symptoms
Memory faults are reported and the amber Service LED is on.
FRU : "/SYS/PM0/CMP0/BOB1/CH1/D0" (hc://:product-id=ORCL,SPARC-T4-4:product-sn=1216BDY207:server-id=sv62810:chassis-id=1216BDY207:serial=00AD0112053430E0BA:part=07020577-Rev-01/chassis=0/cpuboard=0/dimm=6) 50%
"/SYS/PM0/CMP0/BOB0/CH1/D0" (hc://:product-id=ORCL,SPARC-T4-4:product-sn=1216BDY207:server-id=sv62810:chassis-id=1216BDY207:serial=00AD0112053420E0C3:part=07020577-Rev-01/chassis=0/cpuboard=0/dimm=2) 50%
faulty
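When a fault listing like the one above has been saved to a file or is piped from a command, the affected FRU names can be pulled out and compared with the two DIMM slots named in this document. The following is a minimal sketch and not part of the original solution: the function name list_faulted_frus is hypothetical, and the pattern assumes that each record quotes exactly one /SYS/... path, as in the listing above.

```shell
# Print each distinct quoted /SYS/... FRU path found on stdin, one per line.
# Assumption: every fault record carries one double-quoted path starting
# with /SYS/, as in the listing shown in this document.
list_faulted_frus() {
    sed -n 's|.*"\(/SYS/[^"]*\)".*|\1|p' | sort -u
}

# Example using the two FRU names from this document (hc:// details elided).
printf '%s\n' \
    'FRU : "/SYS/PM0/CMP0/BOB1/CH1/D0" (hc://...) 50%' \
    '"/SYS/PM0/CMP0/BOB0/CH1/D0" (hc://...) 50%' \
    'faulty' | list_faulted_frus
```

If the output is exactly the two slots CMP0/BOB0/CH1/D0 and CMP0/BOB1/CH1/D0, the listing matches the pattern described in this document.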
Cause
System firmware is: 8.1.5 2012/04/10 18:53
The two DIMMs marked as faulted are the result of a known issue with firmware 8.1.5:
An update to System Firmware 8.1.5 may result in DIMM faults (DIMMs 2 and 6: CMP0/BOB0/CH1/D0 and CMP0/BOB1/CH1/D0).
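The firmware check described here can be sketched in shell. This is an illustration under stated assumptions, not a supported tool: the function names fw_version and fw_is_affected are hypothetical, the input is assumed to be a single line in the form quoted in this document, and only the exact version 8.1.5 is treated as affected because that is the only version this document names.

```shell
# Extract the version field from a line such as
# "System firmware is: 8.1.5 2012/04/10 18:53".
# Assumption: the version is the first whitespace-separated field
# after the first colon.
fw_version() {
    sed -n 's/^[^:]*:[[:space:]]*\([^[:space:]]*\).*/\1/p'
}

# Succeed only for the version this document ties to the false DIMM faults.
fw_is_affected() {
    [ "$1" = "8.1.5" ]
}

v=$(echo 'System firmware is: 8.1.5 2012/04/10 18:53' | fw_version)
if fw_is_affected "$v"; then
    echo "firmware $v: matches the known issue"
else
    echo "firmware $v: not the version named in this document"
fi
```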
Initial Resolution: See Sun Alert 1468850.1.
Update 7/6/2012:
Patch 147790-01 has been identified as a requirement for the workaround in the Sun Alert.
Confirm that it is installed, or add it, before implementing the workaround.
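One way to confirm that the patch is present is to scan the installed-patch list. The sketch below assumes a Solaris 10 host, where `showrev -p` prints one line per patch beginning with "Patch: <id>-<revision>". The function name has_patch is hypothetical and this check is not part of the Sun Alert. On a Solaris host, nawk may be needed in place of awk.

```shell
# Succeed when stdin (text in the style of `showrev -p` output) lists
# patch ID $1 at revision $2 or later.
has_patch() {
    awk -v id="$1" -v min="$2" '
        $1 == "Patch:" {
            split($2, p, "-")               # p[1] = patch ID, p[2] = revision
            if (p[1] == id && p[2] + 0 >= min + 0) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# On the host this would be run as:  showrev -p | has_patch 147790 01
# Illustrative sample line instead of real showrev output:
if echo 'Patch: 147790-01' | has_patch 147790 01; then
    echo "patch 147790-01 or later is installed"
else
    echo "patch 147790-01 is missing"
fi
```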
====================================================================================
1. On the service processor ILOM, first clear the fault log of the above records:
-> start /SP/faultmgmt/shell
faultmgmtsp> fmadm faulty
Find the records associated with the DIMM faults:
faultmgmtsp> fmadm repair <event-ID>
Where <event-ID> is the event ID of the DIMM failure. It should only be necessary to clear one of the faults. Rerun the 'fmadm faulty' (fmadm(1M)) command to verify that the records have been cleared.
2. Next (on Solaris) clear the fault and remove the record of the fault having occurred:
# /usr/sbin/fmadm faulty -a
# /usr/sbin/fmadm acquit <event-ID>
# /usr/sbin/fmadm flush <memory-bank>
where <event-ID> is replaced by the event ID displayed by 'fmadm faulty', and <memory-bank> is the memory bank reported as being faulty. These should always be '/SYS/PM0/CMP0/BOB0/CH1/D0' and '/SYS/PM0/CMP0/BOB1/CH1/D0'. It should be sufficient to flush only one record.
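The host-side step can be reviewed before anything is changed by printing the commands rather than running them. The following is a dry-run sketch only: the function name host_clear_cmds is hypothetical, the event ID shown is a stand-in string to be replaced with the value reported by 'fmadm faulty', and the commands are the acquit and flush invocations from step 2.

```shell
# Print (do not execute) the host-side commands from step 2 for one
# event ID and one memory bank, so they can be checked before use.
host_clear_cmds() {
    event_id=$1
    memory_bank=$2
    printf '%s\n' \
        "/usr/sbin/fmadm acquit $event_id" \
        "/usr/sbin/fmadm flush $memory_bank"
}

# EVENT-ID-FROM-FMADM-FAULTY is a placeholder, not a real event ID.
host_clear_cmds 'EVENT-ID-FROM-FMADM-FAULTY' '/SYS/PM0/CMP0/BOB0/CH1/D0'
```

Printing first keeps the event ID and memory bank visible for a second look, which matters because the acquit and flush commands alter the fault manager's records.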
Note: This issue was traced to an error in earlier versions of the system firmware that the firmware did not report. Revision 8.1.5 fixed the issue; however, a residual error shows up on Solaris. Clearing the fault on both ILOM and the host and flushing the FMA state removes the error and resolves the issue.
This is a known issue that first appeared with 8.1.5 firmware.
The workaround was to either upgrade to 8.2.0.a or downgrade to 8.1.4.e.
Solution
1. Clear the faults.
2. Install FMA patch 147790-01.
3. Upgrade the firmware at the earliest convenience.
References
<NOTE:1468850.1> - SPARC T4-4 Falsely Reporting DIMM Failures After Upgrading Firmware to Version 8.1.5