Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-77-1369869.1
Update Date: 2012-03-06
Keywords:

Solution Type  Sun Alert Sure

Solution  1369869.1 :   Healthy Solaris 10 SPARC Systems May Incorrectly Report Hardware Errors (SUNOS-8000-FU) During PCIE Correctable Events  


Related Items
  • Solaris SPARC Operating System
  • Sun Fire V215 Server
  • Sun Fire V245 Server
  • Sun Ultra 45 Workstation
  • Sun SPARC Enterprise M9000-32 Server
  • Sun Fire T2000 Server
  • Sun Fire V445 Server
  • Sun Blade T6300 Server Module
  • Sun SPARC Enterprise M8000 Server
  • Sun SPARC Enterprise T5120 Server
  • Sun SPARC Enterprise M4000 Server
  • Oracle Solaris Express
  • Sun SPARC Enterprise M5000 Server
  • Sun SPARC Enterprise T5220 Server
  • Sun SPARC Enterprise T5240 Server
  • Sun Netra CP3060 ATCA Blade Server
  • Sun SPARC Enterprise T5140 Server
Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: Sun Alert
  • .Old GCS Categories>Sun Microsystems>Sun Alert>Release Phase>Resolved




In this Document
  Description
  Likelihood of Occurrence
  Possible Symptoms
  Workaround or Resolution
  Patches
  Modification History
  References


Applies to:

Sun Netra CP3060 ATCA Blade Server - Version: Not Applicable and later    [Release: N/A and later]
Sun SPARC Enterprise M4000 Server - Version: Not Applicable and later    [Release: N/A and later]
Sun Fire T2000 Server - Version: Not Applicable and later    [Release: N/A and later]
Sun Fire V445 Server - Version: Not Applicable and later    [Release: N/A and later]
Oracle Solaris Express - Version: 2010.11 to 2010.11   [Release: 11.0 to 11.0]
Information in this document applies to any platform.
_____________________



Date of Resolved Release: 21-Oct-2011
____________________________________

Description


Incorrect handling of correctable errors on Solaris 10 SPARC systems fitted with a certain model of PCI Express switch may cause the Fault Management Architecture (FMA) to incorrectly report the error SUNOS-8000-FU for ereports of class ereport.io.pci.sec-rserr. This may result in healthy hardware being replaced unnecessarily.

Likelihood of Occurrence


This issue can occur in the following releases:

SPARC Platform
  • Solaris 10 with patch 125369-10 through 125369-13 or patch 127755-01 and without patch 146855-01
  • Solaris 11 Express based upon builds snv_39 through snv_157
Note 1: The following SPARC platforms are impacted by this issue:
  • Ultra 45
  • Sun Fire V445
  • Sun Fire V215, V245
  • Sun Blade T6300
  • Sun Fire T2000
  • Netra CP3060
  • SPARC Enterprise M4000
  • SPARC Enterprise M5000
  • SPARC Enterprise T5120 with Sun External I/O Expansion Unit
  • SPARC Enterprise T5140 with Sun External I/O Expansion Unit
  • SPARC Enterprise T5220 with Sun External I/O Expansion Unit
  • SPARC Enterprise T5240 with Sun External I/O Expansion Unit
  • SPARC Enterprise T5440 with Sun External I/O Expansion Unit
  • SPARC Enterprise M8000 with Sun External I/O Expansion Unit
  • SPARC Enterprise M9000 with Sun External I/O Expansion Unit
Note 2: Solaris 8, Solaris 9, and Solaris on the x86 platform are not impacted by this issue.
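The Solaris 10 patch condition listed above can be sketched as a small check. This is a minimal sketch, assuming the installed patch IDs have already been collected as strings of the form NNNNNN-RR (for example from `showrev -p` output, which is not parsed here). The function names are hypothetical, and the logic follows the wording of the list literally; later kernel patches that may incorporate the same code are not modeled.

```python
def parse_patch(patch_id):
    """Split a patch ID such as '125369-13' into (base, revision) integers."""
    base, _, rev = patch_id.strip().partition("-")
    return int(base), int(rev)


def is_affected_solaris10(installed):
    """Return True if the patch set matches the affected Solaris 10 condition.

    Hypothetical helper: `installed` is an iterable of 'NNNNNN-RR' strings.
    """
    # Keep the highest revision seen for each patch base number.
    revs = {}
    for patch_id in installed:
        base, rev = parse_patch(patch_id)
        revs[base] = max(rev, revs.get(base, 0))

    # Fix present: 146855-01 or any later revision.
    if revs.get(146855, 0) >= 1:
        return False

    # Trigger present: 125369 at revision 10 through 13, or 127755-01.
    has_125369 = 10 <= revs.get(125369, 0) <= 13
    has_127755 = revs.get(127755, 0) == 1
    return has_125369 or has_127755
```

A result of True only means the software condition is met; the system must also be one of the platforms listed in Note 1.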

Note 3: Solaris 11 Express distributions may include bug fixes beyond those in the build from which they were derived. The base build can be determined as follows:
    $ uname -v
    snv_151
If the output has the format 151.x.x.x, the installed build is snv_151.
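The rule in Note 3 can be sketched as follows. This is a minimal sketch, assuming the `uname -v` output has already been captured as a string; the function names are hypothetical, and the affected range (snv_39 through snv_157) is taken from the list above. Accepting a trailing letter after the build number is an assumption, not something this document states.

```python
import re

# Affected Solaris 11 Express base builds, per the list above.
AFFECTED_FIRST, AFFECTED_LAST = 39, 157


def base_build(uname_v):
    """Return the base build number from `uname -v` text, or None.

    Handles both forms mentioned in Note 3: 'snv_151' and '151.x.x.x'.
    """
    text = uname_v.strip()
    # Assumption: a trailing lowercase suffix after the number is tolerated.
    match = (re.fullmatch(r"snv_(\d+)[a-z]*", text)
             or re.fullmatch(r"(\d+)(?:\.\d+)*", text))
    return int(match.group(1)) if match else None


def is_affected_build(uname_v):
    """Return True if the base build falls in the affected range."""
    build = base_build(uname_v)
    return build is not None and AFFECTED_FIRST <= build <= AFFECTED_LAST
```

Output that matches neither form, such as a Solaris 10 kernel string, yields None from `base_build` and False from `is_affected_build`.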

Possible Symptoms


When patch 125369-13 is installed, or a system is upgraded to a release that includes this patch or to an affected Solaris 11 Express build, FMA may report correctable errors not previously observed on the system.

If the described issue occurs, the following message will be seen on the system console:

    SUNW-MSG-ID: SUNOS-8000-FU, TYPE: Defect, VER: 1, SEVERITY: Major
    EVENT-TIME: Tue Mar 29 21:03 PDT 2011
    PLATFORM: SUNW,SPARC-Enterprise , CSN: -, HOSTNAME: -
    SOURCE: eft, REV: 1.16
    EVENT-ID: af46a1fb-a712-617b-cab3-fc57b79a1dd9
    DESC: The diagnosis engine encountered telemetry for which it was unable to perform a diagnosis.
    Refer to http://sun.com/msg/SUNOS-8000-FU for more information.
    AUTO-RESPONSE: Error reports have been logged for examination by Sun.
    IMPACT: Automated diagnosis and response for these events will not occur.

Use fmadm(1M) and fmdump(1M) for further confirmation or contact Oracle for support.

    # fmadm faulty
    --------------- ------------------------------------ -------------- ---------
    TIME            EVENT-ID                             MSG-ID         SEVERITY
    --------------- ------------------------------------ -------------- ---------
    May 21 04:19:41 cf33eeba-54e0-6e79-b7c3-cf7de492f1d3 SUNOS-8000-FU  Major

    Host        : xyz1
    Platform    : SUNW,SPARC-Enterprise    Chassis_id : xyz2400L

    Fault class : defect.sunos.eft.undiag.fme

    Description : The diagnosis engine encountered telemetry for which it was unable to perform a diagnosis.
                  Refer to http://sun.com/msg/SUNOS-8000-FU for more information.

    Response    : Error reports have been logged for examination by Sun.

    Impact      : Automated diagnosis and response for these events will not occur.

    Action      : Ensure that the latest Solaris Kernel and Predictive Self-Healing (PSH) patches are installed.

    # fmdump -e
    May 21 04:19:36.4888 ereport.io.pci.fabric
    May 21 04:19:36.4885 ereport.io.pci.sec-rserr
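
When many ereports are logged, the `fmdump -e` listing can be scanned for the class named in the Description. This is a minimal sketch, assuming the command output has been saved as text with one event per line, as in the transcript above; the function names are hypothetical. A match only marks the system as a candidate for this issue and does not by itself rule out a real hardware fault.

```python
SIGNATURE_CLASS = "ereport.io.pci.sec-rserr"


def ereport_classes(fmdump_e_output):
    """Return the ereport class names found in `fmdump -e` text, in order."""
    classes = []
    for line in fmdump_e_output.splitlines():
        fields = line.split()
        # Event lines end with a class name such as ereport.io.pci.fabric;
        # header lines and blank lines do not and are skipped.
        if fields and fields[-1].startswith("ereport."):
            classes.append(fields[-1])
    return classes


def has_signature(fmdump_e_output):
    """Return True if the sec-rserr ereport class appears in the listing."""
    return SIGNATURE_CLASS in ereport_classes(fmdump_e_output)


sample = """\
May 21 04:19:36.4888 ereport.io.pci.fabric
May 21 04:19:36.4885 ereport.io.pci.sec-rserr
"""
print(has_signature(sample))
```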

Workaround or Resolution


There is no workaround for this issue.

This issue is resolved in the following releases:

SPARC Platform
  • Solaris 10 with patch 146855-01 or later
  • Solaris 11 Express based upon builds snv_158 or later
Note: After installing the Solaris 10 patch 146855-01, the SUNOS-8000-FU faults should be cleared using the command:
    # fmadm repair <EVENT-ID>

where <EVENT-ID> is taken from the output of the "fmadm faulty" command, as shown in the Possible Symptoms section above.
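
Where several SUNOS-8000-FU faults are outstanding, the repair commands can be prepared from saved "fmadm faulty" output. This is a minimal sketch, assuming the output has the row layout shown in the symptoms section; the function name is hypothetical. It only builds command strings for review and does not run them. The commands should be run only after patch 146855-01 is installed.

```python
import re

# EVENT-IDs in the transcript above are UUID-formatted.
UUID_RE = re.compile(r"\b[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}\b")


def repair_commands(fmadm_faulty_output, msg_id="SUNOS-8000-FU"):
    """Build one 'fmadm repair <EVENT-ID>' string per matching fault row."""
    commands = []
    seen = set()
    for line in fmadm_faulty_output.splitlines():
        # Only summary rows carry the MSG-ID as a separate column; the
        # description line mentions it inside a URL and is skipped.
        if msg_id not in line.split():
            continue
        match = UUID_RE.search(line)
        if match and match.group(0) not in seen:
            seen.add(match.group(0))
            commands.append("fmadm repair " + match.group(0))
    return commands
```

Keeping the generation step separate from execution lets an administrator confirm each EVENT-ID against the console messages before clearing it.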

Patches

<SUNPATCH 146855-01>

Modification History

21-Oct-2011: Date of Resolved Release
29-Dec-2011: Updated Document Title
06-Mar-2012: Updated note in Workaround section

Internal Comments:

This is fundamentally a PLX PEX8532 Switch bug due to Erratum #59. It wasn't, however, exposed on SPARC Solaris platforms until 6239835 was introduced; thus, it is classified as a Solaris regression.

PLX PEX8532 is an 8-port, 32-lane PCI Express switch manufactured by PLX Technology, and embedded in the Sun/Oracle platforms specified in the SA.

Please send technical questions to the following email:
[email protected]
and copy the Responsible Engineer/Contributor listed below.

Internal Contributor/Submitter: [email protected]
Internal Eng Responsible Engineer: [email protected]
Internal Services Knowledge Engineer: [email protected]
Internal Eng Business Unit Group: Systems RPE
Internal Escalation ID: 72975798, 73110406, 73340258, 73374798, 2-8011418, 2-8177736, 2-8034193, 2-8064817, 2-8077393, 2-8103991, 2-8194834, 2-8196583, 2-8145695, 3-2711368881, 3-2849580131, 3-3225313050
Internal Resolution Patches: 146855-01

References

<SUNBUG 6960665>
<SUNPATCH 146855-01>

Attachments
This solution has no attachment
  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.