Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-75-1012954.1
Update Date:2012-07-03
Keywords:

Solution Type  Troubleshooting Sure

Solution  1012954.1 :   Sun SPARC Enterprise[TM] M3000/M4000/M5000/M8000/M9000: Information & Troubleshooting fmsp faults.  


Related Items
  • Sun SPARC Enterprise M9000-64 Server
  • Sun SPARC Enterprise M9000-32 Server
  • Sun SPARC Enterprise M8000 Server
  • Sun SPARC Enterprise M3000 Server
  • Sun SPARC Enterprise M4000 Server
  • Sun SPARC Enterprise M5000 Server
Related Categories
  • PLA-Support>Sun Systems>SPARC>Enterprise>SN-SPARC: Mx000
  • .Old GCS Categories>Sun Microsystems>Servers>OPL Servers

PreviouslyPublishedAs
217751


Applies to:

Sun SPARC Enterprise M3000 Server - Version Not Applicable and later
Sun SPARC Enterprise M4000 Server - Version Not Applicable and later
Sun SPARC Enterprise M5000 Server - Version Not Applicable and later
Sun SPARC Enterprise M8000 Server - Version Not Applicable and later
Sun SPARC Enterprise M9000-32 Server - Version Not Applicable and later
All Platforms

Purpose

Sun SPARC Enterprise[TM] M3000/M4000/M5000/M8000/M9000: Information & Troubleshooting fmsp faults.

This document provides information on Fault Events related to the Sun SPARC Enterprise[TM] M3000/M4000/M5000/M8000/M9000 Service Processor's Fault Manager (fmsp). It is directly linked from the FMD Fault Event articles on the Predictive Self Healing Knowledge Article Lookup website (FMA event code lookup) and contains Sun SPARC Enterprise M3000/M4000/M5000/M8000/M9000 platform specific information.

NOTE:  While the fault codes detailed in this article can be generated on any FMA-enabled Solaris[TM] system (Solaris[TM] 10 or higher), this document pertains ONLY to the codes generated on the Sun SPARC Enterprise M3000/M4000/M5000/M8000/M9000 Service Processor.

The Fault Manager (fmsp) is a diagnostic engine consisting of software modules that are responsible for monitoring the hardware or software configuration for error events, diagnosing those events, and performing recovery actions.  If something prevents a module from performing its duty, one of the fmsp Fault Events detailed in this document is generated.  Use this article to determine how best to recover the Sun SPARC Enterprise M3000/M4000/M5000/M8000/M9000 Fault Manager if one of the following error events is encountered.

NOTE: These events are usually caused by an outdated XSCF firmware release or by software issues, and generally do not require Step 4 (hardware replacement). Carefully work through all preceding steps before planning any hardware replacement.
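As a rough illustration of the note above, the four steps of each Service Action Plan can be treated as an ordered checklist in which hardware replacement is reached only after every earlier step has been carried out. The Python sketch below is not part of any XSCF or Sun tooling: the names `SERVICE_ACTION_PLAN` and `next_step` are hypothetical, and the step summaries are paraphrased from this article.

```python
# Hypothetical checklist; the ordering mirrors the Service Action Plans in this article.
SERVICE_ACTION_PLAN = [
    (1, "Reboot the XSCF (rebootxscf)"),
    (2, "Ensure the latest XCP firmware is installed"),
    (3, "Check the document specific to the event code"),
    (4, "Contact your Authorized Service Provider about hardware replacement"),
]


def next_step(completed_steps):
    """Return (number, summary) of the lowest step not yet completed, or None."""
    for number, summary in SERVICE_ACTION_PLAN:
        if number not in completed_steps:
            return number, summary
    return None


if __name__ == "__main__":
    # Step 4 is only suggested once steps 1 to 3 are recorded as done.
    print(next_step({1, 2}))
```

Because the function always returns the lowest outstanding step, marking only Step 2 as done still yields Step 1, so the hardware replacement step cannot be reached by skipping ahead.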

Locate the Fault Event Code that you have encountered in the list below, then go to the matching section of this article, where the Service Action Plan for that Fault Event is detailed:

FMD-8000-0W

FMD-8000-11

FMD-8000-2K

FMD-8000-3F 
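For scripted triage, the four event codes can be kept in a small lookup table. The sketch below is only an illustration: the titles and document numbers are copied from Step 3 of the corresponding sections of this article, while the names `FMSP_FAULT_EVENTS` and `lookup_fault_event` are assumptions made for the example.

```python
# Event code -> (title, document number), as cited in Step 3 of each section.
FMSP_FAULT_EVENTS = {
    "FMD-8000-0W": ("Solaris Fault Manager Received Unexpected Event", "1021146.1"),
    "FMD-8000-11": ("Solaris Fault Manager unable to determine message summary", "1021147.1"),
    "FMD-8000-2K": ("Solaris Fault Manager component had disabling error", "1021148.1"),
    "FMD-8000-3F": ("Solaris Fault Manager component has erroneous configuration file", "1021149.1"),
}


def lookup_fault_event(code):
    """Return the title and document number for an fmsp event code covered here."""
    key = code.strip().upper()
    if key not in FMSP_FAULT_EVENTS:
        raise KeyError(f"{code!r} is not one of the fmsp event codes in this article")
    title, document = FMSP_FAULT_EVENTS[key]
    return {"code": key, "title": title, "document": document}


if __name__ == "__main__":
    print(lookup_fault_event("fmd-8000-2k")["document"])  # prints 1021148.1
```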

Troubleshooting Steps

Find the Fault Event Code which has been encountered for details of the appropriate Service Action Plan.

Fault Event

FMD-8000-0W

Fault Description

Under normal operations, the Fault Manager receives error reports from numerous sub-systems and then arranges for the appropriate fault diagnosis and automated responses to be applied.

This fault message indicates that the Fault Manager received an error report for which no automated diagnosis is currently available.  This can indicate a mismatch in the versions of various software components, a misconfiguration of the system, automated software that has not been loaded or is currently unavailable, or a defect in the software.

Symptoms

If any daemon with the exception of 'fmd' dies or is terminated, the eXtended System Control Facility (XSCF) will reboot automatically.

Service Action Plan

Step 1

Reboot the eXtended System Control Facility (XSCF).

     XSCF> rebootxscf

If the module failure continues, proceed to Step 2.

Step 2

Ensure the latest version of XCP firmware is installed on the platform.  The latest version of XCP can be downloaded from the Sun Download Center.

In order to view the version of XCP installed:

XSCF> version -c xcp -v
XSCF#0 (Active )
XCP0 (Reserve): 1090
OpenBoot PROM : 02.11.0000
XSCF : 01.09.0000
XCP1 (Current): 1090
OpenBoot PROM : 02.11.0000
XSCF : 01.09.0000
XSCF#1 (Standby)
XCP0 (Reserve): 1090
OpenBoot PROM : 02.11.0000
XSCF : 01.09.0000
XCP1 (Current): 1090
OpenBoot PROM : 02.11.0000
XSCF : 01.09.0000
OpenBoot PROM BACKUP
#0: 02.09.0000
#1: 02.11.0000
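When the output above has been captured to a text file, the XCP version in each bank can be pulled out with a short script. The following Python sketch assumes exactly the layout of the sample shown above (other XCP releases may format the output differently); `parse_xcp_versions` and `current_xcp_versions` are hypothetical helper names, not XSCF commands.

```python
import re

# Patterns assume the sample layout, e.g. "XSCF#0 (Active )" and "XCP1 (Current): 1090".
UNIT_RE = re.compile(r"^XSCF#(\d+)\s*\(\s*(\w+)\s*\)")
BANK_RE = re.compile(r"^XCP\d+\s*\((Reserve|Current)\)\s*:\s*(\d+)")

SAMPLE = """\
XSCF#0 (Active )
XCP0 (Reserve): 1090
OpenBoot PROM : 02.11.0000
XSCF : 01.09.0000
XCP1 (Current): 1090
OpenBoot PROM : 02.11.0000
XSCF : 01.09.0000
XSCF#1 (Standby)
XCP0 (Reserve): 1090
OpenBoot PROM : 02.11.0000
XSCF : 01.09.0000
XCP1 (Current): 1090
OpenBoot PROM : 02.11.0000
XSCF : 01.09.0000
OpenBoot PROM BACKUP
#0: 02.09.0000
#1: 02.11.0000
"""


def parse_xcp_versions(text):
    """Map each XSCF unit number to its role and its Reserve/Current XCP versions."""
    units = {}
    unit = None
    for raw in text.splitlines():
        line = raw.strip()
        header = UNIT_RE.match(line)
        if header:
            unit = int(header.group(1))
            units[unit] = {"role": header.group(2)}
            continue
        bank = BANK_RE.match(line)
        if bank and unit is not None:
            units[unit][bank.group(1)] = bank.group(2)
    return units


def current_xcp_versions(units):
    """Return the set of distinct Current XCP versions across all units."""
    return {info["Current"] for info in units.values() if "Current" in info}


if __name__ == "__main__":
    print(current_xcp_versions(parse_xcp_versions(SAMPLE)))  # prints {'1090'}
```

The OpenBoot PROM and XSCF sub-lines are deliberately ignored here; only the XCP bank lines are read.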

In order to view the version of XSCF firmware installed:

XSCF> version -c xscf
XSCF#0 (Active )
01.09.0000(Reserve) 01.09.0000(Current)
XSCF#1 (Standby)
01.09.0000(Reserve) 01.09.0000(Current)
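The dotted XSCF firmware versions in this output can be compared numerically once they are split into integer fields. The sketch below is a minimal example that assumes the two-line-per-unit layout shown above; `version_tuple` and `parse_xscf_versions` are hypothetical names introduced for illustration.

```python
import re

# Patterns assume lines such as "XSCF#0 (Active )" and
# "01.09.0000(Reserve) 01.09.0000(Current)".
UNIT_RE = re.compile(r"^XSCF#(\d+)\s*\(\s*(\w+)\s*\)")
VERSION_RE = re.compile(r"(\d+(?:\.\d+)+)\((Reserve|Current)\)")

SAMPLE = """\
XSCF#0 (Active )
01.09.0000(Reserve) 01.09.0000(Current)
XSCF#1 (Standby)
01.09.0000(Reserve) 01.09.0000(Current)
"""


def version_tuple(version):
    """Turn '01.09.0000' into (1, 9, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def parse_xscf_versions(text):
    """Map each XSCF unit number to {'Reserve': tuple, 'Current': tuple}."""
    units = {}
    unit = None
    for raw in text.splitlines():
        line = raw.strip()
        header = UNIT_RE.match(line)
        if header:
            unit = int(header.group(1))
            units[unit] = {}
            continue
        if unit is not None:
            for version, bank in VERSION_RE.findall(line):
                units[unit][bank] = version_tuple(version)
    return units


if __name__ == "__main__":
    for unit, banks in parse_xscf_versions(SAMPLE).items():
        print(unit, banks["Current"], banks["Current"] == banks["Reserve"])
```

Comparing tuples rather than strings avoids surprises if a field ever changes width, for example `version_tuple("02.11.0000") > version_tuple("02.09.0000")` evaluates to true.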

If the module failure continues, proceed to Step 3.

Step 3

Check the information specific to this event code in the appropriate document: FMD-8000-0W - Solaris Fault Manager Received Unexpected Event (Document 1021146.1)

Step 4

Sun SPARC Enterprise[TM] M3000
  • Contact your Authorized Service Provider to schedule the replacement of the MBU_A.
Sun SPARC Enterprise[TM] M4000/M5000
  • Contact your Authorized Service Provider to schedule the replacement of the eXtended System Control Facility Unit (xscfu).
Sun SPARC Enterprise[TM] M8000/M9000
  • In a single xscfu configuration, contact your Authorized Service Provider to schedule the replacement of the Active eXtended System Control Facility Unit (xscfu_b or xscfu_c).
  • In a dual xscfu configuration, try to fail over to the spare xscfu and check whether the errors cease or persist.
    - If the errors cease, contact your Authorized Service Provider to schedule the replacement of the error-reporting eXtended System Control Facility Unit (xscfu_b or xscfu_c).
    - If the errors persist, contact your Authorized Service Provider to obtain assistance.
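The Step 4 branching above can be summarised as a small decision function. This is a sketch for illustration only: `step4_action` and its parameters are hypothetical, the returned strings paraphrase the bullets above, and every outcome is still carried out through your Authorized Service Provider.

```python
def step4_action(model, dual_xscfu=False, errors_cease_after_failover=None):
    """Return the Step 4 recommendation for a platform model such as 'M5000'.

    dual_xscfu and errors_cease_after_failover only matter for M8000/M9000.
    Leave errors_cease_after_failover as None if no failover has been tried yet.
    """
    # Accept variants such as 'M9000-64' by dropping the suffix.
    family = model.strip().upper().split("-")[0]
    if family == "M3000":
        return "Schedule replacement of the MBU_A"
    if family in ("M4000", "M5000"):
        return "Schedule replacement of the xscfu"
    if family in ("M8000", "M9000"):
        if not dual_xscfu:
            return "Schedule replacement of the Active xscfu (xscfu_b or xscfu_c)"
        if errors_cease_after_failover is None:
            return "Fail over to the spare xscfu and check whether the errors cease"
        if errors_cease_after_failover:
            return "Schedule replacement of the error-reporting xscfu (xscfu_b or xscfu_c)"
        return "Obtain assistance from your Authorized Service Provider"
    raise ValueError(f"unknown platform model: {model!r}")


if __name__ == "__main__":
    print(step4_action("M9000-64", dual_xscfu=True, errors_cease_after_failover=False))
```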




Fault Event

FMD-8000-11

Fault Description

Under normal operations, the Fault Manager receives error reports from numerous sub-systems and then arranges for the appropriate fault diagnosis and automated responses to be applied.

This fault message indicates that the Fault Manager received a diagnosis it did not expect. This can indicate a mismatch in the versions of various software components, a misconfiguration of the system, automated software that has not been loaded or is currently unavailable, or a defect in the software.

Symptoms

If any daemon with the exception of 'fmd' dies or is terminated, the eXtended System Control Facility (XSCF) will reboot automatically.

Service Action Plan

Step 1

Reboot the eXtended System Control Facility (XSCF).

     XSCF> rebootxscf

If the module failure continues, proceed to Step 2.

Step 2

Ensure the latest version of XCP firmware is installed on the platform.  The latest version of XCP can be downloaded from the Sun Download Center.

In order to view the version of XCP installed:

XSCF> version -c xcp -v
XSCF#0 (Active )
XCP0 (Reserve): 1090
OpenBoot PROM : 02.11.0000
XSCF : 01.09.0000
XCP1 (Current): 1090
OpenBoot PROM : 02.11.0000
XSCF : 01.09.0000
XSCF#1 (Standby)
XCP0 (Reserve): 1090
OpenBoot PROM : 02.11.0000
XSCF : 01.09.0000
XCP1 (Current): 1090
OpenBoot PROM : 02.11.0000
XSCF : 01.09.0000
OpenBoot PROM BACKUP
#0: 02.09.0000
#1: 02.11.0000

In order to view the version of XSCF firmware installed:

XSCF> version -c xscf
XSCF#0 (Active )
01.09.0000(Reserve) 01.09.0000(Current)
XSCF#1 (Standby)
01.09.0000(Reserve) 01.09.0000(Current)

If the module failure continues, proceed to Step 3.

Step 3

Check the information specific to this event code in the appropriate document: FMD-8000-11 - Solaris Fault Manager unable to determine message summary (Doc ID 1021147.1).

Note that this event code is logged on the XSCF when it receives an I/O-related FMA diagnosis from a Solaris instance running on one or more domains. In that case, the event is only a side effect of an event detected on the domain side and is not related to any fault on the XSCF itself. See Document 1362005.1 for example scenarios.

Step 4

Sun SPARC Enterprise[TM] M3000
  • Contact your Authorized Service Provider to schedule the replacement of the MBU_A.
Sun SPARC Enterprise[TM] M4000/M5000
  • Contact your Authorized Service Provider to schedule the replacement of the eXtended System Control Facility Unit (xscfu).
Sun SPARC Enterprise[TM] M8000/M9000
  • In a single xscfu configuration, contact your Authorized Service Provider to schedule the replacement of the Active eXtended System Control Facility Unit (xscfu_b or xscfu_c).
  • In a dual xscfu configuration, try to fail over to the spare xscfu and check whether the errors cease or persist.
    - If the errors cease, contact your Authorized Service Provider to schedule the replacement of the error-reporting eXtended System Control Facility Unit (xscfu_b or xscfu_c).
    - If the errors persist, contact your Authorized Service Provider to obtain assistance.

 


Fault Event

FMD-8000-2K

Fault Description

Under normal operations, the Fault Manager loads multiple components, referred to as modules, which perform fault diagnosis and automated response actions.

This fault message indicates that one of these modules has encountered an unrecoverable error.  Since the module cannot continue its work, the Fault Manager has unloaded and disabled the module.  This can indicate a defect in the module or the Fault Manager.
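The behaviour described above can be pictured with a toy model. The sketch below is not the fmsp implementation, whose internals are not described in this article; `ToyFaultManager` and its methods are hypothetical and only illustrate the general pattern of unloading and disabling a module that hits an unrecoverable error while the remaining modules keep running.

```python
class ToyFaultManager:
    """Illustrative stand-in for a fault manager that hosts pluggable modules."""

    def __init__(self):
        self.modules = {}       # module name -> callable taking an event
        self.disabled = {}      # module name -> description of the fatal error
        self.fault_events = []  # message IDs raised by the manager itself

    def load(self, name, handler):
        self.modules[name] = handler

    def dispatch(self, event):
        # Iterate over a copy so a failing module can be removed mid-loop.
        for name, handler in list(self.modules.items()):
            try:
                handler(event)
            except Exception as exc:
                # The module cannot continue: unload it, disable it, and report.
                del self.modules[name]
                self.disabled[name] = repr(exc)
                self.fault_events.append("FMD-8000-2K")


if __name__ == "__main__":
    def broken_module(event):
        raise RuntimeError("unrecoverable error")

    manager = ToyFaultManager()
    manager.load("healthy", lambda event: None)
    manager.load("broken", broken_module)
    manager.dispatch({"class": "example.ereport"})
    print(sorted(manager.modules), sorted(manager.disabled), manager.fault_events)
```

In this model a disabled module stays unloaded until it is loaded again explicitly, which loosely corresponds to restarting the Fault Manager in Step 1 of the Service Action Plan.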

Symptoms

If any daemon with the exception of 'fmd' dies or is terminated, the eXtended System Control Facility (XSCF) will reboot automatically.

Service Action Plan

Step 1

Reboot the eXtended System Control Facility (XSCF).

     XSCF> rebootxscf

If the module failure continues, proceed to Step 2.

Step 2

Ensure the latest version of XCP firmware is installed on the platform.  The latest version of XCP can be downloaded from the Sun Download Center.

In order to view the version of XCP installed:

XSCF> version -c xcp -v
XSCF#0 (Active )
XCP0 (Reserve): 1090
OpenBoot PROM : 02.11.0000
XSCF : 01.09.0000
XCP1 (Current): 1090
OpenBoot PROM : 02.11.0000
XSCF : 01.09.0000
XSCF#1 (Standby)
XCP0 (Reserve): 1090
OpenBoot PROM : 02.11.0000
XSCF : 01.09.0000
XCP1 (Current): 1090
OpenBoot PROM : 02.11.0000
XSCF : 01.09.0000
OpenBoot PROM BACKUP
#0: 02.09.0000
#1: 02.11.0000

In order to view the version of XSCF firmware installed:

XSCF> version -c xscf
XSCF#0 (Active )
01.09.0000(Reserve) 01.09.0000(Current)
XSCF#1 (Standby)
01.09.0000(Reserve) 01.09.0000(Current)

If the module failure continues, proceed to Step 3.

Step 3

Check the information specific to this event code in the appropriate document: FMD-8000-2K - Solaris Fault Manager component had disabling error (Doc ID 1021148.1)

Step 4

Sun SPARC Enterprise M3000
  • Contact your Authorized Service Provider to schedule the replacement of the MBU_A.
Sun SPARC Enterprise M4000/M5000
  • Contact your Authorized Service Provider to schedule the replacement of the eXtended System Control Facility Unit (xscfu).
Sun SPARC Enterprise M8000/M9000
  • In a single xscfu configuration, contact your Authorized Service Provider to schedule the replacement of the Active eXtended System Control Facility Unit (xscfu_b or xscfu_c).
  • In a dual xscfu configuration, try to fail over to the spare xscfu and check whether the errors cease or persist.
    - If the errors cease, contact your Authorized Service Provider to schedule the replacement of the error-reporting eXtended System Control Facility Unit (xscfu_b or xscfu_c).
    - If the errors persist, contact your Authorized Service Provider to obtain assistance.




Fault Event

FMD-8000-3F

Fault Description

Under normal operations, the Fault Manager receives error reports from numerous sub-systems and then arranges for the appropriate fault diagnosis and automated responses to be applied.

This fault message indicates that one of the modules failed to load during Fault Manager startup.  This can indicate an error in the module's configuration file, or a defect in the module or the Fault Manager.

Symptoms

If any daemon with the exception of 'fmd' dies or is terminated, the eXtended System Control Facility (XSCF) will reboot automatically.

Service Action Plan

Step 1

Reboot the eXtended System Control Facility (XSCF).

     XSCF> rebootxscf

If the module failure continues, proceed to Step 2.

Step 2

Ensure the latest version of XCP firmware is installed on the platform.  The latest version of XCP can be downloaded from the Sun Download Center.

In order to view the version of XCP installed:

XSCF> version -c xcp -v
XSCF#0 (Active )
XCP0 (Reserve): 1090
OpenBoot PROM : 02.11.0000
XSCF : 01.09.0000
XCP1 (Current): 1090
OpenBoot PROM : 02.11.0000
XSCF : 01.09.0000
XSCF#1 (Standby)
XCP0 (Reserve): 1090
OpenBoot PROM : 02.11.0000
XSCF : 01.09.0000
XCP1 (Current): 1090
OpenBoot PROM : 02.11.0000
XSCF : 01.09.0000
OpenBoot PROM BACKUP
#0: 02.09.0000
#1: 02.11.0000

In order to view the version of XSCF firmware installed:

XSCF> version -c xscf
XSCF#0 (Active )
01.09.0000(Reserve) 01.09.0000(Current)
XSCF#1 (Standby)
01.09.0000(Reserve) 01.09.0000(Current)

If the module failure continues, proceed to Step 3.

Step 3

Check the information specific to this event code in the appropriate document: FMD-8000-3F - Solaris Fault Manager component has erroneous configuration file (Doc ID 1021149.1)

Step 4

Sun SPARC Enterprise M3000
  • Contact your Authorized Service Provider to schedule the replacement of the MBU_A.
Sun SPARC Enterprise M4000/M5000
  • Contact your Authorized Service Provider to schedule the replacement of the eXtended System Control Facility Unit (xscfu).
Sun SPARC Enterprise M8000/M9000
  • In a single xscfu configuration, contact your Authorized Service Provider to schedule the replacement of the Active eXtended System Control Facility Unit (xscfu_b or xscfu_c).
  • In a dual xscfu configuration, try to fail over to the spare xscfu and check whether the errors cease or persist.
    - If the errors cease, contact your Authorized Service Provider to schedule the replacement of the error-reporting eXtended System Control Facility Unit (xscfu_b or xscfu_c).
    - If the errors persist, contact your Authorized Service Provider to obtain assistance.

Internal Section

This document is managed and updated by a team of engineers committed
to improving the accuracy of the content and formatting and presentation
of the material. Please use the Document Feedback Alias listed below
if there are comments or questions. 

Aliases and Support Information
Document Feedback Alias: [email protected]
Support Alias: [email protected]
Alias Archive: http://archives.central/alias/opl-support
Instant Message Forum: gl-esg
Call Management Queue: GL-ESG

Keywords: OPL, FRU, FMA, Fault Event, Repeat Error, Additional Troubleshooting, FMSP

Previously Published As 88172

 


Attachments
This solution has no attachment
  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.