Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-72-1494894.1
Update Date:2012-10-08
Keywords:

Solution Type  Problem Resolution Sure

Solution  1494894.1 :   Sun Netra T5440 Server experiencing /SYS/MB/PCI_MEZZ Forced Fail (POST), but cause is MB/CMP0/PCI-AUX/SWITCH0  


Related Items
  • Sun Netra T5440 Server
Related Categories
  • PLA-Support>Sun Systems>SPARC>CMT>SN-SPARC: T5xx0


This article helps the Technical Service Engineer perform a complete and accurate diagnosis of a recurring issue on the Netra T5440. Incomplete data review can lead to misdiagnosis of this issue, resulting in ineffective Action Plans and incorrect part replacements.

In this Document
Symptoms
Changes
Cause
Solution


Created from <SR 3-6243820605>

Applies to:

Sun Netra T5440 Server - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms

The following POST output appears on the console of a Netra T5440:

2012-09-28 00:55:34.733 0:0:0>ERROR:
2012-09-28 00:55:34.736 0:0:0> POST toplevel status has the following failures:
2012-09-28 00:55:34.742 0:0:0> I/O ----------------------------------
2012-09-28 00:55:34.750 0:0:0> MB/CMP0/PCI-AUX/SWITCH0
2012-09-28 00:55:34.755 0:0:0>END_ERROR
2012-09-28 00:55:34.761 0:0:0>POST: Return to VBSC.
2012-09-28 00:55:34.766 0:0:0>Master set ACK for vbsc runpost command and spin...
Fault | critical: SP detected fault at time Fri Sep 28 00:55:36 2012. /SYS/MB/PCI_MEZZ Forced Fail (POST)

 
Because of this fault, the system eventually cannot get past the failed POST milestone and can no longer boot the OS.


Verification

In some instances the fault is intermittent, and the system may get past the "failed" POST milestone, reach OBP (ok prompt), and boot the OS long enough to gather diagnostic data such as an explorer. If that is not possible, gather the data needed for an effective diagnosis from the system controller.


sc> showfaults -v
Last POST Run: Wed Sep 26 04:33:21 2012

Post Status: Failed devices: MB/PCI_MEZZ
 ID Time                           FRU               Class             Fault
  1 Sep 26 03:23:26                /SYS/MB/PCI_MEZZ                    SP detected fault: /SYS/MB/PCI_MEZZ Forced Fail (POST)

sc> showcomponent
...
Disabled Devices
 /SYS/MB/PCI_MEZZ Forced Fail (POST)


sc> showhost
Sun System Firmware 7.2.10 2010/07/19 17:10

Host flash versions:
  Hypervisor 1.7.9 2010/07/19 15:51
  OBP 4.30.9 2010/07/16 09:06
  POST 4.30.9 2010/07/16 09:47

sc> showfru
...
     /Status_EventsR[3]/UNIX_Timestamp32: Thu, Sep 13 2012 03:19:57 GMT
     /Status_EventsR[3]/Old_Status: 0x00 (OK)
     /Status_EventsR[3]/New_Status: 0xC0 (DISABLED, MAINTENANCE REQUIRED)
     /Status_EventsR[3]/Initiator: SCAPP
     /Status_EventsR[3]/Component: 0
     /Status_EventsR[3]/Event_Code: 02000000
     /Status_EventsR[3]/Message: /SYS/MB/PCI_MEZZ Forced Fail (POST)
     /Status_EventsR[4]
     /Status_EventsR[4]/UNIX_Timestamp32: Wed, Sep 26 2012 03:23:26 GMT
     /Status_EventsR[4]/Old_Status: 0xC0 (DISABLED, MAINTENANCE REQUIRED)
     /Status_EventsR[4]/New_Status: 0xC0 (DISABLED, MAINTENANCE REQUIRED)
     /Status_EventsR[4]/Initiator: SCAPP
     /Status_EventsR[4]/Component: 0
     /Status_EventsR[4]/Event_Code: 02000000
     /Status_EventsR[4]/Message: /SYS/MB/PCI_MEZZ Forced Fail (POST)


sc> poweron
0:0:0>Sun Netra[TM] T5440 POST 4.30.9 2010/07/16 09:47
  /export/delivery/delivery/4.30/4.30.9/post4.30.9-micro/Niagara/congo/integrated (root)
0:0:0>Copyright (c) 2010, Oracle and/or its affiliates. All rights reserved.
...
0:0:0>MAU Tests...Done
0:0:0>NCU Setup and PIU link train....Done
0:0:0>
0:0:0>ERROR: TEST = PIU PCI id test
0:0:0>H/W under test = MB/CMP0/PCI-AUX/SWITCH0
0:0:0>Repair Instructions: Replace items in order listed by 'H/W under test' above.
0:0:0>MSG = ERROR: PCIE Link Width for device: MB/CMP0/PCI-AUX/SWITCH0.   Expected: 8
  Actual: 4
0:0:0>END_ERROR
0:0:0>
0:0:0>ERROR: TEST = PIU PCI id test
0:0:0>H/W under test = MOTHERBOARD, CPU_CHIP (system initialization)
0:0:0>Repair Instructions: Replace items in order listed by 'H/W under test' above.
0:0:0>MSG =
  *** Test Failed!! ***
0:0:0>END_ERROR
0:0:0>NEPTUNE Network Interface Unit Tests....DoChassis | major: Sep 28 00:55:34 ERROR: POST errors detected
2012-09-28 00:55:34.733 0:0:0>ERROR:
2012-09-28 00:55:34.736 0:0:0> POST toplevel status has the following failures:
2012-09-28 00:55:34.742 0:0:0> I/O ----------------------------------
2012-09-28 00:55:34.750 0:0:0> MB/CMP0/PCI-AUX/SWITCH0
2012-09-28 00:55:34.755 0:0:0>END_ERROR
2012-09-28 00:55:34.761 0:0:0>POST: Return to VBSC.
2012-09-28 00:55:34.766 0:0:0>Master set ACK for vbsc runpost command and spin...
Fault | critical: SP detected fault at time Fri Sep 28 00:55:36 2012. /SYS/MB/PCI_MEZZ Forced Fail (POST)
Chassis | major: Host is running

Netra T5440, No Keyboard
Copyright (c) 1998, 2010, Oracle and/or its affiliates. All rights reserved.
OpenBoot 4.30.9, 32544 MB memory available, Serial #82331574.
Ethernet address 0:14:4f:e8:47:b6, Host ID: 84e847b6.
 
ERROR: The following devices are disabled:
  MB/PCI_MEZZ
Aborting auto-boot sequence.
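
The key detail in the output above is that the FRU flagged by the service processor (/SYS/MB/PCI_MEZZ) is not the device POST actually tested (MB/CMP0/PCI-AUX/SWITCH0), and that the failure is a PCIe link width mismatch. The following is a minimal Python sketch, not a supported tool, of how a saved console capture could be scanned for both facts. The sample text is copied from this article; the names POST_LOG, LINK_WIDTH, FORCED_FAIL, and summarize_post are hypothetical and chosen only for illustration.

```python
import re

# Sample lines copied from the POST output in this article.
POST_LOG = """\
0:0:0>ERROR: TEST = PIU PCI id test
0:0:0>H/W under test = MB/CMP0/PCI-AUX/SWITCH0
0:0:0>Repair Instructions: Replace items in order listed by 'H/W under test' above.
0:0:0>MSG = ERROR: PCIE Link Width for device: MB/CMP0/PCI-AUX/SWITCH0.   Expected: 8
  Actual: 4
0:0:0>END_ERROR
Fault | critical: SP detected fault at time Fri Sep 28 00:55:36 2012. /SYS/MB/PCI_MEZZ Forced Fail (POST)
"""

# The console may wrap "Actual:" onto the next line, so any whitespace
# (including a newline) is allowed between the two values.
LINK_WIDTH = re.compile(
    r"PCIE Link Width for device:\s*(?P<device>\S+?)\.\s+"
    r"Expected:\s*(?P<expected>\d+)\s+Actual:\s*(?P<actual>\d+)"
)
FORCED_FAIL = re.compile(r"(?P<fru>/SYS/\S+) Forced Fail \(POST\)")


def summarize_post(log_text):
    """Return (link width mismatches, FRUs reported as Forced Fail)."""
    mismatches = []
    for m in LINK_WIDTH.finditer(log_text):
        expected, actual = int(m["expected"]), int(m["actual"])
        if expected != actual:
            mismatches.append((m["device"], expected, actual))
    failed_frus = sorted({m["fru"] for m in FORCED_FAIL.finditer(log_text)})
    return mismatches, failed_frus


mismatches, failed_frus = summarize_post(POST_LOG)
for device, expected, actual in mismatches:
    print(f"{device}: link trained at x{actual}, expected x{expected}")
print("Forced Fail FRUs:", ", ".join(failed_frus))
```

Reporting the tested device alongside the Forced Fail FRU makes it harder to stop at the /SYS/MB/PCI_MEZZ message and overlook the switch link width error behind it.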

 

Changes

This typically occurs after mezzanine replacement.

Cause

This is typically caused by jumper J19 not being installed on the mezzanine. The Netra X4450 and Netra T5440 use the same base mezzanine board, but the T5440 requires jumper J19 to be installed. If this jumper is already installed, then the motherboard is most likely at fault, especially if the fault is intermittent during boot.
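
The cause analysis above amounts to a two-way decision on the state of jumper J19. A minimal sketch in Python, assuming the engineer has physically inspected the mezzanine; the function name next_action and its parameters are hypothetical and serve only to make the order of checks explicit:

```python
def next_action(j19_installed, fault_is_intermittent=False):
    """Suggest the first item to address, following the cause described in this article."""
    if not j19_installed:
        # The T5440 requires J19; the shared X4450/T5440 mezzanine may lack it.
        return "Install jumper J19 on the mezzanine"
    action = "Suspect the motherboard"
    if fault_is_intermittent:
        # An intermittent fault during boot points more strongly at the motherboard.
        action += " (intermittent fault during boot supports this)"
    return action


print(next_action(j19_installed=False))
print(next_action(j19_installed=True, fault_is_intermittent=True))
```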
 

Solution

If jumper J19 is not installed on the mezzanine, move the jumper over from the prior board. Otherwise, replace the system board first.  The proper FRU can be determined from the showfru command output.

sc> showfru
/SYS/MB (container)
  SEGMENT: FL
     /Configured_LevelR
     /Configured_LevelR/UNIX_Timestamp32: Sat, Jan 22 2011 04:00:53 GMT
     /Configured_LevelR/Sun_Part_No: 5420238                        <--- ***
     /Configured_LevelR/Configured_Serial_No: 1005LCB-1045DW0009
     /Configured_LevelR/HW_Dash_Level: 01

/SYS/MB/PCI_MEZZ (container)
   SEGMENT: FD
      /InstallationR (1 iterations)
      /InstallationR[0]
      /InstallationR[0]/UNIX_Timestamp32: 2011-04-20T21:14:17+00:00
      /InstallationR[0]/Fru_Path: /SYS/MB/PCI_MEZZ
      /InstallationR[0]/Parent_Part_Number: 3713646   <----- contained in FRU 540-7689 in the SSH
      /InstallationR[0]/Parent_Serial_Number: 42003O
      /InstallationR[0]/Parent_Dash_Level: 03

 

Which part to act on depends on the jumper installation. Mezzanine part 371-3646 is contained in FRU 540-7689 in the Sun System Handbook (SSH).  The proper system board FRU in the SSH is 542-0238: 8-Core 1.4GHz System Board Assembly, SATA.
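
The showfru output prints part numbers without a dash (5420238, 3713646), while the SSH lists them with a dash after the third digit (542-0238, 371-3646). The following Python sketch pulls the part numbers out of saved showfru text and converts them to the SSH form. It assumes every part number of interest is seven digits, as the two in this article are; the sample text is abbreviated from the output above, and the names SHOWFRU, handbook_format, and part_numbers are hypothetical.

```python
import re

# Abbreviated from the showfru output in this article.
SHOWFRU = """\
/SYS/MB (container)
     /Configured_LevelR/Sun_Part_No: 5420238
/SYS/MB/PCI_MEZZ (container)
      /InstallationR[0]/Parent_Part_Number: 3713646
"""


def handbook_format(part_no):
    """Insert a dash after the third digit: '5420238' -> '542-0238'."""
    digits = part_no.strip()
    if not re.fullmatch(r"\d{7}", digits):
        raise ValueError(f"expected a 7-digit part number, got {part_no!r}")
    return f"{digits[:3]}-{digits[3:]}"


def part_numbers(showfru_text):
    """Map each FRU container path to the part numbers listed under it."""
    result = {}
    container = None
    for line in showfru_text.splitlines():
        header = re.match(r"(/SYS\S*) \(container\)", line)
        if header:
            container = header.group(1)
            continue
        field = re.search(r"/(Sun_Part_No|Parent_Part_Number):\s*(\d+)", line)
        if field and container:
            result.setdefault(container, []).append(handbook_format(field.group(2)))
    return result


for fru, parts in part_numbers(SHOWFRU).items():
    print(fru, parts)
```

The mezzanine entry is a parent part number, not an orderable FRU, so the result still has to be looked up in the SSH to find the containing FRU (540-7689 in this case).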

 


Attachments
This solution has no attachment
  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.