Asset ID: 1-73-1280772.1
Update Date: 2012-07-02
Keywords:
Solution Type: FAB (standard) Sure Solution
1280772.1: FCO A0310-1: Sun SPARC M9000-64 XBU boards may fail because of an inadequate clock connector design.
Related Items
- Sun SPARC Enterprise M9000-64 Server
Related Categories
- PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: Sun FAB
- .Old GCS Categories>Sun Microsystems>Sun FAB>Hardware Remediation>Mandatory
Oracle Confidential (PARTNER). Do not distribute to customers.
Reason: FABs available to Internals and Partners only
Applies to:
Sun SPARC Enterprise M9000-64 Server - Version Not Applicable to Not Applicable [Release N/A]
Information in this document applies to any platform.
__________
Affected Parts:
371-2240-01 - Crossbar Unit, XBU_B
Symptoms
Customers may see FMA diagnosed faults (as reported on the Active role XSCF) similar to any of the following signatures:
Jan 07 00:15:46.5751 ereport.chassis.SPARC-Enterprise.asic.xb.test (fmdump -e output)
Jan 7 00:15:52 xscfhost Alarm: /XBU_B#3/CL,/XBU_B#3,*:SCF:XB clk-cable test error (showlogs monitor output)
Jan 7 08:09:33 xscfhost Alarm: /XBU_B#11,/XBU_B#3,*:ANALYZE:XBU-XBU interface fatal error (showlogs monitor output)
The XSCF 'showstatus' command output would appear as follows:
* XBU_B#1 Status:Deconfigured;
* XBU_B#3 Status:Faulted;
* XBU_B#5 Status:Deconfigured;
* XBU_B#7 Status:Deconfigured;
* XBU_B#9 Status:Deconfigured;
* XBU_B#11 Status:Degraded;
* XBU_B#13 Status:Deconfigured;
* XBU_B#15 Status:Deconfigured;
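When the 'showstatus' output has been saved to a file, the units that need attention can be picked out with a short filter. The following is a minimal sketch only. It assumes the output uses the `XBU_B#n Status:State;` layout shown above. The function name `xbu_suspects` and the variable `SHOWSTATUS_SAMPLE` are hypothetical, introduced here for illustration, and the script is meant to run on a workstation against saved output, not on the XSCF itself.

```shell
#!/bin/sh
# Sketch: list XBU_B units reported as Faulted or Degraded in saved
# 'showstatus' output. Reads the named files, or stdin when none is given.
# xbu_suspects is a hypothetical helper name, not an XSCF command.
xbu_suspects() {
    awk '/XBU_B#[0-9]+/ && /Status:(Faulted|Degraded);/ {
        match($0, /XBU_B#[0-9]+/)
        unit = substr($0, RSTART, RLENGTH)
        match($0, /Status:[A-Za-z]+/)
        # Skip the 7 characters of "Status:" to keep only the state name.
        print unit, substr($0, RSTART + 7, RLENGTH - 7)
    }' "$@"
}

# Sample input copied from the 'showstatus' excerpt in this document.
SHOWSTATUS_SAMPLE='* XBU_B#1 Status:Deconfigured;
* XBU_B#3 Status:Faulted;
* XBU_B#5 Status:Deconfigured;
* XBU_B#7 Status:Deconfigured;
* XBU_B#9 Status:Deconfigured;
* XBU_B#11 Status:Degraded;
* XBU_B#13 Status:Deconfigured;
* XBU_B#15 Status:Deconfigured;'

printf '%s\n' "$SHOWSTATUS_SAMPLE" | xbu_suspects
# prints:
# XBU_B#3 Faulted
# XBU_B#11 Degraded
```

Deconfigured units are left out on purpose, since in the example above they were taken offline as a consequence of the fault rather than diagnosed as its source.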
Impact
When a Crossbar Unit (XBU_B) fails, all platform domains will crash. The system will attempt to automatically deconfigure the failed unit and recover onto a degraded half backplane.
Changes
Contributing Factors
Crossbar Units (371-2240-01) residing in M9000-64 systems are vulnerable to a clock connector failure. The connector is susceptible to damage from an over-torqued cable connection. This connection exists only on an M9000-64, where the XBU_B units in the base and expansion cabinets are linked. The connector is not used in the M9000-32, so that model is not exposed to the problem.
The connector may fail at installation time, or the damage may deteriorate into a failure over time.
Because of the number of XBU spares required per system, the Regional FCO Drivers maintain the list of affected customers at the region level. This list sets the order in which systems are remediated. XBU spares are in limited supply, so your Regional FCO Driver will notify you when parts are available to order for your customer.
Cause
Root Cause
The clock connector mount on the revision -01 XBU_B was not strong enough to protect the circuit board interface from over-torquing damage. The board connector mount was redesigned to withstand greater forces without damage.
All M9000 systems manufactured after May 2009 included revision -02 or higher Crossbar Units which include the strengthened connector redesign. Service spares were reworked to -02 via GSAP 4660.A beginning on October 5, 2009.
Solution
Target Completion Date: January 20, 2013
Hot Swappable? No
Workaround
No workaround is available - see Resolution section below.
Resolution
This is a two-year proactive FCO and requires a valid hardware contract on each system to be remediated.
Until the Target Completion Date listed above, proactively replace all 371-2240-01 Crossbar Units (XBU_B) in M9000-64 systems with 371-2240-02 (or above). After the Target Completion Date, systems should be remediated only through standard break-fix processes.
Identify the number of -01 boards which must be replaced. Work with your Regional FCO Driver identified in the "Hardware Remediation and Material Availability Details" section below to order that number of Crossbar Units along with one XBU Mitigation Kit (p/n 555-1959-01) per system.
The XBU Mitigation Kit includes a clock connector torque tool, 16 static protection bags for repackaging returned Crossbar Units, and an instruction manual.
For replacement procedures, the SPARC Enterprise M8000/M9000 Servers Service Manual is available at the following URL:
http://docs.oracle.com/cd/E19415-01/E27467/E27467.pdf
An Oracle legal approved Customer Letter is attached.
Identification of Affected Parts (how to)
All 371-2240-01 residing in M9000-64 systems are impacted by this FCO. The number of units within the system can be determined by logging into the Active role XSCF and executing the 'showhardconf' command. Output will be similar to the below:
XBU_B#0 Status:Normal; Ver:0201h; Serial:PP074403LA ;
+ FRU-Part-Number:CA06620-D302 A0 /371-2240-01 ; <=== Part # found on this line
<snip>
XBU_B#15 Status:Normal; Ver:0201h; Serial:PP0744052T ;
+ FRU-Part-Number:CA06620-D302 A0 /371-2240-01 ;
Locate the FRUs identified as XBU_B#xx. Only 371-2240-01 parts are impacted. All higher dash levels have been redesigned and are not vulnerable to the clock connector damage.
Parts may also be identified in the XSCF snapshot output in the xscf_command/@tmp@cli@[email protected] file. You can easily count the number of parts by executing this command:
"grep 371-2240-01 @tmp@cli@[email protected] | wc -l"
Note: It is important that the above procedures be used to obtain an accurate count of the revision -01 boards in the platform. Intermediate field service actions may have replaced revision -01 Crossbar Units with higher level parts. The system can hold a total of 16 Crossbar Units (XBU_B).
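The grep count above gives a total but not the slot positions. The sketch below pairs each XBU_B line with the FRU-Part-Number line that follows it and prints only the slots still at revision -01. It is an illustration under stated assumptions: the input is saved 'showhardconf' output in the two-line layout shown above, the function name `list_rev01_xbu` and the variable `SAMPLE` are hypothetical, and the XBU_B#1 entry (serial number and -02 part string) is made up to show a board that has already been replaced.

```shell
#!/bin/sh
# Sketch: print the XBU_B slots whose FRU-Part-Number line carries 371-2240-01.
# Reads the named files, or stdin when none is given.
# list_rev01_xbu is a hypothetical helper name, not an XSCF command.
list_rev01_xbu() {
    awk '
        # Remember the slot named on an XBU_B line.
        /XBU_B#[0-9]+/ {
            match($0, /XBU_B#[0-9]+/)
            slot = substr($0, RSTART, RLENGTH)
            next
        }
        # The part number is expected on the line directly after it.
        /FRU-Part-Number/ {
            if (slot != "" && index($0, "371-2240-01")) print slot
            slot = ""
            next
        }
        # Any other line ends the pairing, so other FRUs are never matched.
        { slot = "" }
    ' "$@"
}

# Sample input: XBU_B#0 and XBU_B#15 are copied from this document;
# the XBU_B#1 entry is invented for illustration.
SAMPLE='XBU_B#0 Status:Normal; Ver:0201h; Serial:PP074403LA ;
+ FRU-Part-Number:CA06620-D302 A0 /371-2240-01 ;
XBU_B#1 Status:Normal; Ver:0201h; Serial:EXAMPLE0001 ;
+ FRU-Part-Number:CA06620-D302 A0 /371-2240-02 ;
XBU_B#15 Status:Normal; Ver:0201h; Serial:PP0744052T ;
+ FRU-Part-Number:CA06620-D302 A0 /371-2240-01 ;'

printf '%s\n' "$SAMPLE" | list_rev01_xbu
# prints:
# XBU_B#0
# XBU_B#15
```

Piping the result through `wc -l` gives the number of Crossbar Units to order, which should agree with the grep count.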
Hardware Remediation and Material Availability Details
At time of publication of this FAB all Regions were Materially Ready to support this activity. However, due to the number of units needed per system to address this issue, the field should work with their Regional FCO Drivers before placing orders. The Regional FCO Drivers are identified below:
North America: [email protected]
EMEA: [email protected] -or- [email protected]
Latin America: [email protected]
Japan: [email protected]
APAC: [email protected]
Comments
If you have questions about this FCO, send email to the following alias:
[email protected]
References
BugID: 6842585
ECO: WO_40103
GSAP: 4660.A
For information about FAB documents, their release processes, implementation strategies, and billing information, click here.
In addition to the above you may email:
[email protected]
Contacts:
Contributor: [email protected]
Responsible Engineer: [email protected]
Responsible Manager: [email protected]
Business Unit Group: Systems Group-OPL (Fujitsu, M4000 through M9000)
Attachments
This solution has no attachment