Document Audience: | INTERNAL |
Document ID: | A0261-1 |
Title: | Sun Fire V20z and V40z with affected Infineon 1GB DDR400 DIMMs may experience memory errors causing system Panics. |
Copyright Notice: | Copyright © 2007 Sun Microsystems, Inc. All Rights Reserved |
Update Date: | Thu Feb 02 00:00:00 MST 2006 |
__________________________________________________________________
*** Sun Confidential: Internal Use and Authorized VARs Only ***
__________________________________________________________________
This message including any attachments is confidential information
of Sun Microsystems, Inc. Disclosure, copying or distribution is
prohibited without permission of Sun. If you are not the intended
recipient, please reply to the sender and then delete this message
__________________________________________________________________
FIELD CHANGE ORDER
(For Authorized Distribution by Sun Services)
FCO #: A0261-1
Status: active
Synopsis: Sun Fire V20z and V40z with affected Infineon 1GB DDR400 DIMMs may experience memory errors causing system Panics.
Date: Feb/02/2006
Top FIN/FCO Report: Yes
PRODUCT REFERENCE: Sun Fire V20z/V40z
Product Category: Desktop / System component
Product Affected:
Platform Description
-------- -----------
A55 Sun Fire V20z
A57 Sun Fire V40z
X-Options Affected
--------- -------
Mkt_ID Platform Model Description
------ -------- ----- -----------
X9296A A55 / A57 - 2GB, 2x1GB 400MHz SFV20zV40z
Parts Affected:
Part Number Description
----------- -----------
370-7805-01 1GB PC3200 DDR400 DIMM (single DIMM, non-CRU)
540-6428-01 2GB (2 x 1GB DIMMs), CRU
References:
URL: http://sunsolve.central.sun.com/handbook_internal/Devices/Memory/MEM_SunFireV20z.html#9295
Issue Description:
A limited number of Infineon 1GB DDR400 DIMMs (Infineon part number
HYS72D128300GBR-5-B) may experience multiple bit memory errors, which
can lead to system Panics. Only Infineon 1GB DIMMs with the affected
part number as listed above AND with a "Manufacture Date" of 2005-05-27
or earlier are affected.
The system will panic with a generic TRAP and the memory errors will be
logged to /var/adm/messages.
Under normal default settings the system reboots after the panic, but this
depends on the OS settings. Panics typically do not cause data loss, but in
rare instances some data may be lost between the time of the memory error and
the system panic.
When ECC errors occur at a high rate within the polling interval, multiple
recovered ECC events are recorded as a single event.
To help the field communicate this issue to customers, a Customer Letter
is available at the following "Internal Only" URL:
http://sdpsweb.central/FIN_FCO/FCO/A0261-1/SPE/Customer_Letter.sxw
Root cause analysis determined this was caused by a design marginality issue
with specific Infineon DIMM IDT registers. When a register input parameter
(timing/level) gets too tight, the register sporadically shows improper
functionality.
Additionally, there was a process escape in the manufacturing where
pre-release material was incorrectly stocked in the same room as production
material. When production needed parts, they incorrectly selected the
pre-release parts.
All affected Infineon DIMMs were purged via Sun Stop Ship as of May 12, 2005.
This Stop Ship was lifted on May 27, 2005. Affected parts at factory, X-dock,
VMI, and inventories were replaced with Micron DIMMs. A new Infineon 1GB
DDR400 DIMM with TI registers and TI PLL has been qualified and will be used
as the replacement part for this FCO.
Corrective Action for the pre-release material escape was to segregate the
pre-release material from production material. Tighter controls were
established at Infineon to prevent mixed vendors of registers and PLLs on
DIMM Bills of Materials (BOMs).
All RSLs were evaluated, and it was determined that no Infineon 1GB DIMMs
were shipped as spares to Sun Services. Therefore no RSL purge was required.
IMPLEMENTATION TYPE:
---
| X | MANDATORY (Fully Pro-Active)
---
---
| | CONTROLLED PRO-ACTIVE (per Sun Geo Plan)
---
---
| | UPON FAILURE
---
IMPLEMENTATION TARGET COMPLETION DATE: March 30, 2006
Replacement Time Estimate:
0.5 hours
Special Considerations:
Because the replacement part is a CRU, customers should perform identification
and onsite replacement unless they hold a Silver or above contract on the
affected system, AND specifically request onsite support from Sun.
This FCO will have a time zone phased release based on material readiness
as follows:
Time Zone       Readiness Date
---------       --------------
US/Canada       READY
Latin America   READY
EMEA            READY
APAC            READY
ANZO            Sep/23/05
Japan           READY
The above dates represent when each time zone has determined that it will be
materially ready to support this FCO. All dates are estimates. Please check
with your Logistics Representative for more information with regard to material
availability.
Corrective Action:
Hot Swappable: No
Identify and replace all affected DIMMs as follows:
Replace affected Infineon DIMMs (Sun p/n 370-7805-01) in pairs with like part
numbers. All affected Infineon DIMMs have been purged from all Sun Stocking
Locations.
Replacement DIMMs are packaged in pairs and should be ordered and returned
in pairs as Sun p/n 540-6428.
Note! DIMMs must be replaced in pairs. Use the Identification section below
to identify affected DIMMs.
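As a rough planning aid, the pair rule above can be turned into a count of replacement kits (Sun p/n 540-6428, two DIMMs each). The sketch below is illustrative only. It ASSUMES that DIMM slots are paired as 0/1 and 2/3 on each CPU, which this FCO does not state; confirm the pairing against the platform documentation before ordering. The function name and the (cpu, dimm) tuple format are hypothetical.

```python
def kits_to_order(affected_slots):
    """Return the number of 2-DIMM replacement kits needed.

    affected_slots: iterable of (cpu_number, dimm_number) tuples for DIMMs
    already identified as affected.

    ASSUMPTION (not from the FCO): slots 0/1 form one pair and slots 2/3
    form another pair on each CPU. If either DIMM of a pair is affected,
    the whole pair is replaced with one kit.
    """
    pairs = {(cpu, dimm // 2) for cpu, dimm in affected_slots}
    return len(pairs)


if __name__ == "__main__":
    # Hypothetical example: both DIMMs of one pair on CPU 0, one DIMM on CPU 1.
    print(kits_to_order([(0, 0), (0, 1), (1, 2)]))
```

Counting pairs rather than individual DIMMs keeps the order quantity aligned with how the replacement parts are packaged and returned.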
Identification of Affected DIMMs:
---------------------------------
Note: A customer list with SUSPECT systems is available via the link below.
Not all systems on the list may be affected by this FCO. For systems
in the list, follow all the instructions below to determine if they
have any affected DIMMs.
http://sunwebcollab.central.sun.com/gm/document-1.9.1162833
(Note: The above URL is not accessible to those outside of SWAN.)
Affected DIMMs can be identified by either of the following two methods,
after first logging in to the SP CLI from a remote server:
ssh -l
The login name is the manager account set up during first use of the stinger.
If the machine is fresh out of the box, it probably does not have a manager account set up,
so you must first go through the step of setting up a manager account in the SP by
ssh -l setup
1) Service Processor (SP) command (does not require a system power down)
To identify through the SP, enter "inventory get hardware", which will present
output like the following:
localhost # inventory get hardware
Name Type OEM Manufacture Date HW Rev Part #
CPU 0 DIMM 0 memory c100000000000000 2004-02-26 0262 72D128300GBR5B
CPU 0 DIMM 1 memory c100000000000000 2004-02-26 0262 72D128300GBR5B
CPU 0 DIMM 2 memory c100000000000000 2004-08-12 0204 72D128300GBR5B
CPU 0 DIMM 3 memory c100000000000000 2004-08-12 0204 72D128300GBR5B
DDR 0 VRM memvrm S-SCI431 2005-01-13 X01 S01479
CPU 0 cpu AuthenticAMD NA x86 Family 15 Model 33 Stepping 2
CPU 0 VRM vrm S-SCI431 2005-03-28 X01 S02325
CPU 1 DIMM 0 memory c100000000000000 2005-03-26 0209 72D128300GBR5B
CPU 1 DIMM 1 memory c100000000000000 2005-03-26 0209 72D128300GBR5B
CPU 1 DIMM 2 memory c100000000000000 2005-03-26 0308 72D128300GBR5B
CPU 1 DIMM 3 memory c100000000000000 2005-03-26 0308 72D128300GBR5B
DDR 1 VRM memvrm S-SCI431 2005-01-24 X01 S01479
CPU 1 cpu AuthenticAMD NA x86 Family 15 Model 33 Stepping 2
CPU 1 VRM vrm S-SCI431 2005-03-28 X01 S02325
PIC frontpanel NA
Motherboard planar S-SCI431 2005-03-28 A01 S02595
A DIMM is affected only if both of the following are true:
a) "Part #" is 72D128300GBR5B
b) "Manufacture Date" for that DIMM is 2005-05-27 or earlier.
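When many systems have to be checked, the two criteria above can be applied to saved "inventory get hardware" output with a short script. The following is a minimal sketch, not a Sun-supplied tool. It assumes the output has been captured to text and that each DIMM row is whitespace-separated in the order shown in the sample above (name, type, OEM, manufacture date, HW rev, part number). The function and constant names are hypothetical, and the last DIMM row in SAMPLE is a made-up, unaffected entry added for illustration.

```python
import re
from datetime import date

AFFECTED_PART = "72D128300GBR5B"   # "Part #" as reported by the SP
CUTOFF = date(2005, 5, 27)         # this manufacture date or earlier is affected

# Matches rows such as:
# CPU 0 DIMM 0 memory c100000000000000 2004-02-26 0262 72D128300GBR5B
DIMM_ROW = re.compile(
    r"^\s*(CPU\s+\d+\s+DIMM\s+\d+)\s+memory\s+\S+\s+"
    r"(\d{4}-\d{2}-\d{2})\s+\S+\s+(\S+)\s*$"
)


def affected_dimms(inventory_text):
    """Return (slot name, manufacture date) for every affected DIMM row."""
    hits = []
    for line in inventory_text.splitlines():
        match = DIMM_ROW.match(line)
        if match is None:
            continue  # header, VRM, CPU, and other non-DIMM rows
        name, made_text, part = match.groups()
        made = date.fromisoformat(made_text)
        if part == AFFECTED_PART and made <= CUTOFF:
            hits.append((" ".join(name.split()), made.isoformat()))
    return hits


SAMPLE = """\
Name Type OEM Manufacture Date HW Rev Part #
CPU 0 DIMM 0 memory c100000000000000 2004-02-26 0262 72D128300GBR5B
DDR 0 VRM memvrm S-SCI431 2005-01-13 X01 S01479
CPU 1 DIMM 0 memory c100000000000000 2005-03-26 0209 72D128300GBR5B
CPU 1 DIMM 2 memory c100000000000000 2005-06-10 0308 72D128300GBR5B
"""

if __name__ == "__main__":
    for slot, made in affected_dimms(SAMPLE):
        print(slot, made)
```

Both conditions are tested together so that a DIMM with the listed part number but a later manufacture date is not flagged.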
2) Visual Inspection (requires opening up the system and looking at the DIMM)
The affected Infineon 1GB DDR400 DIMM has a white Infineon label with the part number
"HYS72D128300GBR-x-x", and a manufacturing code such as "BVV52025079". The fourth, fifth, and
sixth characters of the code (the three digits immediately following the letters BVV) are
the date code that identifies the affected DIMMs. For example,
Label shows BVV51225079
               ^^^
               |||
            Year||
                ||
                Week
The date range of the affected DIMMs is:
521 (year 2005, week 21) and any week prior (e.g. BVV52125079, BVV52025079, BVV51925079...)
DIMMs with a date code of 522 or later are not affected.
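The date-code rule above can also be written as a small check, which may help when recording label codes from several DIMMs. This is an illustrative sketch only. It assumes the code always starts with the letters BVV followed by one year digit and two week digits, as in the examples in this FCO, and that year digits refer to 200x so that they compare numerically. The function names are hypothetical.

```python
import re

# Last affected date code: year digit 5 (2005), week 21.
LAST_AFFECTED = (5, 21)


def date_code(label_code):
    """Split a manufacturing code such as 'BVV51225079' into (year digit, week)."""
    match = re.fullmatch(r"BVV(\d)(\d{2})\d+", label_code.strip())
    if match is None:
        raise ValueError("unexpected manufacturing code: %r" % label_code)
    return int(match.group(1)), int(match.group(2))


def is_affected(label_code):
    """True if the date code is 521 (2005, week 21) or earlier."""
    # Tuple comparison orders by year digit first, then by week.
    return date_code(label_code) <= LAST_AFFECTED


if __name__ == "__main__":
    for code in ("BVV51225079", "BVV52125079", "BVV52225079"):
        print(code, is_affected(code))
```

Comparing (year, week) tuples rather than the raw three-digit number keeps the intent explicit. This check covers only the date code; the label must also carry the affected Infineon part number.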
Photograph of affected DIMM and label to identify the affected Manufacturing Date code:
http://sdpsweb.central/FIN_FCO/FCO/A0261-1/SPE/Infineon.jpg
Send email to [email protected] for questions or comments about
this Field Change Order.
Comments:
Complete instructions for the specific components mentioned can be found via the following
URL, which links to the complete list of V20z/V40z documentation.
http://www.sun.com/products-n-solutions/hardware/docs/Servers/Workgroup_Servers/Sun_Fire_V20z/index.html
CHANGE HISTORY:
Sep/23/05 - Removed Controlled Proactive verbiage for Special Considerations
section. Republished FCO to distribution alias.
Oct/07/05 - Corrected broken links to Customer Letter and Photograph of affected DIMM.
Feb/02/06 - Modified Note in the "Identification of Affected DIMMs" section of the
Corrective Action to explain that not all systems in the Customer List
may be affected.
________________________________________________________________________
NOTE: FCO Tracking Instructions for Radiance/SPWeb:
--------------------------------------------------
If a Radiance case involves the application of an FCO to solve a customer
issue, please complete the following steps in Radiance/SPWeb prior to
closing the case:
o Select "Field Change Order" in the REFERENCE TYPE field.
o Enter FCO ID number in the REFERENCE ID field.
For example; A0222-1.
If possible, include additional details in the REFERENCE SUMMARY field
(e.g. upgrade complete, customer declined, etc.)
________________________________________________________________________
Implementation Notes
--------------------
In case of "Mandatory" FCOs, Sun Services will attempt to contact
all known customers to recommend proactive implementation.
For "Controlled Proactive" FCOs, Sun Services mission critical
support teams will initiate proactive implementation efforts for
their respective accounts, as required.
For "Upon Failure" FCOs, Sun Services and partners will implement
the necessary corrective actions as the need arises.
The CIC process must be used for proactive hardware replacement
requests when an FCO is classified as "Upon Failure".
Billing Information
-------------------
Warranty: Sun will provide parts at no charge under Warranty
Service. On-Site Labor Rates are based on specified
Warranty deliverables for the affected product.
Contract: Sun will provide parts at no charge. On-Site Labor Rates
are based on the type of service contract.
Non Contract: Sun will provide parts at no charge. Installation by
Sun is available based on the On-Site Labor Rates
defined in the Price List.
________________________________________________________________________
All FCO documents are accessible via Internal SunSolve. Type "sunsolve"
in a browser and follow the prompts to Search Collections.
For questions on this document, please email:
[email protected]
The FCO homepage is available at:
http://tns.central/FCO/
For more information on how to submit an FCO, go to:
http://pronto.central/fco.html
To access the Service Partner Exchange, use:
https://spe.sun.com
________________________________________________________________________