Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition
Solution Type: FAB (standard) Sure
Solution 1303679.1: 8Gb Dual FC HBA PCIe (Metis) correctable errors on Blade servers
Oracle Confidential (PARTNER). Do not distribute to customers
Applies to:
Sun Blade X6275 M2 Server Module - Version: Not Applicable to Not Applicable [Release: N/A to N/A]
Sun Blade X6270 Server Module - Version: Not Applicable to Not Applicable [Release: N/A to N/A]
Sun Blade X6275 Server Module - Version: Not Applicable to Not Applicable [Release: N/A to N/A]
Sun Blade X6270 M2 Server Module - Version: Not Applicable to Not Applicable [Release: N/A to N/A]
Sun Fire X4800 Server - Version: Not Applicable to Not Applicable [Release: N/A to N/A]
Information in this document applies to any platform.

Affected X-Options:
SG-XPCIEFC8GBE-Q8-N - 8Gb FC HBA, EM, Qlogic
SG-XPCIEFC8GBE-Q8-Z - 8Gb FC HBA, EM, Qlogic
SG-XPCIEFCGBE-E8-Z - 8Gb FC HBA, EM, Emulex

Affected Parts: (FRU/CRU Part Number / Description)
371-4522-01 - 8Gb FC, Dual Ethernet PCIe Express Module
371-4666-01 - 8Gb FC, Dual Ethernet PCIe Express Module

Symptoms
Excessive PCIe correctable errors are detected.

Impact
When 1 or 2 Metis HBAs are installed in a blade server, excessive PCIe correctable errors are detected. When used with SLES11 and SLES11SP1, these errors, combined with a Linux erratum, will hang installation through the PEM, or hang the server at boot time. In all other operating systems, the errors are reported to log files via the Fault Management architecture mechanisms.

Customer Impact: The number of correctable errors is not within the PCIe Gen 2.0 specified limit of 10^-12 BER (or ~288 per hour maximum). While these correctable errors are not inherently harmful, the signal integrity is unacceptable for implementations. The performance impact of these errors should be negligible. However, the system may not boot when used with the SLES11 (SUSE Linux) OS on a PCIe Gen2 system.

Changes

Contributing Factors
The following products are impacted:
Sun Blade X6270
Sun Blade X6275
Sun Blade X6270M2
Sun Blade X6275M2
Sun Fire X4800 Server
SPARC T3-4
SPARC T3-1B
when one or both of the following conditions occur:
- When 1 or 2 Metis HBAs (part numbers as identified above) are installed.
- When used with SLES11 (SUSE Linux) and SLES11SP1 on a PCIe Gen2 system.

Cause

Root Cause
The root cause of this issue is the inability of this HBA card, and more specifically of the IDT PCIe Gen 2 switch on the card, to respond correctly to server DLLP settings during PCIe training. One of the more important settings, referred to as de-emphasis, sets the card's transmission and reception to either -3.5 dB or -6 dB. This card defaults to -6 dB and does not respond properly to signals instructing it to transmit at -3.5 dB. There are also secondary signal integrity (SI) issues, relating to amplitude and varying clock jitter, that are not fully understood at this time.

The remedy is to force the upstream switch lanes on the card to PCIe Gen 1 speeds when talking to the root port on the blade. The downstream ports (to the Gigabit Ethernet chip and the Fibre Channel chip) remain at their stock speeds. This mode of operation has had significant testing on multiple x86/x64 and SPARC blade systems, as any server capable of only PCIe Gen 1 speeds operates in this mode. A performance degradation of up to 8% is expected when running at full load.

For the 371-4522, this corrective action was implemented by dash rolling the part from -01 to -02 via ECO# E0000932, and purging Services inventories via GSAP# 5410, as of November 15, 2010.
For the 371-4666, this corrective action was implemented by dash rolling the part from -01 to -02 via ECO# E0002549, and purging Services inventories via GSAP# 5501, as of March 11, 2011.

Solution

Workaround
No workaround available - see Resolution section below.

Resolution
- Upon failure only, replace 371-4522-01 with 371-4522-02.
- Upon failure only, replace 371-4666-01 with 371-4666-02.

Identification of Affected Parts (how to)
The fixed HBAs will have a different EEPROM program that loads the configuration file for the IDT PCIe switch on the card.
This configuration file level cannot be identified by an OS query; therefore, visual inspection of the dash number on the card is the only way to determine whether it is affected.

Comments
This issue was evaluated and determined not to meet FCO criteria, as exposure is very limited and confined to when the PCI bus is running in Gen 2 mode.

References
ECO: E0000932, E0002549
GSAP: 5410, 5501
WW Stop Ship: SSP #20771

For information about FAB documents, their release processes, implementation strategies and billing information, go to the following (Internal Only) URL:
https://sunspace.sfbay.sun.com/display/Onestop/FAB%20(Field%20Action%20Bulletin)

In addition to the above you may email: [email protected]

Contacts:
Contributor: [email protected]
Responsible Engineer: [email protected]
Responsible Manager: [email protected]
Business Unit Group: [email protected]

Attachments
This solution has no attachment
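Because the dash number read from the card label is the only identifier available, the check described under Identification of Affected Parts can be sketched as a small lookup. This is an illustrative helper only: the function name `classify_part` and the returned strings are hypothetical and not part of any Oracle tool; the part numbers are the ones listed in this bulletin.

```python
# Hypothetical helper: classify a FRU/CRU part number read visually from the card.
# Part numbers come from this bulletin; everything else is an illustrative assumption.

AFFECTED_TO_FIXED = {
    "371-4522-01": "371-4522-02",
    "371-4666-01": "371-4666-02",
}


def classify_part(part_number: str) -> str:
    """Return a short status string for a part number as printed on the card."""
    pn = part_number.strip()
    if pn in AFFECTED_TO_FIXED:
        # Affected revision: the bulletin says to replace upon failure only.
        return f"affected: replace with {AFFECTED_TO_FIXED[pn]} upon failure only"
    if pn in AFFECTED_TO_FIXED.values():
        # Dash-rolled revision carrying the updated EEPROM program.
        return "fixed revision"
    return "not covered by this bulletin"


if __name__ == "__main__":
    for label in ("371-4522-01", "371-4666-02", "371-0000-01"):
        print(label, "->", classify_part(label))
```

The lookup works on the label text only, since the bulletin states that the configuration level cannot be read through an OS query.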
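The "~288 per hour maximum" figure quoted under Customer Impact can be reproduced with simple arithmetic. The sketch below shows one reading that yields that number. It assumes 5 GT/s per lane for PCIe Gen 2, an x8 link, and errors counted in both directions; the lane count and direction counting are assumptions and are not stated in this bulletin.

```python
# Sketch: error-count ceiling implied by a 10^-12 bit error ratio (BER).
# Assumptions (not from the bulletin): x8 link width, both directions counted.

GEN2_TRANSFERS_PER_SECOND = 5_000_000_000  # 5 GT/s per lane, per direction
BER_LIMIT_DENOMINATOR = 10**12             # at most one bit error per 10^12 bits
SECONDS_PER_HOUR = 3600


def max_errors_per_hour(lanes: int, directions: int = 2) -> float:
    """Bits transferred in one hour, divided by 10^12."""
    bits_per_hour = GEN2_TRANSFERS_PER_SECOND * SECONDS_PER_HOUR * lanes * directions
    return bits_per_hour / BER_LIMIT_DENOMINATOR


if __name__ == "__main__":
    print(max_errors_per_hour(lanes=1, directions=1))  # single lane, one direction
    print(max_errors_per_hour(lanes=8))                # assumed x8 link, both directions
```

Under these assumptions a single lane in one direction allows 18 errors per hour, and an x8 link counted in both directions allows 288, which matches the figure in the bulletin.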