Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition
Solution Type FAB (standard) Sure Solution 1000859.1: PCI and PCI+ IO Assemblies in Sun Fire 4800/4810/6800 or Sun Fire E4900/E6900 systems may fail to power up if one or more QGE cards are installed.

Previously Published As 201140

Product
Sun Fire 4800 Server
Sun Fire 4810 Server
Sun Fire 6800 Server
Sun Fire E6900 Server
Sun Fire E4900 Server

Bug Id <SUNBUG: 6237685>

Part
Impact

If one or more Quad GigaSwift Ethernet UTP (QGE) adapters are present in a PCI or PCI+ IO Assy, the IO Assy may fail to power up. This prevents domains from using any resources in that IO Assy.

Contributing Factors

This affects Sun Fire 4800, 4810, 6800, E4900, and E6900 systems where a QGE card, Option X4444A, is installed in a PCI or PCI+ IO Assy. The issue occurs only if additional HBAs/NICs are installed in the same IO Assy along with the QGE card(s). It has been seen in configurations with 7 or 8 total cards in an IO Assy.

This issue does not affect OS-booted production domains. It is not seen once the system has booted successfully, unless the system is later shut down and then powered on again.

Because the PCI or PCI+ IO Assy fails before the affected domain is booted, it is not possible to identify the affected QGE card using prtdiag(1M). However, here is an excerpt from typical prtdiag output showing a QGE card installed in slot 4:

========================= IO Cards =========================

                               Bus  Max
           IO   Port Bus       Freq Bus  Dev,
FRU Name   Type ID   Side Slot MHz  Freq Func State Name                             Model
---------- ---- ---- ---- ---- ---- ---- ---- ----- -------------------------------- --------------
/N0/IB7/P1 PCI  27   B    4    33   33   1,0  ok    pci-pci8086,b154.0/pci (pci)     pci-bridge
/N0/IB7/P1 PCI  27   B    4    33   33   0,0  ok    pci-pci8086,b154.0/network (netw+ pci-bridge
/N0/IB7/P1 PCI  27   B    4    33   33   0,0  ok    network-pci100b,35.30            SUNW,pci-qge
/N0/IB7/P1 PCI  27   B    4    33   33   1,0  ok    network-pci100b,35.30            SUNW,pci-qge
/N0/IB7/P1 PCI  27   B    4    33   33   4,0  ok    pci-pci8086,b154.0/network (netw+ pci-bridge
/N0/IB7/P1 PCI  27   B    4    33   33   2,0  ok    network-pci100b,35.30            SUNW,pci-qge
/N0/IB7/P1 PCI  27   B    4    33   33   3,0  ok    network-pci100b,35.30            SUNW,pci-qge
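prtdiag output saved while a domain was still booted can be scanned to find which IO Assys hold QGE cards and how many slots are populated. The following is a minimal Python sketch, not a Sun-supplied tool. The function names, the sample text, and the rule that the first two components of the FRU name identify the IO Assy are assumptions made for illustration. The threshold of 7 comes from the configurations reported above and is not a guaranteed failure boundary.

```python
from collections import defaultdict

# Rows in the whitespace-separated layout of the prtdiag excerpt above.
SAMPLE = """\
/N0/IB7/P1 PCI  27   B    4    33   33   1,0  ok    pci-pci8086,b154.0/pci (pci)     pci-bridge
/N0/IB7/P1 PCI  27   B    4    33   33   0,0  ok    network-pci100b,35.30            SUNW,pci-qge
/N0/IB7/P1 PCI  27   B    4    33   33   1,0  ok    network-pci100b,35.30            SUNW,pci-qge
"""

QGE_MODEL = "SUNW,pci-qge"  # model string shown for QGE ports in the excerpt
CARD_THRESHOLD = 7          # issue was seen with 7 or 8 cards per IO Assy


def summarize_io_cards(prtdiag_text):
    """Map each IO Assy to its populated slots and the slots holding QGE cards."""
    boards = defaultdict(lambda: {"slots": set(), "qge_slots": set()})
    for line in prtdiag_text.splitlines():
        fields = line.split()
        # Data rows start with a FRU path such as /N0/IB7/P1; skip everything else.
        if len(fields) < 11 or not fields[0].startswith("/N"):
            continue
        # Assumption: the first two path components (/N0/IB7) name the IO Assy.
        board = "/".join(fields[0].split("/")[:3])
        slot = fields[4]
        boards[board]["slots"].add(slot)
        # The model is the last column; the Name column may contain spaces.
        if fields[-1] == QGE_MODEL:
            boards[board]["qge_slots"].add(slot)
    return dict(boards)


def boards_at_risk(summary, threshold=CARD_THRESHOLD):
    """Return IO Assys that hold a QGE card and have at least `threshold` cards."""
    return sorted(
        board
        for board, info in summary.items()
        if info["qge_slots"] and len(info["slots"]) >= threshold
    )


if __name__ == "__main__":
    summary = summarize_io_cards(SAMPLE)
    for board, info in summary.items():
        print(board, "slots:", sorted(info["slots"]),
              "QGE slots:", sorted(info["qge_slots"]))
    print("at risk:", boards_at_risk(summary))
```

A card appears on several rows (one per bridge and port), so the sketch counts distinct slot numbers rather than rows.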
When this occurs during power-on of a PCI or PCI+ IO Assy, a message similar to the following appears:

nspga:A> poweron all
/N0/SB0: powered on
/N0/SB2: powered on
Mar 08 10:54:00 nspga Domain-A.SC: sun.serengeti.HpuFailedException: PCI I/O Board at /N0/IB6
/N0/IB6: powered on
/N0/IB8: powered on

Root Cause

It has been determined that the initial power surge during power-on of the QGE card can cause the DC-DC converter on the PCI or PCI+ IO Assy to shut down. The QGE card consumes high power during power-up initialization, and the cumulative surge from all installed cards may cause the DC-DC converter to shut down. However, if the DC-DC converter "rides through" the initial power-up, the issue is not encountered under nominal operation.

The issue has been addressed with an update to the QGE card (ECO WO_31157) that limits the initial current spike at power-on. The part number has been dash rolled from -07 to -08. GSAP 3122 was implemented to purge P/N 501-6522-07 and lower from Service's inventory.

Workaround

The workaround is to remove cards that are not being used (if such cards can be identified). This decreases the cumulative power surge and avoids overloading the DC-DC converter on the PCI or PCI+ IO Assy. Isolating the QGE card to an IO Assy configuration with fewer populated slots is required to ensure the power draw issue does not affect any IO Assy operation.

Resolution

If customers encounter this issue, replace any offending QGE card with the fixed version.
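The two checks implied above, spotting the failed IO Assy in the System Controller output and deciding whether a QGE card predates the fix, can be sketched in Python as follows. This is an illustrative sketch only. The helper names are hypothetical, the regular expression is based solely on the single sample message shown above (real SC messages may be worded differently), and the dash-level rule simply encodes "501-6522-07 and lower".

```python
import re

# Pattern based on the one sample message in this bulletin; an assumption,
# not a documented message format.
HPU_FAILED_RE = re.compile(r"HpuFailedException: .*? at (/N\d+/IB\d+)")

# QGE part number with a two-digit dash level, e.g. 501-6522-07.
QGE_PART_RE = re.compile(r"^501-6522-(\d{2})$")

FIXED_DASH_LEVEL = 8  # the fix was introduced with the -08 dash roll


def failed_io_boards(console_text):
    """Return the IO Assy paths named in HpuFailedException messages."""
    return HPU_FAILED_RE.findall(console_text)


def predates_fix(part_number):
    """Return True if a QGE part number is at dash level -07 or lower."""
    match = QGE_PART_RE.match(part_number.strip())
    if match is None:
        raise ValueError(f"not a 501-6522-NN part number: {part_number!r}")
    return int(match.group(1)) < FIXED_DASH_LEVEL


if __name__ == "__main__":
    console = (
        "/N0/SB0: powered on\n"
        "Mar 08 10:54:00 nspga Domain-A.SC: "
        "sun.serengeti.HpuFailedException: PCI I/O Board at /N0/IB6\n"
        "/N0/IB8: powered on\n"
    )
    print("failed IO Assys:", failed_io_boards(console))
    for part in ("501-6522-07", "501-6522-08"):
        print(part, "predates fix:", predates_fix(part))
```

A card that predates the fix is only a replacement candidate when this specific failure has actually been observed.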
This is an Upon Failure remediation and should only be performed if this specific issue has been identified at a customer site.

Comments

This issue was fully evaluated as an FCO candidate via the official FCO process. However, the FCO was rejected due to the very low expected failure rate and the fact that the issue has not been reported at any customer site.

References
Modification History

Date: 04-MAY-2006
. Updated technical fix in Issue Description.
. Added final resolution (HW replacement) to Corrective Action.

Previously Published As 101753

Contacts

Internal Contributor/submitter: Kevin Siebenthal
Internal Eng Business Unit Group: KE Authors
Internal Eng Responsible Engineer: Ron Emerick
Internal Services Knowledge Engineer: Pete Stauffer
Internal Kasp FAB Legacy ID 101753

Product_uuid
29d3a694-0a18-11d6-92da-df959df44cdd|Sun Fire 4800 Server
29d6f808-0a18-11d6-8aa8-943929fbbdd8|Sun Fire 4810 Server
29da7938-0a18-11d6-8a41-9ed1ad6d6779|Sun Fire 6800 Server
4fe39727-0599-11d8-84cb-080020a9ed93|Sun Fire E6900 Server
bed24aa9-0598-11d8-84cb-080020a9ed93|Sun Fire E4900 Server

Attachments

This solution has no attachment