Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-71-1008075.1
Update Date: 2012-07-30

Solution Type  Technical Instruction Sure

Solution  1008075.1 :   Sun Fire[TM] E25K, E20K, 15K, 12K: BERR panic events from a PCI-X/PCI Quad GigaSwift Ethernet card installed in a Schizo-based hsPCI slot


Related Items
  • Sun Fire E25K Server
  • Sun Fire E20K Server
  • Sun Fire 12K Server
  • Sun Fire 15K Server
Related Categories
  • PLA-Support>Sun Systems>SPARC>Enterprise>SN-SPARC: SF-Exxk
  • .Old GCS Categories>Sun Microsystems>Servers>High-End Servers

PreviouslyPublishedAs
211118


Applies to:

Sun Fire 12K Server
Sun Fire 15K Server
Sun Fire E20K Server
Sun Fire E25K Server
All Platforms

Goal

This document discusses Bus Error (BERR) panic events observed on Sun Fire[TM] E25K, E20K, 15K, and 12K domains that use one or more PCI-X/PCI Quad GigaSwift Ethernet cards (X4445).

The anomaly is observed on domains running Solaris[TM] 8 in which the PCI-X/PCI Quad GigaSwift Ethernet card is installed in a Hot Swap PCI (hsPCI) I/O board.

Solution

For example, the BERR panic event was reproduced in a Solaris[TM] 8 domain on a Sun Fire 15000 platform running Solaris 8 kernel patch 117350-39 and ce driver patch 111883-34 (which delivers CE Ethernet Driver v1.154). The relevant I/O configuration of that domain was:
/IO14/C5V0 PCI 476 B 33 33 1,0 ok pci-pci8086,537c.7/network (netw+ pci-bridge 
/IO14/C5V0 PCI 476 B 33 33 0,0 ok network-pci100b,35.30 SUNW,pci-x-qge 
/IO14/C5V0 PCI 476 B 33 33 1,0 ok network-pci100b,35.30 SUNW,pci-x-qge 
/IO14/C5V0 PCI 476 B 33 33 2,0 ok network-pci100b,35.30 SUNW,pci-x-qge 
/IO14/C5V0 PCI 476 B 33 33 3,0 ok network-pci100b,35.30 SUNW,pci-x-qge 

In this configuration the PCI-X/PCI Quad GigaSwift Ethernet card (X4445) is installed in IO14, slot C5V0.

The BERR panic event reproduced in the above domain OS environment is reported as follows:
WARNING: [AFT1] Bus Error (BERR) Event detected by CPU515 Privileged Data Access at TL=0, errID 0x00000479.9d38b088
AFSR 0x00100800.00000000 AFAR 0x00000479.00200000 Fault_PC 0x10035eb4
panic[cpu515]/thread=2a10027dd20: [AFT1] errID 0x00000479.9d38b088 BERR Error(s)

The AFAR value reported by this AFT1 event resolves to the following I/O location:
redxl> parse pa 479.00200000
16 GB Nasm slice 30 = 0x1E, 4GB quadrant 1: IOBus IO14/P0/B1 (14.1.0.1) Offset = 00200000

=> which translates to the I/O location housing the qge (X4445) card (IO14 slot C5V0).
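
The arithmetic behind the redx output above can be sketched in Python. This is a minimal illustration, not the redx implementation: the function names are hypothetical, and the bit layout (low 32 bits as the offset, two bits for the 4 GB quadrant, five bits for the 16 GB slice) is an assumption chosen because it reproduces the slice, quadrant, and offset printed for this AFAR. The sketch does not derive the board and bus name (IO14/P0/B1); that mapping is left to redx.

```python
def parse_dotted_hex(text):
    """Convert the 'hi.lo' hex notation used in AFT1 messages to an integer.

    Accepts forms such as '0x00000479.00200000' or '479.00200000'.
    """
    if text.lower().startswith("0x"):
        text = text[2:]
    hi, lo = text.split(".")
    return (int(hi, 16) << 32) | int(lo, 16)


def decode_io_address(pa):
    """Split a physical address into (slice, quadrant, offset).

    ASSUMPTION: field widths below are inferred from the redx output in this
    document, not taken from a hardware specification.
    """
    offset = pa & 0xFFFFFFFF         # offset inside the 4 GB quadrant
    quadrant = (pa >> 32) & 0x3      # 4 GB quadrant within the 16 GB slice
    slice_index = (pa >> 34) & 0x1F  # 16 GB slice (assumed 5-bit field)
    return slice_index, quadrant, offset


if __name__ == "__main__":
    pa = parse_dotted_hex("0x00000479.00200000")
    s, q, off = decode_io_address(pa)
    # prints: slice 30 = 0x1E, quadrant 1, offset = 00200000
    print(f"slice {s} = 0x{s:X}, quadrant {q}, offset = {off:08X}")
```

The values agree with the "slice 30 = 0x1E", "quadrant 1", and "Offset = 00200000" fields reported by redx for this event.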

Under the conditions described above, simply replacing the PCI-X/PCI Quad GigaSwift Ethernet card(s) (X4445) flagged by the BERR panic event is unlikely to resolve the BERR panic condition.

Instead, the affected site should explore the following courses of action to work around the BERR panic condition:
  • Replace the PCI-X/PCI Quad GigaSwift Ethernet card(s) (X4445, QGE-X) with the PCI Quad GigaSwift Ethernet card(s) (X4444, QGE).
  • A binary CE driver for Solaris 8, version v1.158, has also been verified to resolve this BERR panic condition. The fix is delivered by later ce driver patches: 111883-36 delivers CE Ethernet Driver v1.159, and the latest Solaris 8 ce patch, 111883-37, delivers CE Ethernet Driver v1.162.
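
A site auditing several domains could compare the installed ce driver version against the first version verified above. The following Python sketch is illustrative only: the helper names are hypothetical, the version string is assumed to have been obtained separately from the domain, and the patch table contains only the patch-to-version pairs stated in this document.

```python
# First ce driver version verified against the BERR panic, per this document.
FIRST_VERIFIED = (1, 158)

# Patch revision -> ce driver version, limited to pairs stated in this document.
PATCH_TO_DRIVER = {
    "111883-34": "v1.154",
    "111883-36": "v1.159",
    "111883-37": "v1.162",
}


def parse_version(text):
    """Convert 'v1.154' or '1.154' into a tuple of ints, e.g. (1, 154)."""
    return tuple(int(part) for part in text.strip().lstrip("vV").split("."))


def has_verified_fix(driver_version):
    """True if the driver version is at or above the first verified version."""
    return parse_version(driver_version) >= FIRST_VERIFIED


def patch_has_verified_fix(patch_id):
    """Return True or False for a listed patch, None for an unlisted one."""
    version = PATCH_TO_DRIVER.get(patch_id)
    return None if version is None else has_verified_fix(version)
```

Versions are compared as integer tuples rather than as strings so that, for example, a hypothetical v1.99 would not sort above v1.158. Patches missing from the table return None instead of a guess.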

Efforts to replicate the BERR panic condition in the same domain environment were unsuccessful once the X4445 PCI-X/PCI Quad GigaSwift Ethernet card was replaced by an X4444 PCI Quad GigaSwift Ethernet card. The test results available so far indicate that the X4444 card can be used as a workaround for the BERR panic condition encountered when the X4445 card is deployed. Note that there is no RoHS-compliant version of the X4444.


Product

Sun Fire E25K Server
Sun Fire E20K Server
Sun Fire 15K Server
Sun Fire 12K Server

Internal Section

This issue is tracked in bug 6456611, which is related to CR 6385683 (25K panics with Schizo and XMITS cards when testing fix for IB), where ce driver versions from v1.158 onward have been tested.

The binary CE driver for Solaris 8 described above was verified on the same Solaris 8 environment used to reproduce the BERR panic events. From all indications, the new CE binary has successfully addressed the BERR panics encountered with the original ce driver (v1.154, delivered by patch 111883-34).

Please note that Solaris 8 Extended Support will end in May 2012; check the "Oracle Lifetime Support Policy" document for further details.

References: Bug IDs 6456611 and 6385683, fixed in Solaris 9 patch 112817-29 and Solaris 10 patch 118777-08

Keywords: 111883, ce, berr, solaris 8, qge-x, qge, hspci, schizo, x4445, x4444, afar, starcat, amazon, 6456611, 6385683, 1.154, 1.158

Previously Published As 87034



Attachments
This solution has no attachment
  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.