Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-72-1007582.1
Update Date: 2009-09-27
Keywords:

Solution Type  Problem Resolution Sure

Solution  1007582.1 :   Sun Fire[TM] 12K/15K: PCI SERR panic after missing ce0 device on the X2222A adapter during boot  


Related Items
  • Sun Fire 12K Server
  • Sun Fire 15K Server
Related Categories
  • GCS>Sun Microsystems>Servers>High-End Servers

Previously Published As
210489


Oracle Confidential (INTERNAL). Do not distribute to customers
Reason: Migrated distribution from Sun

Symptoms
- Problem Statement:
A missing ce0 device on the X2222A Sun[TM] Dual Fast Ethernet + Dual SCSI
PCI Adapter during the OBP probe can lead to a PCI SERR panic during OS
boot.
- Symptoms:
. The panic occurs while booting the OS. After the initial failure,
subsequent boot attempts also fail.
. Exposure to this problem follows a setkeyswitch on/HPOST execution of
the domain. The problem can appear at install, or after a period of
stable operation during normal reboot operations.
. If OBP is unable to probe the ce0 interface of the X2222A adapter, that
is a symptom of the condition that results in a PCI SERR panic while
booting the OS.
The signature is:
WARNING: pcisch-2: PCI fault log start:
PCI SERR
PCI error occurred on device #6 dwordmask=0 bytemask=0
pcisch-2: PCI primary error (0):pcisch-2: PCI secondary error
(0):pcisch-2: PBM AFAR 0.00000000:
WARNING: pcisch2: PCI config space CSR=0x4280<signaled-system-error>
pcisch-2: PCI fault log end.
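
When reviewing captured console or messages output, the signature above can
be searched for mechanically. The following is a minimal Python sketch, not
a supported tool. It assumes the fault log is bracketed by the
"PCI fault log start" and "PCI fault log end" lines shown above; the
function name find_serr_instances and the SAMPLE text are illustrative
assumptions.

```python
import re

# Lines that open and close a fault log, as in the signature above.
# The optional hyphen accepts both "pcisch-2" and "pcisch2".
FAULT_START = re.compile(r"pcisch-?(\d+): PCI fault log start")
FAULT_END = re.compile(r"pcisch-?(\d+): PCI fault log end")


def find_serr_instances(log_text):
    """Return sorted pcisch instance numbers whose fault log reports PCI SERR."""
    instances = set()
    current = None  # instance of the fault log currently being read, if any
    for line in log_text.splitlines():
        start = FAULT_START.search(line)
        if start:
            current = int(start.group(1))
            continue
        if current is not None and "PCI SERR" in line:
            instances.add(current)
        if FAULT_END.search(line):
            current = None
    return sorted(instances)


# Abbreviated copy of the signature from this article (illustration only).
SAMPLE = """\
WARNING: pcisch-2: PCI fault log start:
PCI SERR
PCI error occurred on device #6 dwordmask=0 bytemask=0
pcisch-2: PCI fault log end.
"""

if __name__ == "__main__":
    print(find_serr_instances(SAMPLE))
```

The instance number returned for the sample (2) is the value that is later
matched against /etc/path_to_inst to locate the affected PCI path.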


Resolution
- Troubleshooting:
A. To determine whether the panic is caused by an X2222A card failure,
execute the following commands at the OBP prompt:
. Execute show-disks and probe-scsi-all to identify any missing SCSI
connections or disk targets.
. Execute show-nets to identify any missing ce interfaces.
ok show-nets
a) /pci@1d,700000/pci@1/network@1
b) /pci@1c,700000/network@3,1
c) /pci@1c,700000/pci@1/network@1
d) /pci@1c,700000/pci@1/network@0
NOTE: /pci@1d,700000/pci@1/network@0 is the ce0 of this adapter and is
missing from the probe.
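
The check described in the NOTE can be sketched in Python for use against
captured show-nets output. This is a hedged illustration only: it assumes
the two network ports of the adapter appear as network@0 and network@1
under the same bridge node, as in the listing above, and the function name
bridges_missing_port0 is hypothetical.

```python
import re

# One show-nets entry, e.g. "a) /pci@1d,700000/pci@1/network@1".
ENTRY = re.compile(r"^[a-z]\)\s+(\S+)/network@([0-9a-f,]+)$")


def bridges_missing_port0(show_nets_output):
    """Return parent paths that list network@1 but not network@0."""
    units_by_parent = {}
    for line in show_nets_output.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            parent, unit = match.groups()
            units_by_parent.setdefault(parent, set()).add(unit)
    return sorted(
        parent
        for parent, units in units_by_parent.items()
        if "1" in units and "0" not in units
    )


# The show-nets listing from this article.
SHOW_NETS = """\
a) /pci@1d,700000/pci@1/network@1
b) /pci@1c,700000/network@3,1
c) /pci@1c,700000/pci@1/network@1
d) /pci@1c,700000/pci@1/network@0
"""

if __name__ == "__main__":
    for parent in bridges_missing_port0(SHOW_NETS):
        print("network@0 missing under", parent)
```

Entries such as /pci@1c,700000/network@3,1 are ignored because their unit
address is neither 0 nor 1.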
B. Set the OBP variable diag-switch? to true to enable OBP device probing
diagnostic output on the console:
. The following is an example device probe of PCI B in a good state:
Probing PCI B pci
Probing /pci@1d,700000 Device 1  pci
Probing /pci@1d,700000/pci@1 Device 0  network
Probing /pci@1d,700000/pci@1 Device 1  network
Probing /pci@1d,700000/pci@1 Device 2  scsi disk tape scsi disk tape
Probing /pci@1d,700000/pci@1 Device 3  Nothing there
Probing /pci@1d,700000/pci@1 Device 4  Nothing there
Probing /pci@1d,700000/pci@1 Device 5  Nothing there
Probing /pci@1d,700000/pci@1 Device 6  Nothing there
Probing /pci@1d,700000/pci@1 Device 7  Nothing there
Probing /pci@1d,700000/pci@1 Device 8  Nothing there
Probing /pci@1d,700000/pci@1 Device 9  Nothing there
Probing /pci@1d,700000/pci@1 Device a  Nothing there
Probing /pci@1d,700000/pci@1 Device b  Nothing there
Probing /pci@1d,700000/pci@1 Device c  Nothing there
Probing /pci@1d,700000/pci@1 Device d  Nothing there
Probing /pci@1d,700000/pci@1 Device e  Nothing there
Probing /pci@1d,700000/pci@1 Device f  Nothing there
Probing /pci@1d,700000 Device 2  bootbus-controller iosram
Probing /pci@1d,700000 Device 3  pci108e,1100 network firewire usb
. The following is an example device probe of PCI B in a failing state:
Probing PCI B pci
Probing /pci@1d,700000 Device 1  pci
Probing /pci@1d,700000/pci@1 Device 1  network
Probing /pci@1d,700000/pci@1 Device 2  scsi disk tape scsi disk tape
Probing /pci@1d,700000/pci@1 Device 3  Nothing there
Probing /pci@1d,700000/pci@1 Device 4  Nothing there
Probing /pci@1d,700000/pci@1 Device 5  Nothing there
Probing /pci@1d,700000/pci@1 Device 6  Nothing there
Probing /pci@1d,700000/pci@1 Device 7  Nothing there
Probing /pci@1d,700000/pci@1 Device 8  Nothing there
Probing /pci@1d,700000/pci@1 Device 9  Nothing there
Probing /pci@1d,700000/pci@1 Device a  Nothing there
Probing /pci@1d,700000/pci@1 Device b  Nothing there
Probing /pci@1d,700000/pci@1 Device c  Nothing there
Probing /pci@1d,700000/pci@1 Device d  Nothing there
Probing /pci@1d,700000/pci@1 Device e  Nothing there
Probing /pci@1d,700000/pci@1 Device f  Nothing there
Probing /pci@1d,700000 Device 2  bootbus-controller iosram
Probing /pci@1d,700000 Device 3  pci108e,1100 network firewire usb
NOTE: /pci@1d,700000/pci@1 Device 0 is missing from the probe.
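
Comparing the good and failing probe listings by eye is error-prone on a
busy console log. The sketch below, in Python, parses captured
diag-switch? probe output and reports which device numbers were never
probed under a given bridge. It is an illustration under assumptions: the
line layout is taken from the examples above, the expected device numbers
(0, 1, 2) are the network, network, and scsi functions shown in the good
state, and the function names are hypothetical.

```python
import re

# One probe line, e.g. "Probing /pci@1d,700000/pci@1 Device 0  network".
# Device numbers are printed in hexadecimal (0 through f).
PROBE = re.compile(r"^Probing\s*(\S+)\s+Device\s+([0-9a-f]+)\s+(.*)$")


def probed_devices(probe_output):
    """Map each probed parent path to {device number: probe result}."""
    result = {}
    for line in probe_output.splitlines():
        match = PROBE.match(line.strip())
        if match:
            parent, device, found = match.groups()
            result.setdefault(parent, {})[int(device, 16)] = found.strip()
    return result


def missing_device_numbers(probe_output, parent, expected=(0, 1, 2)):
    """Return the expected device numbers with no probe line under parent."""
    seen = probed_devices(probe_output).get(parent, {})
    return [number for number in expected if number not in seen]


# First lines of the failing-state example from this article.
FAILING = """\
Probing PCI B pci
Probing /pci@1d,700000 Device 1  pci
Probing /pci@1d,700000/pci@1 Device 1  network
Probing /pci@1d,700000/pci@1 Device 2  scsi disk tape scsi disk tape
"""

if __name__ == "__main__":
    print(missing_device_numbers(FAILING, "/pci@1d,700000/pci@1"))
```

A non-empty result for the bridge path of the X2222A indicates the same
condition as the NOTE above: a device that should have been probed is
absent from the output.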
C. To confirm that the PCI device instance named in the panic corresponds
to the X2222A card with a missing ce device:
. Grep a previously captured /etc/path_to_inst file for pcisch
instance 2 (pcisch-2). Use Explorer output if available.
"/pci@1d,700000" 2 "pcisch"
. Grep for "pcisch2" in the /var/adm/messages file from a previous
successful startup:
pcisch2 at root: SAFARI 0x1d 0x700000
pcisch2 is /pci@1d,700000
. Alternatively, use the Solaris Device Path Decoder at
http://decoder.aus.sun.com.
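
Where a saved copy of /etc/path_to_inst is available, the instance lookup
can also be scripted. The following Python sketch assumes the three-field
entry layout shown above (quoted physical path, instance number, quoted
driver name); the function name driver_instance_paths is hypothetical, and
every sample entry other than the pcisch instance 2 line is invented for
illustration.

```python
import shlex


def driver_instance_paths(path_to_inst_text, driver):
    """Map instance number to physical path for one driver name."""
    mapping = {}
    for line in path_to_inst_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = shlex.split(line)  # strips the double quotes
        if len(fields) == 3 and fields[2] == driver:
            mapping[int(fields[1])] = fields[0]
    return mapping


# Only the pcisch instance 2 entry comes from this article; the other
# entries are assumed for illustration.
PATH_TO_INST = """\
# sample entries
"/pci@1c,700000" 0 "pcisch"
"/pci@1d,700000" 2 "pcisch"
"/pci@1d,700000/pci@1/network@1" 1 "ce"
"""

if __name__ == "__main__":
    print(driver_instance_paths(PATH_TO_INST, "pcisch").get(2))
```

The path returned for instance 2 can then be compared with the bridge path
that is missing a device in the OBP probe output.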
- Resolution:
. Replace the identified X2222A card.  This has resolved the problem in all
previous instances of this bug.
. Verify that OBP sees all 5 PCI devices by setting diag-switch?=true:
pci (bridge), 2x network, and 2x scsi ports (where Device 2 represents
the two port connections):
Probing PCI B pci
Probing /pci@1d,700000 Device 1  pci
Probing /pci@1d,700000/pci@1 Device 0  network
Probing /pci@1d,700000/pci@1 Device 1  network
Probing /pci@1d,700000/pci@1 Device 2  scsi disk tape scsi disk tape
. If the replacement card does not correct the panic, repeat the
troubleshooting steps above to confirm that the replacement card is not
experiencing the same failure.
. HPOST will be modified to drive the JTAG bus for the PCI adapters
during the auto connect sequence to set TRST to low and change clock on.
- Summary of part numbers and patch IDs
X2222A - 501-5727-03   Dual FastEthernet + Dual SCSI PCI Adapter
SMS 1.2 patch 112488-08
- References and bug IDs
4723789 - PCI devices within Cauldron adapter intermittently not seen
4732416 - hpost needs to modify auto-connect to properly connect the Cauldron
- Additional background information:
None.
- Meta-Data/Problem categorization:
Product/Platform: SF15K/SF12K
Category: hardware


Product
Sun Fire 15K Server
Sun Fire 12K Server

Previously Published As
48121

Change History
Date: 2003-10-20
User Name: 11511
Action: Approved
Comment: URL was updated. Ok to re-publish.
Version: 0
Date: 2003-10-12
User Name: 116819
Action: Approved
Comment: Did this in the lab...
Version: 0
Date: 2003-10-09
User Name: 100370
Action: Approved
Comment: modified document to fix URL for device path decoder.

kwyjibo.aus is deprecated, decoder.aus is the correct URL.
Version: 0
Date: 2003-10-09
User Name: 100370
Action: Updated
Comment: need to update the device decoder url to decoder.aus, not kwyjibo.aus
Version: 0
Date: 2003-05-20
User Name: Administrator
Action: Migration from KMSCreator
Comment: updated by : Michele Whittaker
comment : Moving to standard format.

date : Oct 29, 2002



updated by : Sandra McDougall
comment : format cleaned up and approved,
Owner reassigned
spelling checked
keywords and meta
date : Oct 25, 2002



updated by : Sandra McDougall
comment : Article created.
date : Oct 19, 2002
Version: 0
Product_uuid
29e4659c-0a18-11d6-9fa1-e67bbc033df8|Sun Fire 15K Server
077fd4c5-df8f-4320-ad69-7d01603a674d|Sun Fire 12K Server

Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.