Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-71-1324906.1
Update Date:2011-11-23
Keywords:

Solution Type  Technical Instruction Sure

Solution  1324906.1 :   Sun Storage 7000 Unified Storage System: How to check if PCIe card should be replaced again because of fault status after the replacement.  


Related Items
  • Sun Storage 7410 Unified Storage System
Related Categories
  • PLA-Support>Sun Systems>DISK>NAS>SN-DK: 7xxx NAS
  • .Old GCS Categories>Sun Microsystems>Storage - Disk>Unified Storage



In this Document
  Goal
  Solution


Created from <SR 3-3322656141>

Applies to:

Sun Storage 7410 Unified Storage System - Version: Not Applicable and later   [Release: N/A and later ]
Information in this document applies to any platform.

Goal

When a PCIe card is still reported as faulted after a hardware replacement, the customer can determine whether the card should be replaced again by comparing the fault logs and command output from before and after the replacement.

Solution

If the fmadm command output shows errors like the following, the card should be replaced.


Mar 30 22:16:06.1093 c1e0940d-bf0d-e98c-c9a1-d887d5184e2f PCIEX-8000-0A
100% fault.io.pciex.device-interr

Problem in: hc://:product-id=Sun-Fire-X4440:server-id=uramaki:chassis-id=0943QAF007/motherboard=0/hostbridge=4/pciexrc=4/pciexbus=129/pciexdev=0/pciexfn=0
Affects: dev:////pci@1,0/pci10de,378@b/pci104c,8231@0
FRU: hc://:product-id=Sun-Fire-X4440:server-id=uramaki:chassis-id=0943QAF007/motherboard=0/hostbridge=4/pciexrc=4/pciexbus=129/pciexdev=0
Location: PCIExp SLOT5

Apr 01 00:26:43.7253 c1e0940d-bf0d-e98c-c9a1-d887d5184e2f FMD-8000-6U Resolved
100% fault.io.pciex.device-interr Repair Attempted

Problem in: hc://:product-id=Sun-Fire-X4440:server-id=uramaki:chassis-id=0943QAF007/motherboard=0/hostbridge=4/pciexrc=4/pciexbus=129/pciexdev=0/pciexfn=0
Affects: dev:////pci@1,0/pci10de,378@b/pci104c,8231@0
FRU: hc://:product-id=Sun-Fire-X4440:server-id=uramaki:chassis-id=0943QAF007/motherboard=0/hostbridge=4/pciexrc=4/pciexbus=129/pciexdev=0
Location: PCIExp SLOT5

If a faulted card has been replaced with a new one but is still reported as faulted, check the logs and command output in the support bundle.

First, compare the previous fmadm log (before the replacement) with the current log (after the replacement). If the reported faults differ between the two, the replacement had an effect and the investigation should stay focused on the PCIe card. If nothing changed after the replacement, the root cause may lie elsewhere (for example, in the slot itself).
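The log comparison described above can be partly automated. The following is a minimal Python sketch, not an Oracle-supplied tool. It assumes the fmadm output has the same layout as the excerpts in this document: an event header line with timestamp, UUID, and message ID, followed by "NN% fault.class" suspect lines. The function name parse_events and both regular expressions are illustrative assumptions; real output may differ between software releases.

```python
import re

# Event header as seen in the excerpts in this document (assumed layout):
#   "Mar 30 22:16:06.1093 <uuid> <MSG-ID> [state]"
EVENT_RE = re.compile(
    r"^(?P<time>[A-Z][a-z]{2} \d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<uuid>[0-9a-f-]{36}) (?P<msgid>[A-Z]+-\d{4}-\w+)"
)
# Suspect line (assumed layout): "40% fault.io.pciex.device-noresp [state]"
SUSPECT_RE = re.compile(r"^(?P<pct>\d+)% (?P<cls>fault\.\S+)")


def parse_events(text):
    """Return one dict per event header, with its list of (percent, class) suspects."""
    events = []
    for raw in text.splitlines():
        line = raw.strip()
        header = EVENT_RE.match(line)
        if header:
            events.append({
                "time": header["time"],
                "uuid": header["uuid"],
                "msgid": header["msgid"],
                "suspects": [],
            })
            continue
        suspect = SUSPECT_RE.match(line)
        if suspect and events:
            # Attach the suspect to the most recent event header.
            events[-1]["suspects"].append((int(suspect["pct"]), suspect["cls"]))
    return events
```

Running parse_events on the log text saved before the replacement and again on the text saved after it gives two lists whose fault classes can be compared directly, instead of reading the long hc:// paths by eye.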

In this case, the problem card was replaced with a new one on Apr 1. Comparing the two logs:

Mar 30 22:16:06.1093 c1e0940d-bf0d-e98c-c9a1-d887d5184e2f PCIEX-8000-0A
100% fault.io.pciex.device-interr

Problem in: hc://:product-id=Sun-Fire-X4440:server-id=uramaki:chassis-id=0943QAF007/motherboard=0/hostbridge=4/pciexrc=4/pciexbus=129/pciexdev=0/pciexfn=0
Affects: dev:////pci@1,0/pci10de,378@b/pci104c,8231@0
FRU: hc://:product-id=Sun-Fire-X4440:server-id=uramaki:chassis-id=0943QAF007/motherboard=0/hostbridge=4/pciexrc=4/pciexbus=129/pciexdev=0
Location: PCIExp SLOT5

Apr 01 00:26:43.7253 c1e0940d-bf0d-e98c-c9a1-d887d5184e2f FMD-8000-6U Resolved
100% fault.io.pciex.device-interr Repair Attempted

Problem in: hc://:product-id=Sun-Fire-X4440:server-id=uramaki:chassis-id=0943QAF007/motherboard=0/hostbridge=4/pciexrc=4/pciexbus=129/pciexdev=0/pciexfn=0
Affects: dev:////pci@1,0/pci10de,378@b/pci104c,8231@0
FRU: hc://:product-id=Sun-Fire-X4440:server-id=uramaki:chassis-id=0943QAF007/motherboard=0/hostbridge=4/pciexrc=4/pciexbus=129/pciexdev=0
Location: PCIExp SLOT5



Apr 01 15:04:57.4136 ad841830-7d3a-ec4d-e2bb-82732521608b PCIEX-8000-DJ
40% fault.io.pciex.device-noresp

Problem in: hc://:product-id=Sun-Fire-X4440:server-id=uramaki:chassis-id=0943QAF007/motherboard=0/hostbridge=4/pciexrc=4/pciexbus=129/pciexdev=0/pciexfn=0
Affects: dev:////pci@1,0/pci10de,378@b/pci104c,8231@0
FRU: hc://:product-id=Sun-Fire-X4440:server-id=uramaki:chassis-id=0943QAF007/motherboard=0/hostbridge=4/pciexrc=4/pciexbus=129/pciexdev=0
Location: PCIExp SLOT5

40% fault.io.pciex.device-interr

Problem in: hc://:product-id=Sun-Fire-X4440:server-id=uramaki:chassis-id=0943QAF007/motherboard=0/hostbridge=4/pciexrc=4/pciexbus=129/pciexdev=0/pciexfn=0
Affects: dev:////pci@1,0/pci10de,378@b/pci104c,8231@0
FRU: hc://:product-id=Sun-Fire-X4440:server-id=uramaki:chassis-id=0943QAF007/motherboard=0/hostbridge=4/pciexrc=4/pciexbus=129/pciexdev=0
Location: PCIExp SLOT5

20% fault.io.pciex.bus-noresp

Problem in: hc://:product-id=Sun-Fire-X4440:server-id=uramaki:chassis-id=0943QAF007/motherboard=0/hostbridge=4/pciexrc=4/pciexbus=129/pciexdev=0/pciexfn=0
Affects: dev:////pci@1,0/pci10de,378@b/pci104c,8231@0
FRU: hc://:product-id=Sun-Fire-X4440:server-id=uramaki:chassis-id=0943QAF007/motherboard=0/hostbridge=4/pciexrc=4/pciexbus=129/pciexdev=0
Location: PCIExp SLOT5


The errors have changed. Before the replacement, the only fault reported was fault.io.pciex.device-interr (100%). After the replacement, the suspect list contains fault.io.pciex.device-noresp (40%), fault.io.pciex.device-interr (40%), and fault.io.pciex.bus-noresp (20%). However, the slot is still reported as faulted, as follows.

slot-003 PCIe 5 faulted Sun Microsystems, Inc. Fishworks CLUSTRON 100

Because the reported errors changed, this is not a slot problem: if the slot itself were faulty, replacing the card would not have changed the reported faults. It is recommended that the PCIe card NOT be replaced again. Instead, reseat the current card and then check whether the slot is reported as OK.
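The reasoning above can be summarized as a small decision rule. The Python sketch below is illustrative only: the function name assess_replacement and the wording of the returned messages are assumptions, and the "cleared" branch (no fault after the replacement) is added here for completeness and is not discussed in this document.

```python
def assess_replacement(before, after):
    """Compare the fault classes seen before and after a card replacement.

    before, after: iterables of fault class strings,
    for example "fault.io.pciex.device-interr".
    """
    before, after = set(before), set(after)
    if not after:
        # Assumption added for completeness; not covered in this document.
        return "cleared: no fault is reported after the replacement"
    if after == before:
        # Same faults as before: the replacement had no visible effect.
        return "unchanged: suspect the slot or another component, not the card"
    # Different faults: the replacement had an effect, so stay focused on the card.
    return "changed: do not replace the card again; reseat it and recheck the slot"


# Fault classes taken from the two logs in this document.
before = {"fault.io.pciex.device-interr"}
after = {
    "fault.io.pciex.device-noresp",
    "fault.io.pciex.device-interr",
    "fault.io.pciex.bus-noresp",
}
print(assess_replacement(before, after))
```

For the case in this document the rule takes the "changed" branch, which matches the recommendation to reseat the current card rather than replace it a second time. The rule only compares sets of fault classes; it does not replace engineering judgment about the rest of the support bundle.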


Attachments
This solution has no attachment
  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.