Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-75-1378725.1
Update Date: 2012-07-09
Keywords:

Solution Type  Troubleshooting Sure

Solution  1378725.1 :   Sun Storage 7000 Unified Storage System: How to Identify a broken CPU  


Related Items
  • Sun Storage 7410 Unified Storage System
  • Sun Storage 7310 Unified Storage System
  • Sun ZFS Storage 7120
  • Sun ZFS Storage 7320
  • Sun Storage 7110 Unified Storage System
  • Sun ZFS Storage 7420
  • Sun Storage 7210 Unified Storage System
Related Categories
  • PLA-Support>Sun Systems>DISK>NAS>SN-DK: 7xxx NAS
  • .Old GCS Categories>Sun Microsystems>Storage - Disk>Unified Storage




In this Document
Purpose
Troubleshooting Steps
References


Applies to:

Sun ZFS Storage 7120 - Version Not Applicable to Not Applicable [Release N/A]
Sun ZFS Storage 7320 - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 7410 Unified Storage System - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 7110 Unified Storage System - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 7210 Unified Storage System - Version Not Applicable to Not Applicable [Release N/A]
7000 Appliance OS (Fishworks)
NAS head revision : [not dependent]
BIOS revision : [not dependent]
ILOM revision : [not dependent]
JBODs Model : [not dependent]
CLUSTER related : [not dependent]


Purpose

This document provides a short guide to identifying a failed CPU on a Sun Storage 7000 Unified Storage System.

To discuss this information further with Oracle experts and industry peers, we encourage you to review, join or start a discussion in the My Oracle Support Community - 7000 Series ZFS Appliances

Troubleshooting Steps

If a CPU on a Sun Storage 7000 Unified Storage System fails, it can be identified from certain log entries or alerts.

If anything is wrong with the system hardware, the Service Processor (SP) and the Fault Management System (FMS) notify the system administrator by turning on the "service required" LED on the chassis of the head containing the CPU.
Because the appliance might be located in a data center, the amber fault LED on the chassis may go unnoticed for some time. The appliance therefore also shows alerts in the Status area of the BUI and changes the indicator next to the affected component from green to amber in the "HARDWARE" section of the "Status > Dashboard" BUI screen.
The Dashboard shows a status overview of services and hardware on the left, with the hardware overview at the bottom.
An amber LED next to CPU means that a problem has been detected with a CPU in the system. Clicking the item next to the amber LED opens the hardware overview at "Maintenance > Hardware > CPU", which displays the current configuration of the system. The faulty CPU can easily be spotted on this screen.

Now that the failed CPU has been identified, collect all data needed for a service request and open a hardware service request in My Oracle Support.

Data to collect:

  • <Document:1019887.1> explains how to collect a support bundle
  • How to take a snapshot on ILOM 2.x, which is used on all Sun ZFS Storage Appliance 7x10 series systems, is explained in the Sun Integrated Lights Out Manager 2.0 User Guide.
  • How to take a snapshot on ILOM 3.x, which is used on all Sun ZFS Storage Appliance 7x20 series systems, is explained in <Document:1020204.1>
  • Sometimes the head with the failed CPU panics to protect the data. In that case, make sure the coredump is collected and uploaded with the bundle.

 

If a coredump is included in a bundle, the bundle can be much larger than usual. If the automatic upload of the bundle does not complete for some reason, the coredump and potentially other useful data can be lost, because the system clears the directories where the core file is held once it considers the bundle to have been successfully run and uploaded. To secure the coredump, cancel the automatic upload and manually download the support bundle to a local system.



Keep in mind that a snapshot of the SP takes a while, although the file is created on the target as soon as the snapshot starts. To tell when the upload of the SP snapshot has finished, check the size of the file on the target: the upload is complete when the size stops growing.
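
The size check can also be scripted on the host that receives the snapshot. The following is a minimal Python sketch, not an Oracle-supplied tool: the function name wait_until_size_stable, the polling interval, and the number of unchanged checks are illustrative assumptions, and the path is whatever file the snapshot transfer writes on the target.

```python
import os
import time


def wait_until_size_stable(path, interval=30.0, stable_checks=3,
                           getsize=os.path.getsize, sleep=time.sleep):
    """Return the final size of `path` once it has stopped growing.

    The file is treated as complete when its size is non-zero and has
    stayed the same for `stable_checks` consecutive polls, taken
    `interval` seconds apart. Both values are assumptions; choose them
    to suit the speed of the link between the SP and the target.

    Note: this sketch has no timeout, so it keeps polling for as long
    as the file stays empty or keeps changing size.
    """
    last_size = -1
    unchanged = 0
    while unchanged < stable_checks:
        sleep(interval)
        size = getsize(path)
        if size == last_size and size > 0:
            unchanged += 1   # same size as the previous poll
        else:
            unchanged = 0    # still growing (or still empty): start over
        last_size = size
    return last_size


# Example call (the path is hypothetical):
#   final = wait_until_size_stable("/export/snapshots/sp_snapshot.zip")
#   print("Snapshot upload appears complete:", final, "bytes")
```

A stalled transfer also produces a file that stops growing, so a stable size is only an indication that the upload has finished, as with the manual check.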

References

<NOTE:1019887.1> - Sun Storage 7000 Unified Storage System: How to collect a supportbundle using the BUI or CLI
<NOTE:1020204.1> - Collecting snapshot on ILOM 3.x and later platforms

Attachments
This solution has no attachment
  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.