Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-73-1020347.1
Update Date:2012-01-19
Keywords:

Solution Type  FAB (standard) Sure

Solution  1020347.1 :   FAB: Standard: Reactive: Important Service Procedure in the event Sun Storage 7000/7110/7210/7410 Unified Storage System NAS appliance does not boot.  


Related Items
  • Sun Storage 7110 Unified Storage System
  • Sun Storage 7210 Unified Storage System
  • Sun Storage 7410 Unified Storage System
Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: Sun FAB

PreviouslyPublishedAs
256648


Bug Id
<SUNBUG: 6817092>, <SUNBUG: 6803822>, <SUNBUG: 6814738>, <SUNBUG: 6794292>

Product
Sun Storage 7110 Unified Storage System
Sun Storage 7210 Unified Storage System
Sun Storage 7410 Unified Storage System

Date of Preliminary Release
07-Apr-2009

Date of Resolved Release
29-May-2009

Important Service Procedure in the event a Sun Storage 7000/7110/7210/7410 Unified Storage System NAS appliance does not boot (see below)

Impact

In the event of a disk failure, an unnecessary removal of a disk, or the unsupported
creation of a HW RAID device, the Sun Storage 7000 Unified Storage System may fail to
boot and will remain unbootable.

Data remains intact but unavailable. The 7x10 separates the system software from data devices
and prevents data loss or corruption.  Reinstallation of the system software does not result
in application data loss.


Contributing Factors

These are the affected platforms:

- Sun Storage 7000 family of products
- Sun Storage 7110, 7210, and 7410 Unified Storage Systems
- also known as Amber Road 7110, 7210, 7410 (7x10)
- also referred to as 7x10 System NAS appliance

Note:
This issue (see numerous CRs listed above) only happens if one of the following
events or user actions occurs:
 
- a failed disk
- a simulated failure by manually removing a disk
- using the BIOS RAID controller (Ctrl-C during BIOS setup)
  to create a HW RAID device (which corrupts the system software and disk images)

Symptoms

The system does not boot.  Typically this results in dropping to the "grub> " prompt. 
Alternatively, the system could simply hang or enter a panic loop.

Console output could be one of three general scenarios:

      1. GRUB prompt "grub>"
      2. Panic loop
      3. Blank screen
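
As a rough illustration only, the three console scenarios above could be told apart in a captured console log along the following lines. This is a minimal sketch and not a Sun-supplied tool: the function name, the enum, and the matching rules (a trailing "grub>" line, two or more "panic" messages) are assumptions made for this example.

```python
from enum import Enum


class BootSymptom(Enum):
    GRUB_PROMPT = "GRUB prompt"
    PANIC_LOOP = "Panic loop"
    BLANK_SCREEN = "Blank screen"
    UNKNOWN = "Unknown"


def classify_console(captured: str) -> BootSymptom:
    """Map captured console text to one of the three scenarios listed above."""
    text = captured.strip()
    if not text:
        # Nothing was printed on the console at all.
        return BootSymptom.BLANK_SCREEN
    # Assumption: when the system drops to GRUB, the last line is the prompt.
    last_line = text.splitlines()[-1].strip()
    if last_line.startswith("grub>"):
        return BootSymptom.GRUB_PROMPT
    # Assumption: two or more "panic" messages in one capture suggest a loop.
    if text.lower().count("panic") >= 2:
        return BootSymptom.PANIC_LOOP
    return BootSymptom.UNKNOWN
```

Whatever the classification, the recovery instructions below apply unchanged: every system that fails to boot is escalated.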

Root Cause

Depending on the circumstances, the cause is either a software failure in
resilvering the failed system disks or user error as described above.

The resilvering issues have been resolved in the next major
System Software release, which is now available.


Corrective Action

Workaround

In most cases, customers can avoid this issue by not performing the actions described in the Contributing Factors above.


Recovery - Important Instructions

Step 1
Services must escalate every time they have a customer with this issue.

All 7x10 Systems which fail to boot must be escalated to the NAS Backline team
no matter how similar the situation. Each case needs to be escalated to RPE for evaluation. 
All recovery procedures will be driven by RPE. 

Step 2
The Backline team should check BIOS settings (drive and boot order),
ILOM / console settings and otherwise attempt to correct the issue without
attempting to modify GRUB or the appliance system software.

Step 3
If unable to do so, the matter should be escalated to the RPE Appliance team. 
The RPE Appliance team will evaluate the best overall action on a case-by-case basis:

      1. Instructions for the field and/or customer so as to
          prepare the system for an attempted recovery by RPE.
   
      2. RPE attends site to carry out a system reinstall.

      3. System is returned to a Sun Office and RPE is provided with
          access to a separate Sun server, which can be used to
          temporarily set up a network install server, and the 7x10
          System is reinstalled.

      4. Engage CIC (Customer Intensive Care) and swap the hardware.
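
The escalation order in Steps 1 to 3 can be summarised as a short sketch. It only illustrates the sequence described above. The function name and its single boolean input are hypothetical, and the choice among the four RPE actions is deliberately left out because RPE decides it case by case.

```python
from typing import List


def escalation_path(backline_fix_succeeded: bool) -> List[str]:
    """Return the ordered escalation steps for a 7x10 System that fails to boot."""
    path = [
        # Step 1: every non-booting system is escalated, without exception.
        "Escalate to the NAS Backline team",
        # Step 2: Backline checks settings only; GRUB and the appliance
        # system software are not modified.
        "Backline checks BIOS settings (drive and boot order) and ILOM / console settings",
    ]
    if not backline_fix_succeeded:
        # Step 3: RPE picks the recovery action on a case-by-case basis.
        path.append("Escalate to the RPE Appliance team for evaluation")
    return path
```

For example, escalation_path(False) yields all three steps, ending with the escalation to the RPE Appliance team.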
   
Note: The above is an interim process. Regarding the resolution:

The Fishworks engineering team will provide a mechanism for the field to reinstall
a specific 7x10 System based on the chassis serial number.

The next major release of the 7x10 System Software will address the CRs related
to system disk resilvering. 


Resolution

Upgrade to the Q2 2009 System Software release, also called "2009.Q2".


The release notes contain a detailed description of all new
features and a list of known issues. All customers are strongly
advised to review these release notes thoroughly before
upgrading systems.


This new version contains many bug fixes which enhance the quality
of the product - with specific emphasis on clustering, networking,
CIFS, and the core I/O path.



  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.