Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-75-1386810.1
Update Date:2012-07-17
Keywords:

Solution Type  Troubleshooting Sure

Solution  1386810.1 :   Sun Storage 7000 Unified Storage System: How to Troubleshoot Appliance Boot Problems  


Related Items
  • Sun ZFS Storage 7120
  • Sun ZFS Storage 7420
  • Sun Storage 7110 Unified Storage System
  • Sun Storage 7310 Unified Storage System
  • Sun ZFS Storage 7320
  • Sun Storage 7210 Unified Storage System
  • Sun Storage 7410 Unified Storage System
Related Categories
  • PLA-Support>Sun Systems>DISK>NAS>SN-DK: 7xxx NAS
  • .Old GCS Categories>Sun Microsystems>Storage - Disk>Unified Storage




In this Document
Purpose
Troubleshooting Steps
 1. Identifying the boot issue
 2. Data collection
 For a system that does not power up
 For a system that powers up but shows no console output
 For a system that powers up but hangs or panics at boot
References


Applies to:

Sun Storage 7210 Unified Storage System
Sun Storage 7410 Unified Storage System
Sun ZFS Storage 7120
Sun ZFS Storage 7420
Sun ZFS Storage 7320
7000 Appliance OS (Fishworks)
NAS head revision : [not dependent]
BIOS revision : [not dependent]
ILOM revision : [not dependent]
JBODs Model : [not dependent]
CLUSTER related : [not dependent]


Purpose

The purpose of this document is to assist in identifying, troubleshooting and resolving boot-related issues on the ZFS Storage Appliance.

The boot process of a Sun Storage 7000 Unified Storage System consists of the following basic steps:

  1. Load the BIOS and run selftest
  2. Determine the boot device
  3. Determine the 'bootable' partition
  4. Process the boot options
  5. Boot from the selected device
  6. Initialise the OS environment
  7. Start the system services (including 'akd')

Each phase needs to be evaluated.
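As a rough illustration of evaluating the phases in order, the sketch below records the seven steps listed above and returns the first phase that has not been seen to complete. The function name and the idea of tracking a "last completed" phase number are assumptions made for this example; they are not part of any appliance tooling.

```python
from typing import Optional

# The seven boot phases, in the order given in this document.
BOOT_PHASES = [
    "Load the BIOS and run selftest",
    "Determine the boot device",
    "Determine the bootable partition",
    "Process the boot options",
    "Boot from the selected device",
    "Initialise the OS environment",
    "Start the system services (including 'akd')",
]


def next_phase_to_evaluate(last_completed: int) -> Optional[str]:
    """Return the first phase not yet seen to complete.

    last_completed is the 1-based number of the last phase observed to
    finish (0 if none). Returns None when all phases completed.
    """
    if not 0 <= last_completed <= len(BOOT_PHASES):
        raise ValueError("last_completed must be between 0 and 7")
    if last_completed == len(BOOT_PHASES):
        return None
    return BOOT_PHASES[last_completed]


if __name__ == "__main__":
    # Example: the boot device was determined, then the boot stalled.
    print(next_phase_to_evaluate(2))
```

Recording which phase was reached in this way gives a concise description of where the boot stopped when reporting the problem.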

To discuss this information further with Oracle experts and industry peers, we encourage you to review, join or start a discussion in the My Oracle Support Community - 7000 Series ZFS Appliances

Troubleshooting Steps

Most boot-related issues will require assistance from Oracle Systems Support. Being able to identify and clearly describe the issue, and to gather the existing data, will help Oracle Systems Support respond faster and resolve the issue.

Be prepared for remote access to be required; this may be done using Oracle Shared Shell or other remote access tools.
Ref: Oracle Shared Shell (Document 1194226.1)


Preliminary work is required to identify the boot issue and to gather the necessary information and data for analysis.

1. Identifying the boot issue

- System does not power up
- System powers up, but no output from console
- System powers up, but hangs at boot
- System powers up, but panics at boot
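
The four symptoms above map onto the data-collection sections in step 2. As an illustration only, that mapping can be written as a lookup table. The enum and function names are hypothetical, and the checklist entries are paraphrased from this document rather than taken from any tool.

```python
from enum import Enum
from typing import List


class BootSymptom(Enum):
    """Symptoms listed in step 1 (member names are illustrative)."""
    NO_POWER = "System does not power up"
    NO_CONSOLE_OUTPUT = "System powers up, but no output from console"
    HUNG_AT_BOOT = "System powers up, but hangs at boot"
    PANIC_AT_BOOT = "System powers up, but panics at boot"


# Data to gather for each symptom, paraphrased from step 2.
DATA_TO_COLLECT = {
    BootSymptom.NO_POWER: [
        "Status of the AC Present, Power OK and Fault LEDs on each PSU",
        "SP data per Document 1395915.1, if the iLOM/SP is reachable",
    ],
    BootSymptom.NO_CONSOLE_OUTPUT: [
        "Any error shown by 'start /SP/console'",
        "SP data per Document 1395915.1",
        "SP firmware and BIOS revision per Document 1174698.1",
    ],
    BootSymptom.HUNG_AT_BOOT: [
        "Full console log from power-up to the hang",
    ],
    BootSymptom.PANIC_AT_BOOT: [
        "Full console log from power-up to the panic",
    ],
}


def collection_checklist(symptom: BootSymptom) -> List[str]:
    """Return a copy of the data-collection checklist for a symptom."""
    return list(DATA_TO_COLLECT[symptom])


if __name__ == "__main__":
    for item in collection_checklist(BootSymptom.NO_CONSOLE_OUTPUT):
        print("-", item)
```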

 

2. Data collection

For a system that does not power up
  • Visually inspect each power supply for the status of the AC Present, Power OK, and Fault LEDs. If the Fault LED is illuminated on any of the PSUs then further troubleshooting will be required.
  • If AC Present is NOT illuminated, ensure the AC power cords are securely plugged into the server and connected to working AC power outlet(s). Test using known good power cables and power source. Engage a qualified electrician to test voltage on the power cords.
  • If Power OK is NOT illuminated, but AC Present IS, then further troubleshooting will be required. Refer to the system Service Manual for additional troubleshooting steps.
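
The LED checks above amount to a small decision procedure per power supply. The following is a minimal sketch of that logic. The function name is hypothetical, and checking the Fault LED first is an assumption made for this example, since the document lists the conditions without ranking them.

```python
def psu_next_step(ac_present: bool, power_ok: bool, fault: bool) -> str:
    """Suggest a next step for one PSU from its three LED states.

    The order of the checks (Fault first) is an assumption of this sketch.
    """
    if fault:
        return "Fault LED lit: further troubleshooting required"
    if not ac_present:
        return ("No AC: check cords and outlet, "
                "retest with known good cables and power source")
    if not power_ok:
        return "AC present but Power OK not lit: see the system Service Manual"
    return "LEDs normal: no PSU problem indicated by the LEDs"


if __name__ == "__main__":
    # Example: two PSUs, the second has lost its AC input.
    observed = {
        "PSU0": (True, True, False),
        "PSU1": (False, False, False),
    }
    for name, leds in observed.items():
        print(name, "->", psu_next_step(*leds))
```

Noting the result for every PSU, not only the first one that looks wrong, gives a complete picture when the case is escalated.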

https://www.oracle.com/technetwork/documentation/oracle-unified-ss-193371.html

If access to the iLOM/SP is possible, please follow the document below to gather additional data for troubleshooting.
- Sun Storage 7000 Unified Storage System: How to collect data from SP (Document 1395915.1)

Raise a service ticket with Oracle for further assistance.

 

For a system that powers up but shows no console output
Ensure that you are on the system console via the iLOM/SP CLI (serial or ssh) connection, and not via the browser using the Java Web Console.

If it is confirmed that the system console is being used and the issue persists, reset the SP by running the following command from the SP:

reset /SP

If the behaviour is unchanged after resetting the SP, collect any errors shown when you start the console with the following command:

start /SP/console

Also collect data from the SP using the following documents:
- Sun Storage 7000 Unified Storage System: How to collect data from SP (Document 1395915.1)
- Sun Storage 7000 Unified Storage System: How to check the SP BIOS revision level (Document 1174698.1)

Raise a service ticket with Oracle for further assistance.

If you get any error from start /SP/console, or from other commands such as show /SYS, check that you are running the latest supported SP firmware and BIOS for your 7x10 or 7x20.
You may be able to perform a quick recovery by cold resetting the SP: remove all the power cables for 2 minutes.
Older versions of the Service Processor firmware on Sun Storage 7110, 7210, 7310 and 7410 can leak memory (Document 1267544.1).

For a system that powers up but hangs or panics at boot
Capture the full console log output, starting from the point the system powers up and ending at the point it hangs or panics.

Raise a service ticket with Oracle for further assistance.
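
Once a console log has been saved, a short summary of where the output stopped can help describe the issue in the service ticket. The sketch below assumes the log is available as lines of text (for example, from a terminal session log file). The function name is hypothetical, and matching the word "panic" case-insensitively is only an assumed heuristic, since real panic messages may differ. The sample lines are invented for the example and are not real appliance output.

```python
from typing import Dict, Iterable


def summarize_console_log(lines: Iterable[str]) -> Dict[str, object]:
    """Summarize a captured console log.

    Returns the number of lines read, the last non-blank line (where the
    output stopped) and the first line containing 'panic', if any.
    """
    total = 0
    last_output = None
    first_panic = None
    for raw in lines:
        total += 1
        line = raw.rstrip("\r\n")
        if line.strip():
            last_output = line
        # Assumed heuristic: any line mentioning "panic" is of interest.
        if first_panic is None and "panic" in line.lower():
            first_panic = (total, line)
    return {
        "total_lines": total,
        "last_output": last_output,
        "first_panic": first_panic,
    }


if __name__ == "__main__":
    # Invented sample text, not real appliance output.
    sample = [
        "self-test complete\n",
        "booting from selected device\n",
        "\n",
        "panic: example message\n",
    ]
    print(summarize_console_log(sample))
```

The summary does not replace the full log; the complete capture should still be attached to the service ticket.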


Other Resources:
If the system is able to boot but fails to join the cluster, refer to the following document:
- Sun Storage 7000 Unified Storage System: How to Troubleshoot Cluster Problems (Document 1402545.1)


Known Issues:

- If large dedup shares are removed and the system is rebooted before the deletion completes, the result can appear to be a hung boot, because the zpool import must finish the delete operation before the system boots fully.

- A bad drive can lock up a system. For an example, see:
http://ar-rotation.us.oracle.com/sup-commands/mpt-back-end-frame.htm

- Too many ak log files can cause akd to run out of file descriptors.
Bug: 6914407 akd should globally enable extended FILE stdio

References

<BUG:6914407> - DUPLICATED DEPLOYABLE CHECKBOXES
<NOTE:1002941.1> - How to check why the system powered off, on Sun X64 servers.
<NOTE:1194226.1> - Oracle Shared Shell
<NOTE:1267544.1> - Older versions of the Service Processor firmware on Sun Storage 7110, 7210, 7310 and 7410 can leak memory.
<NOTE:1395915.1> - Sun Storage 7000 Unified Storage System: How to collect data from SP
<NOTE:1402545.1> - Sun Storage 7000 Unified Storage System: How to Troubleshoot Cluster Problems
<NOTE:1174698.1> - Sun Storage 7000 Unified Storage System: How to check the SP BIOS revision level

  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.