Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-72-1009944.1
Update Date:2012-07-31
Keywords:

Solution Type  Problem Resolution Sure

Solution  1009944.1 :   Sun StorageTek[TM] 5320 NAS: Fault LED on Head Lights Up With No Apparent Fault  


Related Items
  • Sun Storage 5320 NAS Appliance
  • Sun Storage 5320 NAS Gateway
  • Sun Storage 5320 NAS Cluster
  • Sun Storage 5220 NAS Appliance
Related Categories
  • PLA-Support>Sun Systems>DISK>NAS>SN-DK: SE5xxx NAS
  • .Old GCS Categories>Sun Microsystems>Storage - Disk>Network Attached Storage

PreviouslyPublishedAs
213627


Applies to:

Sun Storage 5320 NAS Appliance - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 5320 NAS Cluster - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 5320 NAS Gateway - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 5220 NAS Appliance - Version Not Applicable to Not Applicable [Release N/A]
All Platforms

Symptoms

The fault LED or the Service Action Required LED on a Sun StorageTek[TM] 5320 NAS head lit up, but no corresponding fault was logged. In all reported cases the operation of the NAS appliance was not disrupted in any way; the lit LED only raised customer concern about a possible hardware failure. Analysis of the collected data gave no indication of a fault in the system, yet the Service LED was on.

 

The 5320 NAS head is based on the Galaxy server X4200, which includes the Integrated Lights Out Manager (ILOM) facility. A closer look at the head's service processor (SP), accessed through the serial management port, showed that voltage events had occurred some time earlier, and these had likely triggered the LED.

Example:

-> cd /SP/logs/event/list
/SP/logs/event/list
-> show
/SP/logs/event/list
    Targets:
    Properties:
    Commands:
        show
    Events:
EventId  TimeStamp                 SensorName  SensorType
100      Sat Mar 17 04:12:07 2007  ps1.pwrok   Power Supply
97       Fri Mar 16 17:52:42 2007  p1.v_vdd    Voltage
98       Fri Mar 16 17:52:49 2007  p1.fail     Processor
99       Sat Mar 17 04:12:05 2007  ps0.pwrok   Power Supply
:
: (snip...)
:
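
When many events have accumulated, it can help to sort a captured event list by time and pick out the voltage and power supply entries. The following is a minimal Python sketch under stated assumptions: the listing has been saved as text from the serial session, the timestamps use English day and month names, and the function name `parse_events` and the regular expression are illustrative only and not part of ILOM.

```python
import re
from datetime import datetime

# Sample lines taken from the event list in this document.
SAMPLE = """\
100  Sat Mar 17 04:12:07 2007        ps1.pwrok  Power Supply
97  Fri Mar 16 17:52:42 2007         p1.v_vdd  Voltage
98  Fri Mar 16 17:52:49 2007          p1.fail  Processor
99  Sat Mar 17 04:12:05 2007        ps0.pwrok  Power Supply
"""

# Assumed layout: EventId, ctime-style timestamp, sensor name,
# then a sensor type that may contain spaces.
EVENT_RE = re.compile(
    r"^\s*(\d+)\s+"
    r"(\w{3} \w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2} \d{4})\s+"
    r"(\S+)\s+(.+?)\s*$"
)


def parse_events(text):
    """Return the events found in text, oldest first (hypothetical helper)."""
    events = []
    for line in text.splitlines():
        match = EVENT_RE.match(line)
        if not match:
            continue  # skip headers, prompts, and snipped lines
        event_id, stamp, sensor, sensor_type = match.groups()
        events.append({
            "id": int(event_id),
            "time": datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y"),
            "sensor": sensor,
            "type": sensor_type,
        })
    return sorted(events, key=lambda e: e["time"])


if __name__ == "__main__":
    for event in parse_events(SAMPLE):
        if event["type"] in ("Voltage", "Power Supply"):
            print(event["time"], event["sensor"], event["type"])
```

Sorting by timestamp makes it easier to see that the p1.v_vdd voltage event precedes the p1.fail and power supply events in the sample.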

However, the NAS OS did not log any message or notification. It did show "-10000V" as the value of some voltage readings, which is clearly invalid.

 ~ Voltage levels
Battery 3V     : 3.08 V
Stndby 3.3V    : 3.28 V
Main 3.3V      : 3.30 V
Main 5V        : 5.01 V
Main 12V       : 12.22 V
Main -12V      : -12.27 V
Main 2.5V      : -10000.00 V
Main 1.8V      : -10000.00 V
Main 1.2V      : -10000.00 V
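
According to the bug analysis for this issue, the NAS OS shows "-10000V" when it cannot read a sensor. A captured voltage listing can therefore be screened for that value. The sketch below is only an illustration: it assumes the listing has been saved as text in the "name : value V" layout shown here, and the names `parse_readings` and `unreadable` are hypothetical helpers, not NAS OS commands.

```python
import re

# Placeholder value the NAS OS displays when a sensor cannot be read.
SENTINEL = -10000.0

# Sample lines taken from the voltage listing in this document.
SAMPLE = """\
Battery 3V     : 3.08 V
Stndby 3.3V    : 3.28 V
Main 3.3V      : 3.30 V
Main 5V        : 5.01 V
Main 12V       : 12.22 V
Main -12V      : -12.27 V
Main 2.5V      : -10000.00 V
Main 1.8V      : -10000.00 V
Main 1.2V      : -10000.00 V
"""

# Assumed layout: rail name, colon, signed decimal value, unit "V".
READING_RE = re.compile(r"^\s*(.+?)\s*:\s*(-?\d+(?:\.\d+)?)\s*V\s*$")


def parse_readings(text):
    """Map each rail name to its reported voltage (hypothetical helper)."""
    readings = {}
    for line in text.splitlines():
        match = READING_RE.match(line)
        if match:
            readings[match.group(1)] = float(match.group(2))
    return readings


def unreadable(readings):
    """Return the rails whose value is the unreadable-sensor placeholder."""
    return [name for name, volts in readings.items() if volts == SENTINEL]


if __name__ == "__main__":
    for rail in unreadable(parse_readings(SAMPLE)):
        print(f"{rail}: sensor could not be read")
```

A rail flagged this way indicates a sensor that could not be read, not a real measurement of -10000 V.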

In effect, there were two problems:
1) Galaxy server (X4200/NAS 5320 head): a voltage event occurred and the fault LED lit up
2) The NAS OS did not report this event

 

Cause

Two causes were identified:

  1. Galaxy server (X4200/NAS 5320 head): the Service Processor detects a voltage event and lights the fault LED.
  2. The NAS OS does not report this event, so the Service LED cannot be related to the voltage event.

Both causes have been forwarded to the responsible product development teams.

The first problem was reproduced by applying noise to the power line feeding the NAS head: a similar voltage event occurred and the LED lit up. This problem has been reported to Galaxy PDE as Bug ID 6862216. The second problem has been filed with the NAS development team as Bug ID 6568034. The NAS OS uses the default value "-10000V" when it is not able to read a sensor.

 

Solution

One workaround is to shut down and power cycle the head, which clears the LED by resetting the Service Processor. This is disruptive to NAS operation, especially on a standalone system; in a cluster, the partner head can take over. The system administrator can carry out this workaround during a scheduled maintenance window. There is no urgency to apply it once a support engineer has confirmed that no fault is present.

Alternatively, a field engineer (FE) can be dispatched to connect a laptop to the serial console port, which is covered by a sticker saying 'Service Only'. Once connected to ILOM, the engineer can use the command below.

  -> reset /SP

This does not affect the normal operation of the NAS head, since the SP is independent of the main system. It is the preferred workaround if the customer insists on resetting the SP without interrupting production, but it requires someone with the right equipment to connect to the SP and issue the reset command.

Even though this product is EOL and nearing its EOSL date, it is helpful to collect additional diagnostic data such as an ILOM snapshot. The procedure is described in Document <Note 1448069.1>.

References

<BUG:6568034> - 8-525552743 - INTERCOMPANY REPORT IGNORES AGGRWEIGHT PARAMETER OF CUSTOM DIMENSI
<BUG:6862216> - [STATICCHARTS] PIE: NO IMAGE GENERATED WITH THE PATTERNS SUPPLIED
<NOTE:1448069.1> - How to Collect a Snapshot from an x86 Platform Service Processor (SP or ILOM)

Attachments
This solution has no attachment
  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.