Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-72-1392313.1
Update Date: 2012-02-09
Keywords:

Solution Type  Problem Resolution Sure

Solution  1392313.1 :   Random "BBU Overheated" Alarm for Battery Backup Unit on Sun Storage 2500-M2 and 6180 Arrays  


Related Items
  • Sun Storage 2540-M2 Array
  • Sun Storage 6180 Array
  • Sun Storage 2530-M2 Array
Related Categories
  • PLA-Support>Sun Systems>DISK>Arrays>SN-DK: 6130
  • .Old GCS Categories>Sun Microsystems>Storage - Disk>Modular Disk - 6xxx Arrays




In this Document
  Symptoms
  Cause
  Solution
  References


Applies to:

Sun Storage 6180 Array - Version: Not Applicable to Not Applicable   [Release: N/A to N/A]
Sun Storage 2530-M2 Array - Version: Not Applicable to Not Applicable   [Release: N/A to N/A]
Sun Storage 2540-M2 Array - Version: Not Applicable to Not Applicable   [Release: N/A to N/A]
Information in this document applies to any platform.

Symptoms

A Sun Storage 2500-M2 or 6180 array can randomly report the event "BBU Overheated" (BBU stands for Battery Backup Unit). This is caused by <Bug:7123598> "Battery Temp value out-of-range on 2500-M2/6180 Battery Backup Unit".
To check for the event, collect a supportdata bundle from the array by following <Document 1002514.1> for Sun Storage Common Array Manager (CAM) or <Document 1014074.1> for SANtricity. Then open the file majorEventLog.txt and look for an event like the following:

Date/Time: Tue Dec 20 17:48:38 CET 2011
Sequence number: 4064
Event type: 7300
Event category: Notification
Priority: Critical
Description: BBU Overheated
Event specific codes: 0/0/0
Component type: Battery
Component location: Tray.85.Controller.A.Battery.A
Logged by: Controller in slot A
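
On a large majorEventLog.txt, the search can be scripted. The following is a minimal Python sketch, not a supported tool. It assumes that each event record consists of "Key: value" lines and starts with a "Date/Time" line, as in the excerpt above. The function name find_bbu_overheated and the embedded SAMPLE text are illustrative only.

```python
# Sketch: list "BBU Overheated" events found in majorEventLog.txt text.
# Assumption: every event record starts with a "Date/Time:" line and is
# made of "Key: value" lines, as in the excerpt shown in this document.

SAMPLE = """\
Date/Time: Tue Dec 20 17:48:38 CET 2011
Sequence number: 4064
Event type: 7300
Event category: Notification
Priority: Critical
Description: BBU Overheated
Event specific codes: 0/0/0
Component type: Battery
Component location: Tray.85.Controller.A.Battery.A
Logged by: Controller in slot A
"""


def find_bbu_overheated(log_text):
    """Return the event records whose Description is 'BBU Overheated'."""
    events = []
    record = None
    for line in log_text.splitlines():
        # Split on the first colon only, so timestamps stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        key, value = key.strip(), value.strip()
        if key == "Date/Time":
            # A Date/Time line opens a new event record.
            record = {}
            events.append(record)
        if record is not None:
            record[key] = value
    return [e for e in events if e.get("Description") == "BBU Overheated"]


for event in find_bbu_overheated(SAMPLE):
    print(event.get("Date/Time"), event.get("Component location"))
```

To use it on real data, replace SAMPLE with the contents of majorEventLog.txt read from the unzipped supportdata.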

CAM may also raise the following alarm:

Event Code         : xx.66.1261
Event Type         : ProblemEvent.REC_BATTERY_OVERTEMP
Severity           : 0
Sample Description : Battery X is over temperature.
Probable Cause     :
                     Notes:
                     If it was previously enabled, write caching will be
                     suspended for all volumes while this condition is present.
                    
                     The possible causes for this battery being over it's
                     critical temperature threshold are:
                     A fan has failed
                     An obstruction is blocking the air flow to or from the
                     tray, or
                     The ambient room temperature is too high.
Recommended Action :
                     Check the current alarms for a Fan fault.
                     If one or more Fan alarms exist, follow the recovery steps
                     for that alarm.
                     If there are no Fan alarms, check for obstructions or for
                     any room cooling problems.

To confirm that you are affected by this issue, perform the following steps:

  1. Collect a supportdata bundle from the array by following <Document 1002514.1> for Sun Storage Common Array Manager (CAM) or <Document 1014074.1> for SANtricity.
  2. Unzip the supportdata and open the file stateCaptureData.dmp in a text editor.
  3. Determine which battery (A or B) the over-temperature alarm is reported for.
  4. Search for the line "bidShow(255,0,0,0,0,0,0,0,0,0) on controller A" or "bidShow(255,0,0,0,0,0,0,0,0,0) on controller B", depending on which battery is reported in the alarm.
  5. From this line, scroll down to the line "BID Log".
  6. In the "BID Log" output, check the "Temp" column for a line reporting a temperature of the form "65xxx". Example:

    Battery Charging Log        Gas Gauge Attributes                                                                     Extended Attributes
    Date       Time       State MfgSts B_Mode Temp A_Volt A_Crnt MaxErr RelSOC RemCap FulCap C_Crnt C_Volt BatSts Op_Sts AbsSOC CellV4 CellV3 CellV2 CellV1 FetSts Safety PF_Sts ChgSts
    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    12/20/2011 11:52:49    Idle 0x010A 0x8081   26   6700      0      1     97    627    648      0   7200 0x40C0 0xE441     87      0      0   3348   3353 0x0006 0x0000 0x0000 0x1000
    12/20/2011 15:20:36    Idle 0x010A 0x8081   26   6695      0      1     97    624    648      0   7200 0x40C0 0xE441     86      0      0   3346   3349 0x0006 0x0000 0x0000 0x1000
    12/20/2011 16:48:38    Idle 0x010A 0x8081 65289   6694      0      1     97    623    648      0   7200 0x40C0 0xE441     86      0      0   3346   3348 0x0006 0x0000 0x0000 0x1000
    12/20/2011 16:48:49    Idle 0xC10A 0x8081   26   6694      0      1     97    623    648      0   7200 0x40C0 0xE441     86      0      0   3346   3348 0x0006 0x0000 0x0000 0x1000
    12/20/2011 16:49:00    Idle 0x010A 0x8081   26   6694      0      1     97    623    648      0   7200 0x40C0 0xE441     86      0      0   3346   3348 0x0006 0x0000 0x0000 0x1000
    12/20/2011 17:17:59    Idle 0x010A 0x8081   26   6694      0      1     96    622    648      0   7200 0x40C0 0xE441     86      0      0   3345   3348 0x0006 0x0000 0x0000 0x1000

  7. If you find such a line at the date and time when the alarm was reported, you are affected by the bug covered in this document; proceed to the "Solution" section. Allow for a possible time zone offset between the two logs: in the example above, the BID Log entry at 16:48:38 lines up with the event logged at 17:48:38 CET. If you do not find such a line, you most likely have a real over-temperature issue.
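
The check in steps 6 and 7 can be sketched in Python as follows. This is a rough illustration under stated assumptions, not a supported tool: it assumes the column order shown in the example (Date, Time, State, MfgSts, B_Mode, Temp), that the State value is a single word, and that any Temp value of 65000 or more corresponds to the "65xxx" pattern. The names suspect_temps and SAMPLE_ROWS and the threshold are illustrative.

```python
import re

# Rows copied from the example in this document, shortened to the first
# columns (Date, Time, State, MfgSts, B_Mode, Temp, A_Volt).
SAMPLE_ROWS = """\
    12/20/2011 15:20:36    Idle 0x010A 0x8081   26   6695
    12/20/2011 16:48:38    Idle 0x010A 0x8081 65289   6694
    12/20/2011 16:48:49    Idle 0xC10A 0x8081   26   6694
"""

# Assumption: Temp is the sixth whitespace-separated field of a data row,
# and State (for example "Idle") is always a single token.
ROW = re.compile(
    r"^\s*(\d{2}/\d{2}/\d{4})\s+(\d{2}:\d{2}:\d{2})"  # Date, Time
    r"\s+\S+\s+\S+\s+\S+"                             # State, MfgSts, B_Mode
    r"\s+(\d+)\b"                                     # Temp
)


def suspect_temps(bid_log_text, threshold=65000):
    """Return (date, time, temp) for rows whose Temp is >= threshold."""
    hits = []
    for line in bid_log_text.splitlines():
        match = ROW.match(line)
        if match and int(match.group(3)) >= threshold:
            hits.append((match.group(1), match.group(2), int(match.group(3))))
    return hits


for date, time, temp in suspect_temps(SAMPLE_ROWS):
    print(date, time, temp)
```

Header and separator lines do not match the row pattern and are skipped. The date and time of each hit still have to be compared by hand with the time of the alarm.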


Cause

This problem is caused by spurious I2C errors that occur when the battery state is read. There is no real over-temperature condition.

Solution

If you encounter this issue, you can ignore the alarm, which should clear after a few minutes. The permanent fix for this bug is in firmware 07.80.51.10 and later. This firmware is bundled with CAM 6.9, which you can download by following the instructions in <Document 1296274.1>.
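
To check whether a given firmware level already contains the fix, the version strings can be compared field by field. The Python sketch below assumes that firmware versions are four dot-separated numeric fields that order numerically from left to right. The function names are illustrative, and the version strings other than 07.80.51.10 are made-up examples, not a list of real releases.

```python
# Firmware level that contains the fix, as stated in this document.
FIXED_IN = "07.80.51.10"


def version_tuple(version):
    """Convert a dotted version such as '07.80.51.10' to (7, 80, 51, 10)."""
    return tuple(int(part) for part in version.strip().split("."))


def has_fix(installed, fixed_in=FIXED_IN):
    """Return True if the installed firmware is at or above the fixed level.

    Assumption: versions compare numerically, field by field.
    """
    return version_tuple(installed) >= version_tuple(fixed_in)


# Illustrative version strings only.
for candidate in ("07.80.51.09", "07.80.51.10", "07.84.44.10"):
    print(candidate, has_fix(candidate))
```

Comparing tuples of integers avoids the pitfalls of plain string comparison when fields have different numbers of digits.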

References

<BUG:7123598> - BATTERY TEMP VALUE OUT-OF-RANGE ON 2500-M2/6180 BATTERY BACKUP UNIT
<NOTE:1002514.1> - Collecting Sun Storage Common Array Manager Array Support Data
<NOTE:1014074.1> - Collecting Support Data for Arrays Using Sun StorageTek[TM] SANtricity Storage Manager
<NOTE:1296274.1> - How to Download Common Array Manager (CAM) Software and Patches

Attachments
This solution has no attachment
  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.