Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-73-1021749.1
Update Date:2010-08-18
Keywords:

Solution Type  FAB (standard) Sure

Solution  1021749.1 :   ST9990V R1/L1 Disk Array Frames may not report SSB or SIM errors when power supplies or batteries fail in those frames.  


Related Items
  • Sun Storage 9990V System
Related Categories
  • GCS>Sun Microsystems>Sun FAB>Standard>Controlled Proactive

PreviouslyPublishedAs
274930


Product
Sun StorageTek 9990V System

Date of Resolved Release
24-Dec-2009

Disk Array Frames may not report all Power Supply or Battery failures (see details below).

Impact

When one of the components listed below (DKUPS or Battery) fails in the R1 and/or L1 DKU frame of a Sun StorageTek 9990V System, the failure may not generate an error report.  If the failed component is not replaced and additional components then fail, access to data could be lost.

When these conditions occur, no SSB/LOG is reported and service personnel are not notified.  There is no impact unless the failed component is left in place and either a sufficient quantity of matching redundant components also fail, or electrical power fails or is removed.

Contributing Factors

This issue may occur when all of the following conditions are met:

1. The system is an ST9990V with an attached R1 or L1 frame, and the fixed
   microcode is not installed.

   Note 1: Single Frame ST9990V with disks in R0 is NOT affected.
   Note 2: The lower B4 portion of the R1 DKU frame is NOT affected (Upper B4 only).
   Note 3: L2 and R2 Disk Array frames are NOT affected
   Note 4: ST9985V, ST9990 and ST9985 are NOT affected

2. The installed microcode falls within one of the ranges listed below.

   60-03-03-00/00-M049 - 60-04-20-00/00-M128
   60-05-06-00/00-M108 - 60-05-15-00/00-M131
   60-06-05-00/00-M132 - 60-06-05-00/00-M138

   Note 1: Microcode versions earlier than 60-03-03-00/00-M049 are not impacted.
   Note 2: This list may not cover special releases intended for any specific customer
           or test code.  These are also affected if they are below the fixed versions
           listed in the Resolution section below.

3. A failure occurs in any of the components listed below:

   a. DKU-R1 Disk Array Frame

     i. DKUPS:  DKUPS-R12, DKUPS-R13.
     ii. BATTERY:  BATTERY-UR14, BATTERY-UR15, BATTERY-UR16, BATTERY-UR17.

   b. DKU-L1 Disk Array Frame (All power components):

     i. DKUPS:  DKUPS-L10, DKUPS-L11, DKUPS-L12, DKUPS-L13.
     ii. BATTERY:  BATTERY-UL10, BATTERY-UL11, BATTERY-UL12, BATTERY-UL13,
          BATTERY-UL14, BATTERY-UL15, BATTERY-UL16, BATTERY-UL17.
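
The three conditions above can be sketched as a simple check.  This is an illustration only, not a Sun or HDS tool: the function names are hypothetical, and the sketch assumes that microcode levels can be ordered by comparing their numeric fields from left to right, which this FAB does not state.  It covers only the ranges listed above, so special releases and test code (see Note 2) still need to be checked by hand.

```python
import re

# Condition 2: affected microcode ranges as listed in this FAB (inclusive).
AFFECTED_RANGES = [
    ("60-03-03-00/00-M049", "60-04-20-00/00-M128"),
    ("60-05-06-00/00-M108", "60-05-15-00/00-M131"),
    ("60-06-05-00/00-M132", "60-06-05-00/00-M138"),
]

# Condition 3: components whose failure may go unreported, keyed by frame.
AFFECTED_COMPONENTS = {
    "R1": {"DKUPS-R12", "DKUPS-R13"}
          | {f"BATTERY-UR{n}" for n in range(14, 18)},
    "L1": {f"DKUPS-L{n}" for n in range(10, 14)}
          | {f"BATTERY-UL{n}" for n in range(10, 18)},
}

_VERSION_RE = re.compile(r"(\d+)-(\d+)-(\d+)-(\d+)/(\d+)-M(\d+)")


def parse_microcode(version):
    """Split a level such as '60-05-15-00/00-M131' into a tuple of ints."""
    match = _VERSION_RE.fullmatch(version.strip())
    if match is None:
        raise ValueError(f"unrecognized microcode level: {version!r}")
    return tuple(int(group) for group in match.groups())


def in_listed_range(version):
    """True if the level lies inside one of the ranges listed in this FAB.

    Assumption: levels order by their numeric fields, left to right.
    """
    key = parse_microcode(version)
    return any(parse_microcode(low) <= key <= parse_microcode(high)
               for low, high in AFFECTED_RANGES)


def failure_may_go_unreported(version, frame, component):
    """Combine conditions 1-3 for one component in an attached frame."""
    name = component.replace(" ", "").upper()
    affected = AFFECTED_COMPONENTS.get(frame.upper(), set())
    return in_listed_range(version) and name in affected
```

For example, failure_may_go_unreported("60-05-10-00/00-M120", "R1", "DKUPS-R12") evaluates to True under these assumptions, while the same call with the fixed level 60-06-06-00/00-M141 evaluates to False.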

Note: A failure can be verified only visually, by checking the LEDs on the component.

For diagrams, refer to the "conditions of occurrence" section of the HDS alert at the URL below:

   http://se9990.eng/tech_docs/fab/dku_frame_ps_issue/USPV_062735R3.pdf

Also, refer to the "location section" of the Maintenance Manual for DKU frame details at the URL below:

   http://pts-storage.west/products/T99x0/docs/maint/9990v/03loc.pdf

Symptoms

Due to a microcode bug affecting the ST9990V, the Disk Array Frames (DKU) designated as R1 and/or L1 may not report all Power Supply or Battery failures.  No SSB/Logs are reported for these failures.  Because field personnel are not notified of these failures, failed components may go unreplaced.  This can lead to loss of access to the sub-system in the event that a second redundant component fails.

In the event that batteries in these frames fail and are not replaced, the destage of data from cache to the parity groups may not complete, and the data that did not destage will be retained in cache memory under control unit battery power.

It is important to understand that there are no problems with the power components themselves, just a microcode bug preventing the reporting of any failures should they occur.

There is no way to determine that these components have failed other than by visual inspection of the operating LEDs on these components.

Note: Neither the SVP maintenance display nor a dump will show any indication that
      the noted components have failed.

Root Cause

Due to a microcode bug, the power recovery routines may access the wrong information, which prevents the sub-system from reporting the power failures.

Corrective Action

Workaround:

No workaround available - see Resolution section below.

Resolution:

- Prior to Microcode Update

1. An initial visual inspection of the LEDs should be performed as soon as possible
   on all ST9990V systems containing R1 or L1 frames for the components listed below.

This inspection should be performed prior to any maintenance activity performed by service personnel or prior to any electrical maintenance performed by an electrician or customer. 

 i) DKU-R1 Disk Array Frame:
     
    Power supplies: DKUPS-R12, DKUPS-R13.
    BATTERY: BATTERY-UR14, BATTERY-UR15, BATTERY-UR16, BATTERY-UR17.

 ii) DKU-L1 Disk Array Frame:

     Power Supplies: DKUPS-L10, DKUPS-L11, DKUPS-L12, DKUPS-L13.
     BATTERY: BATTERY-UL10, BATTERY-UL11, BATTERY-UL12, BATTERY-UL13, BATTERY-UL14,
              BATTERY-UL15, BATTERY-UL16, BATTERY-UL17.

Each Power Supply has three LEDs; all three lit indicates a good supply.  Each Battery has one green LED and one amber charging LED when it is switched ON.  Note that NOT all Array Frames will have batteries.  This depends upon the features installed and the decision to implement Data Destage.

For diagrams, refer to the "conditions of occurrence" section of the HDS alert at the link below:

   http://se9990.eng/tech_docs/fab/dku_frame_ps_issue/USPV_062735R3.pdf

Also, refer to the "location section" of the Maintenance Manual for DKU frame details:

   http://pts-storage.west/products/T99x0/docs/maint/9990v/03loc.pdf

2. Based on the visual inspection of the R1 and L1 frames described above, take
   the actions below (batteries first, then power supplies):

Batteries:

Replace every battery whose GREEN LED is off (not all DKU frames will have batteries), following the replacement procedure in the Maintenance Manual.

Power Supplies:

1. If only one power supply in a frame has LEDs that are off, replace it by
   following the replacement procedure described in the Maintenance Manual.

2. If some LEDs are off in both redundant power supplies, do not replace them without
   first contacting the backline TSC for guidance.

Note that each power supply (PS) comprises three independent internal supplies, each with its own power feed and output, and there is one LED for each of these internal supplies.  One or more of the internal supplies can fail (LED off) while the remaining ones continue to work.  Therefore, each power supply in a redundant pair may have one or more broken internal supplies while the pair together still supplies adequate power.  Depending upon the number of HDDs, it is possible for two internal supplies to be broken in each power supply of a redundant pair.

If you encounter a case where some LEDs are off in both redundant supplies, then replacing one power supply may cause an outage.  In this case escalate to the backline TSC for guidance.

3. If all LEDs are ON in the above-mentioned power supplies and batteries in the R1/L1 frames, plan to update to the fixed microcode (or above) as soon as possible, per the Microcode Update section below.
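
The inspection steps above can be summarized as a small decision sketch.  It is illustrative only: the function name and the input layout are hypothetical, and because this FAB does not state which power supplies form a redundant pair, the sketch assumes the cautious reading and escalates whenever more than one power supply in a frame shows an LED off.

```python
def plan_frame_actions(battery_green_led, power_supply_leds):
    """Return the ordered actions for one R1 or L1 frame.

    battery_green_led: dict of battery name -> True if its green LED is on.
    power_supply_leds: dict of power supply name -> tuple of three booleans,
                       one per LED (True = LED on).
    """
    actions = []

    # Batteries are handled first: replace any battery with the green LED off.
    for name in sorted(battery_green_led):
        if not battery_green_led[name]:
            actions.append(f"replace battery {name}")

    # A power supply is degraded if any of its three LEDs is off.
    degraded = sorted(name for name, leds in power_supply_leds.items()
                      if not all(leds))

    if len(degraded) == 1:
        actions.append(f"replace power supply {degraded[0]}")
    elif len(degraded) > 1:
        # Assumption: pairing is unknown, so treat this as the "both
        # redundant supplies" case and do not replace anything yet.
        actions.append("escalate to backline TSC: " + ", ".join(degraded))

    # Step 3: everything is lit, so plan the microcode update.
    if not actions:
        actions.append("plan update to fixed microcode")
    return actions
```

A frame in which BATTERY-UR15 has its green LED off and DKUPS-R13 has one LED off would yield a battery replacement followed by a power supply replacement, in that order.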

- Microcode Update

1. Fixed V06 Microcode: 60-06-06-00/00-M141 or above.

   Systems currently running 60-06-05-00/00-M138 or below can be upgraded to
   the above fixed code.

2. Fixed V05 Microcode: 60-05-16-00/00-M139

   Systems currently running 60-05-15-00/00-M131 or below can be upgraded to
   the above fixed code.
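
As a rough aid, the fixed levels above can be turned into a check of whether a system still needs the update.  This is a sketch under stated assumptions, not a supported tool: the function names are hypothetical, levels are assumed to order by their numeric fields from left to right, and the second field is assumed to identify the V05 (05) or V06 (06) line.  Levels earlier than 60-03-03-00/00-M049 are treated as not impacted, as stated in Contributing Factors.

```python
import re

FIRST_AFFECTED = "60-03-03-00/00-M049"   # earlier levels are not impacted
FIXED_V05 = "60-05-16-00/00-M139"        # first fixed V05 level
FIXED_V06 = "60-06-06-00/00-M141"        # first fixed V06 level

_VERSION_RE = re.compile(r"(\d+)-(\d+)-(\d+)-(\d+)/(\d+)-M(\d+)")


def parse_microcode(version):
    """Split a level such as '60-06-05-00/00-M138' into a tuple of ints."""
    match = _VERSION_RE.fullmatch(version.strip())
    if match is None:
        raise ValueError(f"unrecognized microcode level: {version!r}")
    return tuple(int(group) for group in match.groups())


def needs_fixed_microcode(version):
    """True if the level is impacted and below the first fixed level.

    Assumption: a V05 system (second field == 5) is compared against the
    V05 fix; every other impacted level is compared against the V06 fix.
    """
    key = parse_microcode(version)
    if key < parse_microcode(FIRST_AFFECTED):
        return False
    if key[1] == 5:
        return key < parse_microcode(FIXED_V05)
    return key < parse_microcode(FIXED_V06)
```

Under these assumptions, needs_fixed_microcode("60-06-05-00/00-M138") is True and needs_fixed_microcode("60-06-06-00/00-M141") is False.  Special releases and test code should still be confirmed with backline support.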

The microcode version(s) listed above may no longer be available or supported in the future.  They are listed here only because they were the first versions containing the fix for the issue identified in this FAB.  Please obtain the highest available microcode version that has inherited this change and is appropriate for your machine's environment.

Contact backline support for additional help as needed.

Comments

This FAB is initially being released with a Preliminary Customer List, available internally only via the URL below...

  http://sdpsweb.central/FIN_FCO/FAB/274930/CustomerList.ods

...and the Sun Legal approved Customer Letter below, to be used to communicate this
issue to your customers as needed:

    http://sdpsweb.central/FIN_FCO/FAB/274930/SPE/CustomerLetter.odt

Once a final Customer List is available it will be entered into SunFIT and this FAB will be re-issued notifying the field to begin tracking implementation in SunFIT.

Until then, please track remediation of this FAB manually with the intent to update SunFIT at a later date.

References:

Subscribe to up-to-date ST9900 Alerts via the link below:

  http://sejsc.ebay/alerts_via_alias.html

For ST9900 Maintenance Manuals, go to:

  http://pts-storage.west/products/T99x0/documentation.html


Related URL(s):

HDS Alert:  http://se9990.eng/tech_docs/fab/dku_frame_ps_issue/USPV_062735R3.pdf





Modification History
Changes made since initial Publication.

25-Jan-2010
  • In Resolution section changed Microcode Update section from microcode level 60-06-xx-xx/xx to 60-06-06-00/00-M141.

Internal Contributor/submitter
[email protected], [email protected]

Internal Eng Responsible Engineer
[email protected]
Responsible Manager: [email protected]

Internal Services Knowledge Engineer
[email protected]

Internal Eng Business Unit Group
NWS (Storage)

Internal Sun Alert & FAB Admin Info
23-Dec-2009: Completed draft and sent to Extended Review.
24-Dec-2009: No feedback from Ext Rvw - sending to Publish.


Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.