Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-73-1278026.1
Update Date:2011-02-10
Keywords:

Solution Type  FAB (standard) Sure

Solution  1278026.1 :   ST9990V and ST9985V USPV/VM DKS2F-K450FC HDDs have higher exposure to dual HDD failures with down-level DKU Code and Hitachi Dynamic Provisioning (HDP) on the same machine.  


Related Items
  • Sun Storage 9990V System
  • Sun Storage 9985V System
Related Categories
  • GCS>Sun Microsystems>Sun FAB>Standard>Controlled Proactive




In this Document
  Symptoms
  Changes
  Cause
  Solution


Oracle Confidential (PARTNER). Do not distribute to customers
Reason: FABs available to Internals and Partners only

Applies to:

Sun Storage 9985V System - Version: Not Applicable to Not Applicable - Release: N/A to N/A
Sun Storage 9990V System - Version: Not Applicable to Not Applicable - Release: N/A to N/A
Information in this document applies to any platform.
__________

Affected Parts:

371-3059 - 72GB Disk Drive/15kmin-1
371-3060 - 146GB Disk Drive/15kmin-1
371-3637 - 300GB Disk Drive/15kmin-1
371-3061 - 300GB Disk Drive/10kmin-1
371-4206 - 400GB Disk Drive/10kmin-1
371-4474 - 450GB Disk Drive/15Kmin-1
371-4867 - 600GB Disk Drive/15Kmin-1

Symptoms

There have been several instances of down-level DKS2F-K450FC HDDs experiencing high failure rates, which led to dual HDD failures in the same parity group.

Down-level firmware for DKS2F-K450FC HDDs can lead to dual HDD failures in the same parity group. This can cause loss of data for the parity group and, in the worst case, if the parity group is part of an HDP pool, loss of the entire pool.

Reference HDS Alert# 64803 for more information.

Impact:

Elevated disk drive failure rates can increase the chances of a dual drive failure within the same parity group. Because of this potential issue, please have service personnel upgrade the disk microcode as soon as possible.
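
The effect of an elevated failure rate on dual failures can be illustrated with a simple binomial estimate. This is only a sketch: it assumes drives fail independently with the same per-drive probability over some exposure window, and the group size (8 drives) and the two probabilities are hypothetical values chosen for illustration, not measured data.

```python
from math import comb


def p_at_least_two_failures(n_drives: int, p_fail: float) -> float:
    """Probability that two or more of n_drives fail in the same window.

    Assumes independent failures with identical per-drive probability
    p_fail (an illustrative simplification).
    """
    p_none = (1 - p_fail) ** n_drives
    p_one = comb(n_drives, 1) * p_fail * (1 - p_fail) ** (n_drives - 1)
    return 1 - p_none - p_one


# Hypothetical numbers: an 8-drive parity group, baseline vs. elevated rate.
baseline = p_at_least_two_failures(8, 0.001)
elevated = p_at_least_two_failures(8, 0.004)
print(f"baseline: {baseline:.2e}  elevated: {elevated:.2e}")
```

Under these assumptions the dual-failure probability grows roughly with the square of the per-drive failure probability, so a fourfold increase in drive failure rate raises the dual-failure exposure by well over tenfold.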

This can lead to loss of data for the parity group and in the worst case, if the parity group is part of an HDP pool, it could lead to loss of the entire pool.

Disk model DKS2F-K450FC has seen a higher failure rate than other disk models when the disk microcode is lower than the recommended code level (see the Resolution section).

To prevent this, it is critical that the firmware is upgraded as soon as possible. Note that we have not experienced this issue on other HDDs or the DKS2F-K450FC model HDD with updated DKU code.

Changes

Contributing Factors:

This problem may occur when all of the following conditions are met:

1. One of the systems listed in the "Applies to" section above is in use.
2. The system is running a microcode version lower than the fixed version,
    and the installed disk drives have not had their firmware upgraded.

All R600 FC disk drives with the following model prefixes could be affected:

   DKS2E-K, DKS2F-K, DKS2G-K, DKS2E-J

Note: SATA drives and SSDs are not affected.
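
As a minimal sketch, the prefix rule above can be applied to a list of drive model strings. The function name and the non-matching sample model are hypothetical; only the four prefixes and the DKS2F-K450FC model come from this document. A prefix match only means the drive is potentially affected; the firmware level still has to be checked.

```python
# Model prefixes listed in this FAB as potentially affected.
AFFECTED_PREFIXES = ("DKS2E-K", "DKS2F-K", "DKS2G-K", "DKS2E-J")


def is_potentially_affected(model: str) -> bool:
    """Return True if the drive model starts with an affected prefix.

    Hypothetical helper: it checks the model prefix only, not the
    firmware level installed on the drive.
    """
    return model.strip().upper().startswith(AFFECTED_PREFIXES)


# DKS2F-K450FC is named in this FAB; the second model is a made-up example.
for model in ("DKS2F-K450FC", "EXAMPLE-SATA-1"):
    print(model, is_potentially_affected(model))
```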

There have been several instances of down-level DKS2F-K450FC HDDs experiencing high failure rates, which led to dual HDD failures in the same parity group.

Using Hitachi Dynamic Provisioning (HDP) on a machine with a larger number of DKS2F-K450FC HDDs increases the potential for this failure. Therefore, if any of the HDP pools on these machines are made up of these down-level HDDs, there is the potential to lose the HDP pool.

Machines that have a larger percentage of DKS2F-K450FC HDDs are more at risk than those that have only a few of these drives.

If you need to set priorities, please target the machines on this list that are using HDP followed by those not using HDP.
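
The prioritization described above could be sketched as a sort: machines using HDP first, then, within each group, machines with a larger share of DKS2F-K450FC HDDs. The record layout, field names, and sample serials below are hypothetical; using the drive share as a tie-breaker is an interpretation of the risk statement above, not a rule stated in this FAB.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Machine:
    serial: str        # hypothetical identifier
    uses_hdp: bool     # True if Hitachi Dynamic Provisioning is in use
    total_hdds: int    # all HDDs installed in the machine
    k450fc_hdds: int   # down-level DKS2F-K450FC HDDs installed

    @property
    def k450fc_share(self) -> float:
        """Fraction of installed HDDs that are DKS2F-K450FC."""
        return self.k450fc_hdds / self.total_hdds if self.total_hdds else 0.0


def upgrade_order(machines: List[Machine]) -> List[Machine]:
    """HDP machines first, then by descending DKS2F-K450FC share."""
    return sorted(machines, key=lambda m: (not m.uses_hdp, -m.k450fc_share))


fleet = [
    Machine("A", uses_hdp=False, total_hdds=100, k450fc_hdds=90),
    Machine("B", uses_hdp=True, total_hdds=100, k450fc_hdds=10),
    Machine("C", uses_hdp=True, total_hdds=200, k450fc_hdds=120),
]
print([m.serial for m in upgrade_order(fleet)])
```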

Reference the attached for a matrix of affected code and code which incorporates the fix.

Cause

Root Cause:

Engineering has discovered a problem with a firmware change made to the disk drive models listed above, which can be installed in the R600 Hitachi Data Systems storage models listed above.

Solution

Workaround

No workaround available - see Resolution section below.

Resolution

To prevent this, it is critical that the firmware is upgraded as soon as possible.
The recommended corrective action is to upgrade the microcode. The microcode (MC) level with the fix incorporated is:

   60-07-56-00/00-M198 (or higher)
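
A minimal sketch of an "or higher" check against this level follows. It rests on an assumption that is not confirmed by this FAB: that levels can be ordered by comparing the four dash-separated numeric fields before the slash, ignoring whatever follows the slash. The sample levels other than the minimum are hypothetical strings. The "How to Identify minimum DKU Code Level" document listed under References remains the authoritative procedure.

```python
from typing import Tuple

# Minimum fixed microcode level stated in this FAB.
MIN_FIXED = "60-07-56-00/00-M198"


def mc_key(level: str) -> Tuple[int, ...]:
    """Turn '60-07-56-00/00-M198' into (60, 7, 56, 0).

    Assumption: only the numeric fields before the '/' determine ordering.
    """
    main = level.strip().split("/", 1)[0]
    return tuple(int(field) for field in main.split("-"))


def meets_minimum(level: str, minimum: str = MIN_FIXED) -> bool:
    """Return True if level is at or above the minimum fixed level."""
    return mc_key(level) >= mc_key(minimum)


# Hypothetical installed levels, for illustration only.
for installed in ("60-06-20-00/00", "60-07-56-00/00", "60-08-01-00/00"):
    print(installed, meets_minimum(installed))
```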

The enhancements in this microcode can cut the time it takes to recover a DP pool that has experienced a dual HDD failure by 70 percent. The enhanced microcode eliminates a number of previously required recovery steps, greatly reducing the time required to restore a DP pool. We therefore consider it mandatory that all USPV/VM machines running Dynamic Provisioning run at least this microcode level.

For a list of remotely connected systems and their upgrade priority, please reference the following list of systems:

   http://se9990.eng/tech_docs/fab/DKU_Upgrade/DKU_Upgrade_CustList.xls

Comments

To subscribe to up-to-date ST9900 alerts, refer to the link below:

  http://sejsc.us.oracle.com/alerts_via_alias.html

For ST9900 Maintenance Manuals, go to:

  http://pts-storage.us.oracle.com/products/T99x0/documentation.html

References:

Related URL(s):

HDS Alert -

   http://se9990.eng/tech_docs/fab/DKU_Upgrade/Alert_065418-2.pdf

How to Identify minimum DKU Code Level Doc -

   http://se9990.eng/tech_docs/fab/DKU_Upgrade/FAB_IdentifyDKUCode_66539.pdf


For information about FAB documents, their release processes, implementation strategies and billing information, click here.

In addition to the above you may email:

   [email protected]

Contacts:

Contributor: [email protected]
Responsible Engineer: [email protected]
Responsible Manager: [email protected]
Business Unit Group: NWS (Storage)

Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.