Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition
Solution Type FAB (standard) Sure Solution

1019622.1 : Possible failure of multiple HDDs of a certain type within the same Parity Group in ST9990V, ST9985V, ST9990 and ST9985.
PreviouslyPublishedAs 242466

Product
Sun StorageTek 9990 System
Sun StorageTek 9985 System
Sun StorageTek 9990V System
Sun StorageTek 9985V System

Date of Resolved Release 29-Sep-2008

Possibility of multiple HDD failures of a certain type within the same Parity Group during new installation, maintenance, or upgrade due to a Microcode bug (see details below).

Affected Parts:
Affected HDD models: DKR2G-K72FC, DKR2G-K146FC, DKR2G-K300FC

Impact
Possibility of multiple HDD failures in the same Parity Group, causing loss of access to the LDEVs in the Parity Group (all LDEVs in that PG will be in blocked status).

Contributing Factors
Due to a Microcode bug on the above listed products, there is a possibility that DKR2G-K series (K300FC, K146FC and K72FC) HDD(s) will get blocked.
There is no bug open in Sun for this issue; HDS owns this bug.

Symptoms
Blocked DKR2G-K HDD(s) after one of the following HDD Initial Reset events occurs on these HDDs, when it is followed by a certain sequence of write/read accesses as detailed in the "Root Cause" section...
Root Cause
Due to a microcode bug, there is a possibility that DKR2G-K series HDD(s) will become blocked. The HDD blockade occurs because the HDD becomes unresponsive when an HDD initial reset is followed by a specific Write/Read access pattern with a specific data length.

Note: HDD initial reset: an initial reset is performed when the HDD power supply is turned on, when an HDD micro-program exchange occurs, or when an HDD is replaced using hot plugging (upgrading, reconfiguration, replacement).

Condition of Occurrence:
When all of the following conditions are met, HDD blockade occurs due to an HDD being unresponsive.
(1) HDD model: DKR2G-K300FC/K146FC/K72FC
(2) A write command is issued right after an HDD initial reset is performed.
(3) The data length of a read command, issued right after the write command in (2), is longer than (229)hex sectors, and the read hits the leading part of the data of the write command in (2).
Note that this issue does not occur in the case of other hits, like all partial hits, intermediate partial hits, or a terminal partial hit.

All impacted systems shipping from Sun Manufacturing beginning on September 23, 2008 contain the new microcode to address this issue.

Corrective Action
Workaround:
If multiple HDDs fail in the same parity group due to this bug, DO NOT attempt to self replace or replace ANY of the affected HDDs. The first action that should be taken is to perform a Normal Restore procedure of the affected LDEVs. If the HDD failures are due to this Microcode bug, then a Normal Restore procedure should restore the LDEVs on the affected parity group to Correction Access status, thereby giving the customer back access to the LDEVs on the parity group. If for any reason the Normal Restore does not work, again, DO NOT attempt to self replace or replace ANY of the affected HDDs, but instead contact TSC support for additional guidance. It is recommended to follow the steps in the Resolution section below.
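For reference, the Condition of Occurrence listed under Root Cause can be read as a simple all-conditions-must-hold predicate. The sketch below is illustrative only: the class, field and function names are hypothetical and do not correspond to any SVP or microcode interface, and the only values taken from this FAB are the three affected HDD models and the (229)hex sector threshold.

```python
from dataclasses import dataclass

# Affected HDD models, as listed in this FAB.
AFFECTED_MODELS = {"DKR2G-K72FC", "DKR2G-K146FC", "DKR2G-K300FC"}

# Condition (3): read data length must be LONGER than (229)hex sectors.
READ_LENGTH_THRESHOLD_SECTORS = 0x229  # 553 decimal


@dataclass
class AccessPattern:
    """Hypothetical description of the first accesses after an initial reset."""
    hdd_model: str
    write_right_after_initial_reset: bool   # condition (2)
    read_right_after_that_write: bool       # first half of condition (3)
    read_length_sectors: int                # data length of that read
    read_hits_leading_part_of_write: bool   # leading-part hit, not other hits


def meets_blockade_conditions(p: AccessPattern) -> bool:
    """Return True only when conditions (1), (2) and (3) all hold."""
    return (
        p.hdd_model in AFFECTED_MODELS
        and p.write_right_after_initial_reset
        and p.read_right_after_that_write
        and p.read_length_sectors > READ_LENGTH_THRESHOLD_SECTORS
        and p.read_hits_leading_part_of_write
    )
```

A read of exactly (229)hex sectors does not satisfy condition (3) as worded, because the FAB says "longer than"; the sketch uses a strict comparison for that reason.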
The issue can be avoided by following either item A or B below:
A. Run the LDEV Verify function every time the system is powered on. (The Verify check can be run at the SVP: select maintenance > ldev tab > select each ECC group.)
   - Run it on only the first LDEV in each Parity Group, for approximately one minute.
   - After one minute has passed, cancel the verify.
   - Then move on to verify the next parity group.
B. Run the DCR (Dynamic Cache Residency) function on the first LDEV in each Parity Group. However, additional Cache must be installed to use DCR.
Refer to the respective Maintenance Manuals for details.

Resolution:
Please upgrade all DKR2G-K disk drive DKU micro-program versions to 00-00-AZ or higher as soon as possible, either by upgrading the entire Microcode set to one of the versions listed below that contain the modified DKU micro-program, or by performing the optional DKU Only micro-program load.

Microcode Sets Containing Fixed DKU Version 00-00-AZ.
For ST9990V and ST9985V:
  60-03-27-00/00-M076 or higher
  60-03-07-00/00-M075
For ST9990 and ST9985:
  50-09-76-00/00-M251 or higher

For instructions on acquiring updated Microcode, reference How To doc id 1018586.1: Sun StorEdge[TM] 9900: Requesting Microcode, Software and License Updates. This knowledge asset can be accessed via the URL below:
https://support.us.oracle.com/oip/faces/secure/km/DocumentDisplay.jspx?id=1018586.1

************************************************************************
Optional DKU Only Micro-Program Load
Optionally, you may obtain one of the DKU Only Code Sets listed below through the standard online ordering process (similar to system Microcode ordering) and load just the DKU micro-program into the DKR2G-K HDDs. Do not load the DKU code from other Microcode sets.
Storage Model          Product Description    Product Code (part number) To Order
ST9990V and ST9985V    DKU 00-00-AZ-H036      MC-USPV-045
ST9990 and ST9985      DKU 00-00-AZ-H053      MC-USP-NSC-079

Note that this can be done only if one of the pre-requisite Microcode sets below is already installed in the subsystem.

For ST9990V and ST9985V:
Microcode pre-requisite levels supporting online DKU only code loading to DKR2G-K FC drives (per ECN noted). Minimum version is:
a) 60-02-48-00/12 and higher
b) 60-02-31-00/00 and higher (but not 60-02-48-00/00 or 60-02-48-00/10).
c) 60-02-27-00/00 and higher if the system does not have the conditions below:
   - Will be performing DKU exchange to an HDD in which I/O is being executed.
   - All HDD models are affected except SATA.
   - In one backend loop, six or more HDDs are installed and DKU updates will be to at least one of these HDDs.

For ST9990 and ST9985:
50-09-70-00/00 and higher is the minimum Microcode support required to load DKU code into DKR2G-K HDDs.

Comments
For more details, review the "Related URL(s)" listed below.

References:
FAB: 1019433.1
Escalation ID: 66057816

Related URL(s):
http://se9990.eng/mc/mc.html - ST9900 Current Microcode Matrix
http://sejsc.ebay/alerts_via_alias.html - Subscribe to ST9900 important alerts
http://se9990.eng/ecn.html - ST9900 ECNs and FCBs
http://pts-storage.west/products/T99x0/documentation.html - Maintenance Manuals
http://sccc-storage:5071/cgi-bin/microcode/request.cgi - ST9900 Microcode CD and DKU code request tool

For information about FAB documents, their release processes, implementation strategies and billing information, go to the following URL:
In addition to the above you may email:

Internal Contributor/submitter [email protected]
Internal Eng Responsible Engineer [email protected]
Responsible Manager: [email protected]
Internal Services Knowledge Engineer [email protected]
Internal Eng Business Unit Group NWS (Storage)

Internal Sun Alert & FAB Admin Info
22-Sep-2008: Finalized draft and sent to Extended Review.
24-Sep-2008: Put on hold pending agreement between Services and PTeam on Implementation.
29-Sep-2008: Multiple modifications provided by submitter - sending to Publish.
02-Dec-2009: Corrected Product Name to swoRDFish inconsistency.

Attachments
This solution has no attachment