Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition
Solution Type: Problem Resolution

Sure Solution 1018045.1: Proper sequence of replacing disks in the event of disabled/substituted disks on a Sun StorEdge[TM] T3 RAID 5 configuration
Previously Published As: 229351

Symptoms

When two disks indicate a problem on a RAID 5 configuration on one of the following arrays:

    Sun StorEdge[TM] T3
    Sun StorEdge[TM] T3+
    Sun StorEdge[TM] 6x20

the proper disk replacement sequence must be followed to prevent data loss. If one disk shows the status "disabled" and the other shows "substituted", then, to keep the volume intact and mounted, the disk showing "disabled" must be replaced first, followed by the disk showing "substituted". Replacing the disks in the wrong order can leave the volume dead and result in the loss of data.

Detailed Description of the problem
===================================

(This has been reproduced in the lab and is described in Bug ID 4751163.)

1) Output from a stable Sun StorEdge[TM] T3. Both volumes v0 and v1 show "mounted" and all drives report status "0", meaning healthy:

    T3B Release 2.01.00 2002/03/22 18:35:03 (10.1.1.20)
    Copyright (C) 1997-2001 Sun Microsystems, Inc.
    All Rights Reserved.

    t3a:/:<1>vol stat
    v0          u1d1  u1d2  u1d3  u1d4  u1d5  u1d6  u1d7  u1d8  u1d9
    mounted     0     0     0     0     0     0     0     0     0
    v1          u2d1  u2d2  u2d3  u2d4  u2d5  u2d6  u2d7  u2d8  u2d9
    mounted     0     0     0     0     0     0     0     0     0

2) Disable one of the drives. This fails the drive but does not activate the hot spare:

    t3a:/:<1>vol stat
    v0          u1d1  u1d2  u1d3  u1d4  u1d5  u1d6  u1d7  u1d8  u1d9
    mounted     4D    0     0     0     0     0     0     0     0
    v1          u2d1  u2d2  u2d3  u2d4  u2d5  u2d6  u2d7  u2d8  u2d9
    mounted     0     0     0     0     0     0     0     0     0

The RAID 5 volume is now degraded but still available. Note that drive u1d1 has a status of "4D", indicating that the drive has failed.

3) Now "substitute" drive u1d4, copying its contents to the hot spare, so the status shows:

    v0          u1d1  u1d2  u1d3  u1d4  u1d5  u1d6  u1d7  u1d8  u1d9
    mounted     4D    0     0     0S    0     0     0     0     0
    v1          u2d1  u2d2  u2d3  u2d4  u2d5  u2d6  u2d7  u2d8  u2d9
    mounted     0     0     0     0     0     0     0     0     0

Drive u1d4 is "substituted" and the degraded volume now consists of drives u1d2, u1d3, u1d5, u1d6, u1d7, u1d8 and u1d9 (the hot spare). The volume is still available.

4) If drive u1d4 is replaced in this condition, the status becomes:

    t3a:/:<1>vol stat
    v0          u1d1  u1d2  u1d3  u1d4  u1d5  u1d6  u1d7  u1d8  u1d9
    unmounted   4D    0     0     4S    0     0     0     0     0
    v1          u2d1  u2d2  u2d3  u2d4  u2d5  u2d6  u2d7  u2d8  u2d9
    mounted     0     0     0     0     0     0     0     0     0

Volume v0 is immediately "unmounted", drive u1d4 shows as failed (status "4S"), and the volume becomes unavailable to the host. Ideally we should still have a degraded RAID 5 volume with a parity group capable of rebuilding the original failed drive, but instead we now have a crashed volume.

Resolution

In this condition, first replace the "failed" drive, i.e. u1d1 in the example above, followed by u1d4, the substituted drive. The volume then stays mounted and available.
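The behaviour above reduces to a simple rule: the controller unmounts a RAID 5 volume as soon as two of its member drives carry a failed status ("4D" or "4S"), even when one of them has already been copied to the hot spare. The Python sketch below is a simplified, hypothetical model of that rule; the class and method names (Raid5Volume, disable, substitute, pull_and_replace) are invented for illustration and this is not T3 firmware code. It shows why replacing the disabled drive first keeps the volume mounted, while replacing the substituted drive first does not.

    # Toy model of the drive states reported by "vol stat":
    #   "0"  healthy,  "4D" disabled/failed,  "0S" substituted to the
    #   hot spare,  "4S" a substituted drive that has been pulled.
    # Illustration of the documented behaviour only, not firmware code.

    FAILED_STATES = {"4D", "4S"}

    class Raid5Volume:
        def __init__(self, drives):
            self.drives = dict(drives)   # e.g. {"u1d1": "0", ..., "u1d9": "0"}
            self.mounted = True

        def _update_mount_state(self):
            # Behaviour per Bug ID 4751163: two drives in a failed state
            # unmount the volume, even though the hot spare already
            # holds the substituted drive's data.
            failed = [d for d, s in self.drives.items() if s in FAILED_STATES]
            self.mounted = len(failed) < 2

        def disable(self, drive):
            self.drives[drive] = "4D"    # drive failed, hot spare not active
            self._update_mount_state()

        def substitute(self, drive):
            self.drives[drive] = "0S"    # contents copied to the hot spare
            self._update_mount_state()

        def pull_and_replace(self, drive):
            if self.drives[drive] == "0S":
                # Pulling a substituted drive flags it "4S" until the hot
                # spare's contents can be copied back to the replacement.
                self.drives[drive] = "4S"
            else:
                self.drives[drive] = "0"  # replacement rebuilt from parity
            self._update_mount_state()

    drives = {f"u1d{i}": "0" for i in range(1, 10)}

    # Wrong order: replace the substituted drive u1d4 first (step 4 above).
    bad = Raid5Volume(drives)
    bad.disable("u1d1")             # u1d1 -> 4D, degraded but mounted
    bad.substitute("u1d4")          # u1d4 -> 0S, data now on the hot spare
    bad.pull_and_replace("u1d4")    # u1d4 -> 4S: two failed drives
    print("wrong order, v0 mounted:", bad.mounted)    # False: volume crashed

    # Correct order: replace the disabled drive u1d1 first.
    good = Raid5Volume(drives)
    good.disable("u1d1")
    good.substitute("u1d4")
    good.pull_and_replace("u1d1")   # u1d1 rebuilt from parity -> "0"
    good.pull_and_replace("u1d4")   # u1d4 -> 4S, but the only failure
    print("correct order, v0 mounted:", good.mounted) # True: volume stays up

Run as written, the sketch prints False for the wrong order and True for the correct one, mirroring the "unmounted" state shown in step 4 and the mounted, available volume described in the Resolution.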
Additional Information

For details on the disk states for firmware versions 2.x and 3.x, please see Technical Instruction <Document: 1012433.1> Disk States from Vol Stat output using 3.x firmware.

Product

Sun StorageTek 6120/6320 Controller Firmware 3.2
Sun StorageTek T3 Multi-Platform 1.1
Sun StorageTek T3 Array
Sun StorageTek T3+/6X20 Controller Firmware 3.1
Sun StorageTek T3+ Array Controller FW 2.1
Sun StorageTek T3+ Array

Internal Comments

For internal Sun use only. Please see Bug ID 4751163 and case numbers 63159110 and 37206400.

T3, T3+, T3B, double failure

Previously Published As: 79384

Change History

Date: 2004-12-02  User Name: 71396  Action: Approved  Version: 3
Comment: Reviewed document, publishing.
Date: 2004-12-02  User Name: 71396  Action: Accept  Version: 0
Date: 2004-12-02  User Name: 71396  Action: Accept  Version: 0

Attachments

This solution has no attachment.