Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Solution Type: Problem Resolution Sure

Solution 1007904.1: Sun StorEdge[TM] 6120: all disks in several trays disabled after power cycle

PreviouslyPublishedAs 210908

Symptoms

In a Sun StorEdge[TM] 6120 2x4 configuration (2 controllers, 4 trays), the data disks in u2, u3 and u4 became "ready disabled" after a power cycle:

    CTLR    STATUS   STATE       ROLE        PARTNER  TEMP
    ------  -------  ----------  ----------  -------  ----
    u1ctr   ready    enabled     master      u3ctr    29
    u2ctr   missing
    u3ctr   ready    enabled     alt master  u1ctr    25
    u4ctr   missing

    DISK    STATUS   STATE       ROLE        PORT1      PORT2      TEMP  VOLUME
    ------  -------  ----------  ----------  ---------  ---------  ----  ------
    u1d01   ready    enabled     data disk   ready      ready      22    v1
    u1d02   ready    enabled     data disk   ready      ready      22    v1
    u1d03   ready    enabled     data disk   ready      ready      22    v1
    u1d04   ready    enabled     data disk   ready      ready      24    v1
    u1d05   ready    enabled     data disk   ready      ready      22    v1
    u1d06   ready    enabled     data disk   ready      ready      25    v1
    u1d07   ready    enabled     data disk   ready      ready      23    v1
    u1d08   ready    enabled     data disk   ready      ready      22    v1
    u1d09   ready    enabled     data disk   ready      ready      22    v1
    u1d10   ready    enabled     data disk   ready      ready      21    v1
    u1d11   ready    enabled     data disk   ready      ready      22    v1
    u1d12   ready    enabled     data disk   ready      ready      23    v1
    u1d13   ready    enabled     data disk   ready      ready      22    v1
    u1d14   ready    enabled     standby     ready      ready      21    v1
    u2d01   ready    disabled    data disk   ready      ready      22    v2
    u2d02   ready    disabled    data disk   ready      ready      21    v2
    u2d03   ready    disabled    data disk   ready      ready      23    v2
    u2d04   ready    disabled    data disk   ready      ready      21    v2
    u2d05   ready    disabled    data disk   ready      ready      22    v2
    u2d06   ready    disabled    data disk   ready      ready      22    v2
    u2d07   ready    disabled    data disk   ready      ready      23    v2
    u2d08   ready    disabled    data disk   ready      ready      21    v2
    u2d09   ready    disabled    data disk   ready      ready      22    v2
    u2d10   ready    disabled    data disk   ready      ready      22    v2
    u2d11   ready    disabled    data disk   ready      ready      22    v2
    u2d12   ready    disabled    data disk   ready      ready      22    v2
    u2d13   ready    disabled    data disk   ready      ready      22    v2
    u2d14   ready    enabled     standby     ready      ready      21    v2
    u3d01   ready    disabled    data disk   ready      ready      21    v3
    u3d02   ready    disabled    data disk   ready      ready      22    v3
    u3d03   ready    disabled    data disk   ready      ready      21    v3
    u3d04   ready    disabled    data disk   ready      ready      22    v3
    u3d05   ready    disabled    data disk   ready      ready      22    v3
    u3d06   ready    disabled    data disk   ready      ready      22    v3
    u3d07   ready    disabled    data disk   ready      ready      22    v3
    u3d08   ready    disabled    data disk   ready      ready      21    v3
    u3d09   ready    disabled    data disk   ready      ready      21    v3
    u3d10   ready    disabled    data disk   ready      ready      21    v3
    u3d11   ready    disabled    data disk   ready      ready      21    v3
    u3d12   ready    disabled    data disk   ready      ready      23    v3
    u3d13   ready    disabled    data disk   ready      ready      22    v3
    u3d14   ready    enabled     standby     ready      ready      22    v3
    u4d01   ready    disabled    data disk   ready      ready      21    v4
    u4d02   ready    disabled    data disk   ready      ready      22    v4
    u4d03   ready    disabled    data disk   ready      ready      23    v4
    u4d04   ready    disabled    data disk   ready      ready      21    v4
    u4d05   ready    disabled    data disk   ready      ready      23    v4
    u4d06   ready    disabled    data disk   ready      ready      23    v4
    u4d07   ready    disabled    data disk   ready      ready      22    v4
    u4d08   ready    disabled    data disk   ready      ready      21    v4
    u4d09   ready    disabled    data disk   ready      ready      23    v4
    u4d10   ready    disabled    data disk   ready      ready      22    v4
    u4d11   ready    disabled    data disk   ready      ready      22    v4
    u4d12   ready    disabled    data disk   ready      ready      23    v4
    u4d13   ready    disabled    data disk   ready      ready      22    v4
    u4d14   ready    enabled     standby     ready      ready      21    v4

Resolution

The syslog messages of the 6120 showed that no user command, in particular no "shutdown" command, was entered around the time the 6120 was apparently "powered off". Instead, the master controller recorded that u2 was abruptly powered off:

    Mmm dd hh:mm:ss se6120 LPCT[1]: W: u2pcu1: Switch off
    Mmm dd hh:mm:ss se6120 LPCT[1]: N: u2ctr: set PS margin to off due to Fan 1 switch off
    Mmm dd hh:mm:ss se6120 LPCT[1]: W: u2pcu2: Switch off
    Mmm dd hh:mm:ss se6120 LPCT[1]: N: u2ctr: set PS margin to off due to Fan 2 switch off

The master controller u1 was therefore still running when u2 was powered down, which is why the messages could be logged. Because the controller was still running, it started disabling the components that were affected by the power off.

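The syslog check described above can be sketched in a few lines of Python. This is only an illustration, not a Sun tool: the function name `trays_powered_off`, the regular expression, and the assumption that a tray counts as powered off once two distinct power/cooling units (pcu1 and pcu2, as in the excerpt) have logged "Switch off" are all hypothetical choices made for this example.

```python
import re
from collections import defaultdict

# Matches the warning form shown in the excerpt, e.g.
# "LPCT[1]: W: u2pcu1: Switch off". The pattern is an assumption
# derived from those sample lines only.
PCU_OFF = re.compile(r"LPCT\[\d+\]: W: (u\d+)pcu(\d+): Switch off")


def trays_powered_off(lines, pcus_per_tray=2):
    """Return the trays for which at least `pcus_per_tray` distinct
    PCUs logged a 'Switch off' warning (assumed to mean the whole
    tray lost power)."""
    seen = defaultdict(set)
    for line in lines:
        match = PCU_OFF.search(line)
        if match:
            seen[match.group(1)].add(int(match.group(2)))
    return sorted(t for t, pcus in seen.items() if len(pcus) >= pcus_per_tray)


sample = [
    "Mmm dd hh:mm:ss se6120 LPCT[1]: W: u2pcu1: Switch off",
    "Mmm dd hh:mm:ss se6120 LPCT[1]: N: u2ctr: set PS margin to off due to Fan 1 switch off",
    "Mmm dd hh:mm:ss se6120 LPCT[1]: W: u2pcu2: Switch off",
    "Mmm dd hh:mm:ss se6120 LPCT[1]: N: u2ctr: set PS margin to off due to Fan 2 switch off",
]

print(trays_powered_off(sample))  # prints ['u2']
```

Run against a saved copy of the array syslog, a non-empty result with no matching user command in the same time window would point to an abrupt power-off rather than an orderly shutdown.
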
In a 2x4 configuration, the 4 trays are daisy-chained on the 2 back-end loops, so powering off u2 disconnects u2, u3 and u4 from u1. Since the master controller has no means to tell whether the data in those trays is good, the disabled components remain disabled across a reset of the 6120.

After power up, if there is confidence that the data within the volume is intact, it is possible to re-create the volume without initializing the data area, in effect recovering the volume as it was, without the need to restore from backup. This procedure should be carried out only by Sun service personnel.

If all host I/O was quiesced prior to the power-down event, it is likely that data integrity was maintained, since any data cached by the 6120 controller would still have been flushed with the built-in battery supplying the power.

If it is not certain that host I/O to the 6120 had been quiesced, the data should be restored from other sources. If host-based mirroring was used, simply re-create the volume as it was originally created, including volslices and LUN permissions if used, and let the mirror re-sync. Otherwise, restoring from the most recent backup tape is advisable.

Product

Sun StorageTek 6120 Array

Internal Comments

The following is strictly for the use of Sun employees:
Service Request ID: 10812614

For the procedure to re-create the vol, volslice and LUN permissions, several existing documents give the detailed commands, but they cover varying situations and both the T3+ (T3B) and the 6120 (T4). The T3+ and the 6120 have fundamentally the same architecture and share the same firmware, and the commands for volume creation are the same.

Essentially, to get multiple disks in the same volume out of the disabled state, the volume has to be re-created without destroying the data. The key command is therefore ".vol init ?? fast", whose arguments depend on how the customer created the original volume, i.e. which disks are data disks and which is the standby. Other commands may then be needed to return access to normal, depending on whether the volume was volslice'd, whether there were LUN permissions, and so on.

The following reference provides background information that can be adapted to recover your specific situation:

80212 Recovery volumes after disks "Fault Disabled"

6120, multiple disks failure, ready disabled, .vol init fast, power cycle

Previously Published As 84237

Change History

Date: 2006-03-07
User Name: 7058
Action: Approved
Comment: Trademarked where appropriate. No duplicates found. Minor grammar fixes. Spell ck OK. Tags OK. Adjusted review date.

Attachments

This solution has no attachment