Document Audience: | INTERNAL |
Document ID: | I0833-1 |
Title: | When hotswapping a drive the status of the new drive may stay at "2D", or u1d9 port 1 may intermittently go to "bypass" state |
Copyright Notice: | Copyright © 2005 Sun Microsystems, Inc. All Rights Reserved |
Update Date: | 2004-01-07 |
---------------------------------------------------------------------
- Sun Proprietary/Confidential: Internal Use Only -
---------------------------------------------------------------------
FIELD INFORMATION NOTICE
(For Authorized Distribution by SunService)
FIN #: I0833-1
Synopsis: When hotswapping a drive the status of the new drive may stay at "2D", or u1d9 port 1 may intermittently go to "bypass" state
Create Date: Jun/27/02
Keywords:
When hotswapping a drive the status of the new drive may stay at "2D", or u1d9 port 1 may intermittently go to "bypass" state
SunAlert: No
Top FIN/FCO Report: No
Products Reference: Disks on T3/T3+ Storage Array
Product Category: Storage / Service
Product Affected:
Systems Affected:
-----------------
Mkt_ID Platform Model Description Serial Number
------ -------- ----- ----------- -------------
- ANYSYS ALL SYSTEM PLATFORM INDEPENDENT -
X-Options Affected:
-------------------
Mkt_ID Platform Model Description Serial Number
------ -------- ----- ----------- -------------
- T3 ALL StorEdge T3 Array -
- T3+ ALL StorEdge T3+ Array -
Parts Affected:
Part Number Description Model
----------- ----------- -----
- - -
References:
BugId: 4344979 - T3B: Disks configured as standbys go into bypass mode on
port 1.
4654238 - Drive hot swap results in state '2D' for new drive and
reconstruction fails.
URL: http://hes.west/nws/products/T3/tools/t3path_chk
Issue Description:
When performing a disk hot swap on a T3+ partner group, the new disk
fails to spin up properly and goes to state '2D' as shown in 'vol stat'
output. The T3+ must be reset to clear this condition and allow a
normal volume reconstruction to proceed.
Although these problems have been reported at only one customer site,
this FIN alerts the field that they may occur at other customer sites.
The probability of additional customers seeing these problems is remote,
but a workaround is provided for each issue.
Case 1: Drive is hotswapped and new disk fails to spin up and goes
to state '2D'.
---------------------------------------------------------------------------
State "2D" can be confirmed by reviewing the T3 error messages.
Refer to the following sample error messages:
Drive u1d7 fails:
Mar 15 17:04:29 purple31 LPCT[1]: N: u1d7: Bypassed on loop 1
Mar 15 17:04:30 purple31 LPCT[1]: N: u1d7: Bypassed on loop 2
Mar 15 17:04:32 purple31 ISR1[1]: N: u1d7 SVD_DONE: Command Error = 0x3
Mar 15 17:04:32 purple31 ISR1[1]: N: u1d7 sid 121054 stype 405 disk error 3
Mar 15 17:04:32 purple31 WXFT[1]: W: u1d7: Failed
Mar 15 17:04:32 purple31 WXFT[1]: W: u1d7 hard err in vol (v1) starting auto
disable
Mar 15 17:04:33 purple31 ISR1[1]: N: u1d7 sid 121054 stype 405 disk error 3
Mar 15 17:04:33 purple31 ISR1[1]: N: u1d7 sid 146326 stype 405 disk error 3
Mar 15 17:04:33 purple31 ISR1[1]: N: u1d7 sid 146783 stype 2023 disk error 3
After the new drive is inserted, it goes to state 2D and will not come online:
Mar 15 17:09:50 purple31 LPCT[1]: N: u1d7: Not bypassed on loop 1
Mar 15 17:09:51 purple31 LPCT[1]: N: u1d7: Not bypassed on loop 2
Mar 15 17:10:02 purple31 ISR1[1]: N: u1ctr ISP2200[1] Fatal timeout on u1d1
Mar 15 17:10:02 purple31 ISR1[1]: N: u1ctr ISP2200[1] QLCF_ABORT_ALL_CMDS:
Command Timeout Pre-Gauntlet Initiated
Mar 15 17:10:02 purple31 ISR1[1]: N: u1ctr ISP2200[1] Received LIP(f7,ef)
async event
Mar 15 17:10:04 purple31 SX01[1]: N: u1ctr sid 110913 copy XOR to BUF failed
Mar 15 17:09:35 purple31 ISR1[2]: N: u2ctr ISP2200[1] Received LIP(f7,ef)
async event
Mar 15 17:09:35 purple31 FCC2[2]: N: u2ctr <> on port 5, abort 0
Mar 15 17:10:15 purple31 MNXT[1]: N: u1d7 SVD_RW: device is unplugged
Mar 15 17:10:15 purple31 MNXT[1]: N: u1ctr Failed reading system area err=10
Mar 15 17:10:15 purple31 MNXT[1]: N: u1d7 could not read disk label
Mar 15 17:10:16 purple31 MNXT[1]: N: u1d7 SVD_RW: device is unplugged
Mar 15 17:10:15 purple31 MNXT[1]: N: u1ctr Failed reading system area err=10
Mar 15 17:10:15 purple31 MNXT[1]: N: u1d7 could not read disk label
Mar 15 17:10:16 purple31 MNXT[1]: N: u1d7 SVD_RW: device is unplugged
Mar 15 17:10:16 purple31 MNXT[1]: N: u1ctr Failed writing system area err=10
Mar 15 17:10:16 purple31 MNXT[1]: W: u1d7 could not create system area
Mar 15 17:10:16 purple31 MNXT[1]: N: u1ctr Internal Command error (Drive
sysarea create failed)
Mar 15 17:10:49 purple31 LT00[1]: W: u1d7 Installing U1D7 failed, Try
unplugging and then plugging
Mar 15 17:10:49 purple31 LT00[1]: W: u1d7 Disk Bypassed
Mar 15 17:10:50 purple31 LPCT[1]: N: u1d7: Bypassed on loop 1
Mar 15 17:10:21 purple31 ISR1[2]: W: u1d7 SVD_PATH_FAILOVER: path_id = 0
Mar 15 17:10:51 purple31 LPCT[1]: N: u1d7: Bypassed on loop 2
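The Case 1 signature above can be searched for in a saved copy of the T3
syslog. The following Python sketch is illustrative only: the message
fragments are copied from the sample log, but the function name and the
choice of which messages to match are assumptions, not an official Sun
diagnostic.

```python
import re

# Message fragments copied from the sample log in this FIN. Treating them as
# the Case 1 signature is an assumption made for illustration.
INSTALL_FAILED = re.compile(r"\b(u\d+d\d+) Installing \S+ failed")
SYSAREA_FAILED = re.compile(r"\b(u\d+d\d+) could not create system area")


def drives_with_failed_install(log_lines):
    """Return the sorted drive IDs whose hot-swap install appears to have failed."""
    drives = set()
    for line in log_lines:
        for pattern in (INSTALL_FAILED, SYSAREA_FAILED):
            match = pattern.search(line)
            if match:
                drives.add(match.group(1))
    return sorted(drives)


sample = [
    "Mar 15 17:09:50 purple31 LPCT[1]: N: u1d7: Not bypassed on loop 1",
    "Mar 15 17:10:16 purple31 MNXT[1]: W: u1d7 could not create system area",
    "Mar 15 17:10:49 purple31 LT00[1]: W: u1d7 Installing U1D7 failed, Try",
]
print(drives_with_failed_install(sample))
```

A drive reported by this check should still be confirmed with 'vol stat'
on the array itself.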
Case 2: u1d9 (u1d9 and u2d9) port 1 will intermittently go to "bypass"
state.
-----------------------------------------------------------------------
Port 1 of u1d9 (or of both u1d9 and u2d9 in a partner group) will
intermittently go to "bypass" state.
Error messages:
Jun 09 08:56:13 LPCT[1]: N: u1d9: Bypassed on loop 1
Jun 09 08:56:13 LPCT[1]: N: u1l1: Controller off the loop
Jun 09 08:56:13 LPCT[1]: N: u1ctr: ISP not ready on loop 1
Jun 09 08:56:13 LPCT[1]: N: u2d9: Bypassed on loop 1
Jun 09 08:56:13 LPCT[1]: N: u2l1: Controller off the loop
Jun 09 08:56:13 LPCT[1]: N: u2ctr: ISP not ready on loop 1
Jun 09 08:56:44 ISR1[1]: W: u1d9 SVD_PATH_FAILOVER: path_id = 0
Jun 09 11:41:28 ISR1[2]: W: u2d9 SVD_PATH_FAILOVER: path_id = 0
Jun 09 17:22:26 LPCT[1]: N: u1d9: Not bypassed on loop 1
Jun 09 17:22:26 LPCT[1]: N: u2d9: Not bypassed on loop 1
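To see whether a drive 9 is still bypassed on loop 1 at the end of a log
excerpt, the "Bypassed" and "Not bypassed" messages can be replayed in
order. The sketch below assumes the message wording shown in the sample
above; the function name is hypothetical.

```python
import re

# Matches the LPCT loop 1 messages for drive 9 of any unit, as in the sample.
BYPASS_EVENT = re.compile(r"\b(u\d+d9): (Not bypassed|Bypassed) on loop 1\b")


def d9_still_bypassed(log_lines):
    """Replay the log in order and return the d9 drives left bypassed on loop 1."""
    bypassed = {}
    for line in log_lines:
        match = BYPASS_EVENT.search(line)
        if match:
            # The latest message for a drive overrides any earlier one.
            bypassed[match.group(1)] = match.group(2) == "Bypassed"
    return sorted(drive for drive, state in bypassed.items() if state)


log = [
    "Jun 09 08:56:13 LPCT[1]: N: u1d9: Bypassed on loop 1",
    "Jun 09 08:56:13 LPCT[1]: N: u2d9: Bypassed on loop 1",
    "Jun 09 17:22:26 LPCT[1]: N: u1d9: Not bypassed on loop 1",
]
print(d9_still_bypassed(log))
```

In this shortened excerpt only u2d9 remains bypassed, because u1d9 has a
later "Not bypassed" message.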
Sample fru stat:
t3-1:/:<28>fru stat
CTLR STATUS STATE ROLE PARTNER TEMP
------ ------- ---------- ---------- ------- ----
u1ctr ready enabled master u2ctr 34.5
u2ctr ready enabled alt master u1ctr 34.5
DISK STATUS STATE ROLE PORT1 PORT2 TEMP VOLUME
------ ------- ---------- ---------- --------- --------- ---- ------
u1d1 ready enabled data disk ready ready 37 u1v1
u1d2 ready enabled data disk ready ready 36 u1v1
u1d3 ready enabled data disk ready ready 39 u1v1
u1d4 ready enabled data disk ready ready 40 u1v1
u1d5 ready enabled data disk ready ready 37 u1v1
u1d6 ready enabled data disk ready ready 37 u1v1
u1d7 ready enabled data disk ready ready 36 u1v1
u1d8 ready enabled data disk ready ready 42 u1v1
u1d9 ready enabled standby bypass ready 40 u1v1
u2d1 ready enabled data disk ready ready 35 u2v1
u2d2 ready enabled data disk ready ready 35 u2v1
u2d3 ready enabled data disk ready ready 37 u2v1
u2d4 ready enabled data disk ready ready 37 u2v1
u2d5 ready enabled data disk ready ready 41 u2v1
u2d6 ready enabled data disk ready ready 42 u2v1
u2d7 ready enabled data disk ready ready 40 u2v1
u2d8 ready enabled data disk ready ready 42 u2v1
u2d9 ready enabled standby bypass ready 35 u2v1
LOOP STATUS STATE MODE CABLE1 CABLE2 TEMP
------ ------- ---------- ------- --------- --------- ----
u2l1 ready enabled master installed - 31.0
u2l2 ready enabled slave installed - 33.0
u1l1 ready enabled master - installed 30.5
u1l2 ready enabled slave - installed 35.5
POWER STATUS STATE SOURCE OUTPUT BATTERY TEMP FAN1 FAN2
------ ------- --------- ------ ------ ------- ------ ------ ------
u1pcu1 ready enabled line normal normal normal normal normal
u1pcu2 ready enabled line normal normal normal normal normal
u2pcu1 ready enabled line normal normal normal normal normal
u2pcu2 ready enabled line normal normal normal normal normal
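The PORT1 column of the DISK section can also be checked from captured
'fru stat' output. The sketch below assumes the column layout shown in
the sample above and reads fields from the right, because the ROLE column
may contain one or two words ("standby" or "data disk"). The function
name is hypothetical.

```python
import re

# Disk rows start with an ID such as u1d9. Controller, loop and power rows
# (u1ctr, u1l1, u1pcu1) do not match this pattern and are skipped.
DISK_ID = re.compile(r"^u\d+d\d+$")


def port1_bypassed(fru_stat_text):
    """Return the disk IDs whose PORT1 column reads 'bypass'."""
    bypassed = []
    for line in fru_stat_text.splitlines():
        fields = line.split()
        if len(fields) < 8 or not DISK_ID.match(fields[0]):
            continue
        # Counting from the right: VOLUME, TEMP, PORT2, PORT1.
        if fields[-4] == "bypass":
            bypassed.append(fields[0])
    return bypassed


captured = """\
u1ctr ready enabled master u2ctr 34.5
u1d8 ready enabled data disk ready ready 42 u1v1
u1d9 ready enabled standby bypass ready 40 u1v1
u2d9 ready enabled standby bypass ready 35 u2v1
u1l1 ready enabled master - installed 30.5
"""
print(port1_bypassed(captured))
```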
It has been observed that port 1 of u1d9 (or of both u1d9 and u2d9 when
configured as a partner pair) will intermittently go to "bypass" state.
This is a T3 firmware bug that is still being investigated. It does not
cause data loss or loss of access to data; it is a nuisance only.
The workaround to get out of this state is provided below.
Implementation:
---
| | MANDATORY (Fully Proactive)
---
---
| | CONTROLLED PROACTIVE (per Sun Geo Plan)
---
---
| X | REACTIVE (As Required)
---
Corrective Action:
The following recommendation is provided as a guideline for authorized
Enterprise Services Field Representatives who may encounter the above
mentioned problem.
A workaround is provided for each potential problem.
Workaround: status of the replaced drive remains at "2D"
-------------------------------------------------------------
1. Obtain a replacement disk drive.
2. Execute 'vol stat' and note the drive state.
3. Execute 'fru stat' and note the drive state.
4. Remove the failed drive.
5. Wait 5 minutes, until the following system message is received:
"uXdX: Missing; system shutting down in 25 minutes".
6. Execute 'vol stat' and 'fru stat' noting the state of the removed
drive.
7. Insert the new drive and wait for it to spin up and install the system
area (~ 2 minutes).
8. Execute 'vol stat' and 'fru stat' noting the state of the new drive.
9. If the drive goes to state '0D', verify reconstruction starts using
'proc list'.
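The decision at the end of the procedure above can be summarized as a
small lookup. This Python sketch is a memory aid only: the two state
values and their follow-ups come from this FIN, while the function name
and the wording of the returned text are assumptions.

```python
def next_action(drive_state):
    """Map the state of the newly inserted drive to the follow-up in this FIN."""
    if drive_state == "0D":
        # Step 9 of the workaround: the drive came online normally.
        return "verify that reconstruction has started using 'proc list'"
    if drive_state == "2D":
        # Issue Description: the T3+ must be reset to clear this condition.
        return "drive did not come online; reset the T3+ to clear the condition"
    # Any other state is outside the scope of this FIN.
    return "state not covered by this FIN; investigate further"


for state in ("0D", "2D"):
    print(state, "->", next_action(state))
```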
Workaround: u1d9 (u1d9 and u2d9) port 1 will intermittently go to "bypass" state
--------------------------------------------------------------------------------
1. Execute "fru stat" and note drive 9 port 1 status.
2. If the status is "bypass", execute ".disk unbypass uXd9 path 0".
NOTE - For a partner pair, execute the above command on both u1d9
and u2d9.
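For a partner pair, the command strings for each affected drive can be
generated rather than typed by hand. The sketch below only builds the
text of the ".disk unbypass" command quoted in step 2; it does not
connect to the array, and the function name is hypothetical.

```python
import re

# This workaround applies to drive 9 only, so other IDs are rejected.
D9_ID = re.compile(r"^u\d+d9$")


def unbypass_commands(drives):
    """Build the '.disk unbypass' command text for each bypassed drive 9."""
    commands = []
    for drive in drives:
        if not D9_ID.match(drive):
            raise ValueError(f"not a drive 9 ID: {drive}")
        commands.append(f".disk unbypass {drive} path 0")
    return commands


for command in unbypass_commands(["u1d9", "u2d9"]):
    print(command)
```

Rejecting IDs other than drive 9 is a deliberate choice, so that the
generated commands stay within the scope of this workaround.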
Comments:
None
============================================================================
Implementation Footnote:
i) In case of MANDATORY FINs, Enterprise Services will attempt to
contact all affected customers to recommend implementation of
the FIN.
ii) For CONTROLLED PROACTIVE FINs, Enterprise Services mission critical
support teams will recommend implementation of the FIN (to their
respective accounts), at the convenience of the customer.
iii) For REACTIVE FINs, Enterprise Services will implement the FIN as the
need arises.
----------------------------------------------------------------------------
All released FINs and FCOs can be accessed using your favorite network
browser as follows:
SunWeb Access:
--------------
* Access the top level URL of http://sdpsweb.ebay/FIN_FCO/
* From there, select the appropriate link to query or browse the FIN and
FCO Homepage collections.
SunSolve Online Access:
-----------------------
* Access the SunSolve Online URL at http://sunsolve.Corp/
* From there, select the appropriate link to browse the FIN or FCO index.
Internet Access:
----------------
* Access the top level URL of https://infoserver.Sun.COM
--------------------------------------------------------------------------
General:
--------
* Send questions or comments to [email protected]
--------------------------------------------------------------------------