Document Audience:INTERNAL
Document ID:I0833-1
Title:When hotswapping a drive the status of the new drive may stay at "2D", or u1d9 port 1 may intermittently go to "bypass" state
Copyright Notice:Copyright © 2005 Sun Microsystems, Inc. All Rights Reserved
Update Date:2004-01-07

---------------------------------------------------------------------
- Sun Proprietary/Confidential: Internal Use Only -
---------------------------------------------------------------------  
                            FIELD INFORMATION NOTICE
                  (For Authorized Distribution by SunService)
FIN #: I0833-1
Synopsis: When hotswapping a drive the status of the new drive may stay at "2D", or u1d9 port 1 may intermittently go to "bypass" state
Create Date: Jun/27/02
Keywords: 


SunAlert: No
Top FIN/FCO Report: No
Products Reference: Disks on T3/T3+ Storage Array
Product Category: Storage / Service
Product Affected: 
Systems Affected:
-----------------  
Mkt_ID    Platform   Model   Description                     Serial Number
------    --------   -----   -----------                     -------------
  -        ANYSYS     ALL    SYSTEM PLATFORM INDEPENDENT           -


X-Options Affected:
-------------------
Mkt_ID    Platform   Model   Description                     Serial Number
------    --------   -----   -----------                     -------------
   -        T3        ALL    StorEdge T3 Array                     -
   -        T3+       ALL    StorEdge T3+ Array                    -
Parts Affected: 
Part Number   Description   Model
-----------   -----------   -----
     -             -          -
References: 
BugId: 4344979 - T3B: Disks configured as standbys go into bypass mode on 
                 port 1.
       4654238 - Drive hot swap results in state '2D' for new drive and 
                 reconstruction fails.

URL:   http://hes.west/nws/products/T3/tools/t3path_chk
Issue Description: 
When performing a disk hot swap on a T3+ partner group, the new disk
fails to spin up properly and goes to state '2D' as shown in 'vol stat'
output.  The T3+ must be reset to clear this condition and allow a
normal volume reconstruction to proceed.

Although these problems have been reported at only one customer site,
this FIN alerts the field that they may occur at other customer sites.
The probability of additional customers seeing these problems is
remote, but a workaround is provided for each issue.

Case 1:  Drive is hotswapped and new disk fails to spin up and goes
         to state '2D'.
--------------------------------------------------------------------------- 

State "2D" can be confirmed by reviewing the T3 error messages.

Refer to the following sample error messages:

  Drive u1d7 fails:

  Mar 15 17:04:29 purple31 LPCT[1]: N: u1d7: Bypassed on loop 1
  Mar 15 17:04:30 purple31 LPCT[1]: N: u1d7: Bypassed on loop 2
  Mar 15 17:04:32 purple31 ISR1[1]: N: u1d7 SVD_DONE: Command Error = 0x3
  Mar 15 17:04:32 purple31 ISR1[1]: N: u1d7 sid 121054 stype 405 disk error 3
  Mar 15 17:04:32 purple31 WXFT[1]: W: u1d7: Failed
  Mar 15 17:04:32 purple31 WXFT[1]: W: u1d7 hard err in vol (v1) starting auto 
         disable
  Mar 15 17:04:33 purple31 ISR1[1]: N: u1d7 sid 121054 stype 405 disk error 3
  Mar 15 17:04:33 purple31 ISR1[1]: N: u1d7 sid 146326 stype 405 disk error 3
  Mar 15 17:04:33 purple31 ISR1[1]: N: u1d7 sid 146783 stype 2023 disk error 3

After the new drive is inserted, it goes to state 2D and will not come online:

  Mar 15 17:09:50 purple31 LPCT[1]: N: u1d7: Not bypassed on loop 1
  Mar 15 17:09:51 purple31 LPCT[1]: N: u1d7: Not bypassed on loop 2
  Mar 15 17:10:02 purple31 ISR1[1]: N: u1ctr ISP2200[1] Fatal timeout on u1d1
  Mar 15 17:10:02 purple31 ISR1[1]: N: u1ctr ISP2200[1] QLCF_ABORT_ALL_CMDS: 
         Command Timeout Pre-Gauntlet Initiated
  Mar 15 17:10:02 purple31 ISR1[1]: N: u1ctr ISP2200[1] Received LIP(f7,ef) 
         async event
  Mar 15 17:10:04 purple31 SX01[1]: N: u1ctr sid 110913 copy XOR to BUF failed
  Mar 15 17:09:35 purple31 ISR1[2]: N: u2ctr ISP2200[1] Received LIP(f7,ef) 
         async event
  Mar 15 17:09:35 purple31 FCC2[2]: N: u2ctr <> on port 5, abort 0
  Mar 15 17:10:15 purple31 MNXT[1]: N: u1d7 SVD_RW: device is unplugged
  Mar 15 17:10:15 purple31 MNXT[1]: N: u1ctr Failed reading system area err=10
  Mar 15 17:10:15 purple31 MNXT[1]: N: u1d7 could not read disk label
  Mar 15 17:10:16 purple31 MNXT[1]: N: u1d7 SVD_RW: device is unplugged
  Mar 15 17:10:15 purple31 MNXT[1]: N: u1ctr Failed reading system area err=10
  Mar 15 17:10:15 purple31 MNXT[1]: N: u1d7 could not read disk label
  Mar 15 17:10:16 purple31 MNXT[1]: N: u1d7 SVD_RW: device is unplugged
  Mar 15 17:10:16 purple31 MNXT[1]: N: u1ctr Failed writing system area err=10
  Mar 15 17:10:16 purple31 MNXT[1]: W: u1d7 could not create system area
  Mar 15 17:10:16 purple31 MNXT[1]: N: u1ctr Internal Command error (Drive  
         sysarea create failed)
  Mar 15 17:10:49 purple31 LT00[1]: W: u1d7 Installing U1D7 failed, Try 
         unplugging and then plugging
  Mar 15 17:10:49 purple31 LT00[1]: W: u1d7 Disk Bypassed
  Mar 15 17:10:50 purple31 LPCT[1]: N: u1d7: Bypassed on loop 1
  Mar 15 17:10:21 purple31 ISR1[2]: W: u1d7 SVD_PATH_FAILOVER: path_id = 0
  Mar 15 17:10:51 purple31 LPCT[1]: N: u1d7: Bypassed on loop 2
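
The failed-installation messages in the sample above can be picked out of
a saved copy of the T3 syslog.  The following is a minimal Python sketch,
not a supported tool.  It assumes the log has been saved as plain text
with wrapped lines rejoined, and it treats the two message fragments
shown in this FIN as the signature.  The function name is hypothetical.

```python
import re

# Message fragments taken from the sample log in this FIN; treating them
# as the Case 1 signature is an assumption made for this sketch.
SIGNATURE_PATTERNS = [
    re.compile(r"\b(u\d+d\d+) could not create system area"),
    re.compile(r"\b(u\d+d\d+) Installing \S+ failed"),
]


def drives_with_failed_install(log_lines):
    """Return the sorted drive IDs that logged a failed installation."""
    drives = set()
    for line in log_lines:
        for pattern in SIGNATURE_PATTERNS:
            match = pattern.search(line)
            if match:
                drives.add(match.group(1))
    return sorted(drives)


sample = [
    "Mar 15 17:09:50 purple31 LPCT[1]: N: u1d7: Not bypassed on loop 1",
    "Mar 15 17:10:16 purple31 MNXT[1]: W: u1d7 could not create system area",
    "Mar 15 17:10:49 purple31 LT00[1]: W: u1d7 Installing U1D7 failed, Try unplugging",
]
print(drives_with_failed_install(sample))
```

A drive reported by this check should still be confirmed with 'vol stat'
before any action is taken.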


Case 2:  u1d9 (u1d9 and u2d9) port 1 will intermittently go to "bypass" 
         state.
-----------------------------------------------------------------------


Error messages:

  Jun 09 08:56:13 LPCT[1]: N: u1d9: Bypassed on loop 1
  Jun 09 08:56:13 LPCT[1]: N: u1l1: Controller off the loop
  Jun 09 08:56:13 LPCT[1]: N: u1ctr: ISP not ready on loop 1
  Jun 09 08:56:13 LPCT[1]: N: u2d9: Bypassed on loop 1
  Jun 09 08:56:13 LPCT[1]: N: u2l1: Controller off the loop
  Jun 09 08:56:13 LPCT[1]: N: u2ctr: ISP not ready on loop 1
  Jun 09 08:56:44 ISR1[1]: W: u1d9 SVD_PATH_FAILOVER: path_id = 0
  Jun 09 11:41:28 ISR1[2]: W: u2d9 SVD_PATH_FAILOVER: path_id = 0
  Jun 09 17:22:26 LPCT[1]: N: u1d9: Not bypassed on loop 1
  Jun 09 17:22:26 LPCT[1]: N: u2d9: Not bypassed on loop 1
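
To see how long a drive stayed bypassed, the "Bypassed" and "Not
bypassed" messages can be paired up.  The following Python sketch
assumes the message layout shown in the sample above.  Because the log
carries no year, a placeholder year is supplied and all events are
assumed to fall within the same year.  The function name is
hypothetical.

```python
import re
from datetime import datetime

EVENT = re.compile(
    r"^\s*(\w{3} \d{2} \d{2}:\d{2}:\d{2}) .*?"
    r"(u\d+d\d+): (Not bypassed|Bypassed) on loop (\d)"
)


def bypass_intervals(log_lines):
    """Pair each 'Bypassed' event with the next 'Not bypassed' event.

    Returns a list of (drive, loop, timedelta) tuples.
    """
    open_events = {}
    intervals = []
    for line in log_lines:
        match = EVENT.search(line)
        if not match:
            continue
        stamp, drive, state, loop = match.groups()
        # Placeholder year 2000: the log lines do not record a year.
        when = datetime.strptime("2000 " + stamp, "%Y %b %d %H:%M:%S")
        key = (drive, int(loop))
        if state == "Bypassed":
            open_events.setdefault(key, when)
        elif key in open_events:
            intervals.append((drive, int(loop), when - open_events.pop(key)))
    return intervals


sample = [
    "Jun 09 08:56:13 LPCT[1]: N: u1d9: Bypassed on loop 1",
    "Jun 09 08:56:13 LPCT[1]: N: u1l1: Controller off the loop",
    "Jun 09 08:56:13 LPCT[1]: N: u2d9: Bypassed on loop 1",
    "Jun 09 17:22:26 LPCT[1]: N: u1d9: Not bypassed on loop 1",
    "Jun 09 17:22:26 LPCT[1]: N: u2d9: Not bypassed on loop 1",
]
for drive, loop, duration in bypass_intervals(sample):
    print(drive, "loop", loop, "bypassed for", duration)
```

With the sample messages, both u1d9 and u2d9 are reported as bypassed on
loop 1 for 8:26:13.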

Sample fru stat:

  t3-1:/:<28>fru stat 

  CTLR    STATUS   STATE       ROLE        PARTNER    TEMP
  ------  -------  ----------  ----------  -------    ----
  u1ctr   ready    enabled     master      u2ctr      34.5
  u2ctr   ready    enabled     alt master  u1ctr      34.5

  DISK    STATUS   STATE       ROLE        PORT1      PORT2      TEMP  VOLUME
  ------  -------  ----------  ----------  ---------  ---------  ----  ------
  u1d1    ready    enabled     data disk   ready      ready      37    u1v1
  u1d2    ready    enabled     data disk   ready      ready      36    u1v1
  u1d3    ready    enabled     data disk   ready      ready      39    u1v1
  u1d4    ready    enabled     data disk   ready      ready      40    u1v1
  u1d5    ready    enabled     data disk   ready      ready      37    u1v1
  u1d6    ready    enabled     data disk   ready      ready      37    u1v1
  u1d7    ready    enabled     data disk   ready      ready      36    u1v1
  u1d8    ready    enabled     data disk   ready      ready      42    u1v1
  u1d9    ready    enabled     standby     bypass     ready      40    u1v1
  u2d1    ready    enabled     data disk   ready      ready      35    u2v1
  u2d2    ready    enabled     data disk   ready      ready      35    u2v1
  u2d3    ready    enabled     data disk   ready      ready      37    u2v1
  u2d4    ready    enabled     data disk   ready      ready      37    u2v1
  u2d5    ready    enabled     data disk   ready      ready      41    u2v1
  u2d6    ready    enabled     data disk   ready      ready      42    u2v1
  u2d7    ready    enabled     data disk   ready      ready      40    u2v1
  u2d8    ready    enabled     data disk   ready      ready      42    u2v1
  u2d9    ready    enabled     standby     bypass     ready      35    u2v1

  LOOP    STATUS   STATE       MODE        CABLE1     CABLE2     TEMP
  ------  -------  ----------  -------     ---------  ---------  ----
  u2l1    ready    enabled     master      installed  -          31.0
  u2l2    ready    enabled     slave       installed  -          33.0
  u1l1    ready    enabled     master      -          installed  30.5
  u1l2    ready    enabled     slave       -          installed  35.5

  POWER   STATUS   STATE       SOURCE  OUTPUT  BATTERY  TEMP    FAN1   FAN2
  ------  -------  ---------   ------  ------  -------  ------  ------ ------
  u1pcu1  ready    enabled     line    normal  normal   normal  normal normal 
  u1pcu2  ready    enabled     line    normal  normal   normal  normal normal 
  u2pcu1  ready    enabled     line    normal  normal   normal  normal normal 
  u2pcu2  ready    enabled     line    normal  normal   normal  normal normal
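
The PORT1 and PORT2 columns of the DISK table can also be checked from a
captured copy of the 'fru stat' output.  The following Python sketch
assumes the column layout shown in the sample above.  Because the ROLE
value may be one or two words ("standby", "data disk"), the port columns
are read from the right-hand side of each row.  The function name is
hypothetical.

```python
def bypassed_ports(fru_stat_text):
    """Return (disk, port) pairs whose PORT1 or PORT2 column reads 'bypass'."""
    found = []
    in_disk_table = False
    for line in fru_stat_text.splitlines():
        fields = line.split()
        if not fields:
            in_disk_table = False      # a blank line ends the current table
            continue
        if fields[0] == "DISK":
            in_disk_table = True       # header row of the DISK table
            continue
        # Skip rows of other tables, the dashed rule, and short rows.
        if not in_disk_table or set(fields[0]) == {"-"} or len(fields) < 8:
            continue
        # Assumed row layout: DISK STATUS STATE ROLE... PORT1 PORT2 TEMP VOLUME
        disk = fields[0]
        for port_number, status in ((1, fields[-4]), (2, fields[-3])):
            if status == "bypass":
                found.append((disk, port_number))
    return found


sample = """\
DISK    STATUS   STATE       ROLE        PORT1      PORT2      TEMP  VOLUME
------  -------  ----------  ----------  ---------  ---------  ----  ------
u1d8    ready    enabled     data disk   ready      ready      42    u1v1
u1d9    ready    enabled     standby     bypass     ready      40    u1v1
u2d9    ready    enabled     standby     bypass     ready      35    u2v1

LOOP    STATUS   STATE       MODE        CABLE1     CABLE2     TEMP
u2l1    ready    enabled     master      installed  -          31.0
"""
print(bypassed_ports(sample))
```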

It has been observed that port 1 of u1d9 (u1d9 and u2d9 when configured
as a partner pair) will intermittently go to "bypass" state.  This is a
T3 firmware bug that is still being investigated.  It does not cause
any data loss or loss of access to data, but is more of an irritation.

The workaround to get out of this state is provided below.
Implementation: 
         ---
        |   |   MANDATORY (Fully Proactive)
         ---    
         
  
         ---
        |   |   CONTROLLED PROACTIVE (per Sun Geo Plan) 
         --- 
         
                                
         ---
        | X |   REACTIVE (As Required)
         ---
Corrective Action: 
The following recommendation is provided as a guideline for authorized
Enterprise Services Field Representatives who may encounter the above
mentioned problem.

A workaround is provided for each potential problem.  

Workaround: status of the replaced drive remains at "2D"
-------------------------------------------------------------	      
	      
  1. Obtain a replacement disk drive.

  2. Execute 'vol stat' and note the drive state.

  3. Execute 'fru stat' and note the drive state.

  4. Remove the failed drive.

  5. Wait 5 minutes, until the following system message is received:
     "uXdX:  Missing; system shutting down in 25 minutes".

  6. Execute 'vol stat' and 'fru stat' noting the state of the removed 
     drive.

  7. Insert the new drive and wait for it to spin up and install the system 
     area (~ 2 minutes).

  8. Execute 'vol stat' and 'fru stat' noting the state of the new drive.

  9. If the drive goes to state '0D', use 'proc list' to verify that 
     reconstruction starts.
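
The wait in step 5 depends on the "Missing" system message.  The
following Python sketch shows one way to look for that message in
captured log lines.  It assumes only the message text quoted in step 5;
the log prefix around that text is not assumed, and the function name is
hypothetical.

```python
import re

# Message text as quoted in step 5 of the workaround.
MISSING = re.compile(
    r"\b(u\d+d\d+):\s+Missing; system shutting down in (\d+) minutes"
)


def shutdown_countdown(log_lines, drive):
    """Return the minutes-to-shutdown from the latest 'Missing' message
    for the given drive, or None if no such message was logged."""
    minutes = None
    for line in log_lines:
        match = MISSING.search(line)
        if match and match.group(1) == drive:
            minutes = int(match.group(2))
    return minutes


# Illustrative line built from the quoted message, with u1d7 as the drive.
sample = ["u1d7:  Missing; system shutting down in 25 minutes"]
print(shutdown_countdown(sample, "u1d7"))
```

A result of None means the message has not yet been logged for that
drive, so the new drive should not be inserted yet.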


Workaround: u1d9 (u1d9 and u2d9) port 1 will intermittently go to "bypass" state
--------------------------------------------------------------------------------

  1. Execute "fru stat" and note drive 9 port 1 status.

  2. If the status is "bypass", execute ".disk unbypass uXd9 path 0".

	NOTE - For a partner pair, execute the above command on both u1d9
	       and u2d9.
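
For a partner pair, the command in step 2 is repeated once per unit.
The following Python sketch only builds the command strings from the
template given in step 2; it does not connect to the array.  The
function name is hypothetical.

```python
def unbypass_commands(units):
    """Build the '.disk unbypass' command for drive 9 of each unit number."""
    return [f".disk unbypass u{unit}d9 path 0" for unit in units]


# Partner pair: units 1 and 2.
for command in unbypass_commands([1, 2]):
    print(command)
```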
Comments: 
None

============================================================================
Implementation Footnote: 
i)   In case of MANDATORY FINs, Enterprise Services will attempt to    
     contact all affected customers to recommend implementation of 
     the FIN. 
   
ii)  For CONTROLLED PROACTIVE FINs, Enterprise Services mission critical    
     support teams will recommend implementation of the FIN  (to their  
     respective accounts), at the convenience of the customer. 

iii) For REACTIVE FINs, Enterprise Services will implement the FIN as the   
     need arises.
----------------------------------------------------------------------------
All released FINs and FCOs can be accessed using your favorite network 
browser as follows:
 
SunWeb Access:
-------------- 
* Access the top level URL of http://sdpsweb.ebay/FIN_FCO/

* From there, select the appropriate link to query or browse the FIN and
  FCO Homepage collections.
 
SunSolve Online Access:
-----------------------
* Access the SunSolve Online URL at http://sunsolve.Corp/

* From there, select the appropriate link to browse the FIN or FCO index.

Internet Access:
----------------
* Access the top level URL of https://infoserver.Sun.COM
--------------------------------------------------------------------------
General:
--------
* Send questions or comments to [email protected]
--------------------------------------------------------------------------
Status: active