Document Audience: INTERNAL
Document ID: I0727-1
Title: Recovering A1000/A3x00 controller C numbers after a device path changed due to reboot -r
Copyright Notice: Copyright © 2005 Sun Microsystems, Inc. All Rights Reserved
Update Date: 2001-10-29
---------------------------------------------------------------------
- Sun Proprietary/Confidential: Internal Use Only -
---------------------------------------------------------------------
FIELD INFORMATION NOTICE
(For Authorized Distribution by SunService)
FIN #: I0727-1
Synopsis: Recovering A1000/A3x00 controller C numbers after a device path changed due to reboot -r
Create Date: Oct/17/01
Keywords:
Recovering A1000/A3x00 controller C numbers after a device path changed due to reboot -r
SunAlert: No
Top FIN/FCO Report: No
Products Reference: A1000/A3x00 Controllers
Product Category: Storage / Service
Product Affected:
Systems Affected
----------------
Mkt_ID            Platform  Model  Description                     Serial Number
------            --------  -----  -----------                     -------------
-                 ANYSYS    -      System Platform Independent     -
X-Options Affected
------------------
Mkt_ID            Platform  Model  Description                     Serial Number
------            --------  -----  -----------                     -------------
SG-XARY1*         A1000     -      STOREDGE A1000/RACK             -
SG-XARY3*         A3500     -      STOREDGE A3500/RACK             -
UG/CU-A3500FC*    A3500FC   -      ASSY,TOP OPT,1X5X9,MAX,9GB,10K  -
UG-A3K-A3500FC    -         -      ASSY,UPGRADE,A3500FC/TABASCO    -
UG-A3500-A3500FC  -         -      ASSY,UPGRADE,A3500FC/DILBERT    -
X6538A            -         -      X-OPT,A3500FC CONTROLLER        -
6538A             -         -      FCTY, CONTROLLER, A3500FC       -
X2611A            -         -      OPT INT I/O BD FOR EXX00        -
X2612A            -         -      OPT INT I/O BD EXX00 W/FC-AL    -
X2622A            -         -      OPT INT GRAPHICS I/O BD EXX00   -
Parts Affected:
Part Number  Description                       Model
-----------  -----------                       -----
704-6708-10  CD SUN STOREDGE RAID Manager6.22  -
704-7937-05  CD RM 6.22.1                      -
References:
URL: http://www.sun.com/storage/disk-drives/raid.html
Issue Description:
Solaris device names for StorEdge A1000/A3x00 controllers may change
when a reconfiguration reboot ('reboot -r') is performed: the controller
"C" numbers can have different values after the reboot. When the
original A1000/A3x00 C numbers are lost, volume managers such as VxVM
are not able to find the LUNs. The key impact is potential loss of
access to data.
The change affects the array device links in /dev/dsk and /dev/rdsk.
Every place where the old controller numbers are recorded must then be
updated; for example, the device entries for mount points in
/etc/vfstab. However, there is no record of the changes: if a device
was c2t5 and there are many controllers on the host, there is no way to
tell which controller c2t5 was.
Some of the failure indications are:
Can't mount /dev/dsk/c2t5d0s2 when booting the host
Error messages from VxVM about not being able to find volume group.
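As a quick check for the first symptom, the following is a minimal
sketch (not part of the original FIN) that lists /etc/vfstab entries
whose /dev/dsk link no longer resolves. The function name and the
optional path prefix are hypothetical and exist only for illustration;
the sketch assumes the standard vfstab layout, in which the first field
is the device to mount.

```shell
#!/bin/sh
# list_missing_vfstab_devices VFSTAB [PREFIX]
# Hypothetical helper: prints "missing: <device>" for every /dev/dsk
# entry in VFSTAB whose device link does not resolve. PREFIX is
# prepended to each device path and is empty on a live system.
list_missing_vfstab_devices() {
    while read -r dev rest; do
        case $dev in
            /dev/dsk/*)
                # -e follows symbolic links, so a dangling link
                # is reported as missing as well.
                [ -e "$2$dev" ] || echo "missing: $dev"
                ;;
        esac
    done < "$1"
}

# Example (live system): list_missing_vfstab_devices /etc/vfstab
```

Comment lines and non-disk entries are skipped because they do not
start with /dev/dsk/.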
Prior to a boot -r, the output of format and 'lad' (list array
devices) shows something like the following:
AVAILABLE DISK SELECTIONS:
0. c0t0d0
/sbus@3,0/SUNW,fas@3,8800000/sd@0,0
1. c0t1d0
/sbus@3,0/SUNW,fas@3,8800000/sd@1,0
2. c1t5d0
/pseudo/rdnexus@1/rdriver@5,0
3. c1t5d1
/pseudo/rdnexus@1/rdriver@5,1
4. c1t5d2
/pseudo/rdnexus@1/rdriver@5,2
5. c1t5d3
/pseudo/rdnexus@1/rdriver@5,3
6. c1t5d4
/pseudo/rdnexus@2/rdriver@5,4
7. c2t4d0
/pseudo/rdnexus@2/rdriver@4,0
8. c2t4d1
/pseudo/rdnexus@2/rdriver@4,1
9. c2t4d2
/pseudo/rdnexus@2/rdriver@4,2
10. c2t4d3
/pseudo/rdnexus@2/rdriver@4,3
'lad' is a program that lists the names of all RAID devices connected
to the system on stdout.
/usr/lib/osa/bin/lad shows:
c1t5d0 1T74750854 LUNS: 0 1 2 3 4
c2t4d0 1T71525434 LUNS: 0 1 2 3
|____| |________| |_____________|
   |       |             |
   |       |             +-- These fields are a list of logical units
   |       |                 (LUNs) currently owned by the controller.
   |       +-- This field is an internal name that uniquely identifies
   |           the controller.
   +-- This field is the device name of a particular RAID controller.
NOTE: The original controller numbers for the rdac modules are 1 and 2.
After issuing boot -r, format and lad appear as follows:
AVAILABLE DISK SELECTIONS:
0. c0t0d0
/sbus@3,0/SUNW,fas@3,8800000/sd@0,0
1. c0t1d0
/sbus@3,0/SUNW,fas@3,8800000/sd@1,0
2. c69t5d0
/pseudo/rdnexus@1/rdriver@5,0
3. c69t5d1
/pseudo/rdnexus@1/rdriver@5,1
4. c69t5d2
/pseudo/rdnexus@1/rdriver@5,2
5. c69t5d3
/pseudo/rdnexus@1/rdriver@5,3
6. c69t5d4
/pseudo/rdnexus@2/rdriver@5,4
7. c71t4d0
/pseudo/rdnexus@2/rdriver@4,0
8. c71t4d1
/pseudo/rdnexus@2/rdriver@4,1
9. c71t4d2
/pseudo/rdnexus@2/rdriver@4,2
10. c71t4d3
/pseudo/rdnexus@2/rdriver@4,3
/usr/lib/osa/bin/lad shows:
c69t5d0 1T74750854 LUNS: 0 1 2 3 4
c71t4d0 1T71525434 LUNS: 0 1 2 3
NOTE: format and lad are in sync, but the c#'s have changed to 69
and 71.
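Because the second lad field uniquely identifies each controller, a lad
listing saved before the reboot can be matched against the current one
to see which device name each controller had originally. The following
is a minimal sketch under that assumption; the function name and the
two file arguments are hypothetical, and the sketch only helps if a
"before" listing was actually saved.

```shell
#!/bin/sh
# map_controllers BEFORE AFTER
# Hypothetical helper: BEFORE and AFTER are files holding saved lad
# output. Prints "new-device old-device unique-name" for every
# controller whose device name differs between the two listings.
map_controllers() {
    awk '
        # First file: remember the old device name per unique name.
        NR == FNR { old[$2] = $1; next }
        # Second file: report controllers whose device name changed.
        ($2 in old) && old[$2] != $1 { print $1, old[$2], $2 }
    ' "$1" "$2"
}

# Example: map_controllers lad.before lad.after
```

With the two lad listings shown above, this prints:

c69t5d0 c1t5d0 1T74750854
c71t4d0 c2t4d0 1T71525434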
Solaris determines the numbering of controllers based largely on the
order in which they were discovered during a reconfiguration boot.
Solaris 8, update 4 seems to handle the ordering slightly differently,
depending on whether Host Bus Adapters are connected to arrays or not.
When controllers are replaced or added dynamically, the new ones are
added after the existing ones and holes are left for the missing ones.
The procedure under "Corrective Action" shows how to return to the
previous controller numbers. The procedure is not a permanent fix in
that the change of controller numbers could happen again. The system
administrator should keep a list of device paths, such as the output of
ls -l /dev/*dsk.
A permanent fix will be released in the next version of RAID Manager 6.22.
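The record keeping suggested above can be done with a few lines of
shell. This is a minimal sketch, not an official tool: the function
name, the record directory in the example and the file naming are
hypothetical. It saves the output of ls -l /dev/*dsk and, where the lad
program is installed, the lad listing as well.

```shell
#!/bin/sh
# save_device_paths DEV_ROOT RECORD_DIR
# Hypothetical helper: writes a time-stamped copy of "ls -l DEV_ROOT/*dsk"
# into RECORD_DIR and prints the name of the file it created.
# DEV_ROOT is normally /dev.
save_device_paths() {
    dev_root=$1
    record_dir=$2
    lad=${LAD:-/usr/lib/osa/bin/lad}   # path taken from this notice
    stamp=$(date +%Y%m%d-%H%M%S)
    mkdir -p "$record_dir" || return 1
    out="$record_dir/devlinks.$stamp"
    # Record the device links, including their physical device paths.
    ls -l "$dev_root"/*dsk > "$out" || return 1
    # Also keep the lad listing when the program is available.
    if [ -x "$lad" ]; then
        "$lad" > "$record_dir/lad.$stamp"
    fi
    echo "$out"
}

# Example: save_device_paths /dev /var/tmp/devpaths
```

Running it before every planned reconfiguration reboot keeps a record
that shows which controller number each device path had.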
Implementation:
---
| | MANDATORY (Fully Pro-Active)
---
---
| | CONTROLLED PRO-ACTIVE (per Sun Geo Plan)
---
---
| X | REACTIVE (As Required)
---
Corrective Action:
The following recommendation is provided as a guideline for authorized
Enterprise Services Field Representatives who may encounter the above
mentioned problem.
Starting with this situation, prior to a reconfiguration boot,
format and lad show something like the following:
AVAILABLE DISK SELECTIONS:
0. c0t0d0
/sbus@3,0/SUNW,fas@3,8800000/sd@0,0
1. c0t1d0
/sbus@3,0/SUNW,fas@3,8800000/sd@1,0
2. c1t5d0
/pseudo/rdnexus@1/rdriver@5,0
3. c1t5d1
/pseudo/rdnexus@1/rdriver@5,1
4. c1t5d2
/pseudo/rdnexus@1/rdriver@5,2
5. c1t5d3
/pseudo/rdnexus@1/rdriver@5,3
6. c1t5d4
/pseudo/rdnexus@2/rdriver@5,4
7. c2t4d0
/pseudo/rdnexus@2/rdriver@4,0
8. c2t4d1
/pseudo/rdnexus@2/rdriver@4,1
9. c2t4d2
/pseudo/rdnexus@2/rdriver@4,2
10. c2t4d3
/pseudo/rdnexus@2/rdriver@4,3
/usr/lib/osa/bin/lad shows:
c1t5d0 1T74750854 LUNS: 0 1 2 3 4
c2t4d0 1T71525434 LUNS: 0 1 2 3
Notice that the original controller numbers for the rdac modules are 1 and 2.
After issuing boot -r, format and lad appear as follows:
AVAILABLE DISK SELECTIONS:
0. c0t0d0
/sbus@3,0/SUNW,fas@3,8800000/sd@0,0
1. c0t1d0
/sbus@3,0/SUNW,fas@3,8800000/sd@1,0
2. c69t5d0
/pseudo/rdnexus@1/rdriver@5,0
3. c69t5d1
/pseudo/rdnexus@1/rdriver@5,1
4. c69t5d2
/pseudo/rdnexus@1/rdriver@5,2
5. c69t5d3
/pseudo/rdnexus@1/rdriver@5,3
6. c69t5d4
/pseudo/rdnexus@2/rdriver@5,4
7. c71t4d0
/pseudo/rdnexus@2/rdriver@4,0
8. c71t4d1
/pseudo/rdnexus@2/rdriver@4,1
9. c71t4d2
/pseudo/rdnexus@2/rdriver@4,2
10. c71t4d3
/pseudo/rdnexus@2/rdriver@4,3
/usr/lib/osa/bin/lad shows:
c69t5d0 1T74750854 LUNS: 0 1 2 3 4
c71t4d0 1T71525434 LUNS: 0 1 2 3
Note that format and lad are in sync, but the c#'s have changed to 69
and 71.
To fix the above problem, remove the rdac logical devices (c#t#d#) as
seen by Solaris and RAID Manager so that the logical device controller
#s can be recreated.
To sync up the c#'s in lad and format with RM6.22x and return the c#'s
to acceptable values:
cd /dev/dsk
rm c#'s for A1000/A3x00 devices
(In this case "# rm c69*" and "# rm c71*")
cd /dev/rdsk
rm c#'s for A1000/A3x00 devices
(In this case "# rm c69*" and "# rm c71*")
cd /dev/osa/dev/dsk
rm c#'s for A1000/A3x00 devices
(In this case "# rm c69*" and "# rm c71*")
cd /dev/osa/dev/rdsk
rm c#'s for A1000/A3x00 devices
(In this case "# rm c69*" and "# rm c71*")
Run the following rdac_disks command to remove all rdac devices from format.
/usr/lib/osa/bin/rdac_disks
Run the following hot_add command to recreate the proper rdac device
controller #s in format, lad, /dev/(r)dsk and /dev/osa/dev/(r)dsk
immediately, with no need for a reboot or boot -r.
/usr/lib/osa/bin/hot_add
After the hot_add, everything should be as it was before, but the user
of this procedure should verify the configuration.
Note: It is also possible that after a "boot -r", the rdac devices MIGHT
NOT show up in format at all. In that case, follow the same procedure as
above to recreate the rdac devices and sync up Solaris with RAID
Manager.
While tempting, do not try to run devfsadm to create links in place of
hot_add, because it will create a Solaris physical device path such as
/sbus@3,0/QLGC,isp@3... as opposed to the correct
/pseudo/rdnexus@2,0.... path that is required for the device to be
properly addressed.
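The link removal in the four directories above can be sketched as one
loop. This is an illustration only, not a supported tool: the function
name, the ROOT argument and the DRY_RUN variable are hypothetical. The
sketch matches c<number>t* rather than c<number>*, so that removing c69
links does not also match a controller such as c690, and it only lists
the links unless DRY_RUN is set to 0.

```shell
#!/bin/sh
# remove_rdac_links ROOT CNUM...
# Hypothetical helper: removes the c<CNUM>t* links from the four
# directories named in the procedure. ROOT is "/" on a live system.
# By default (DRY_RUN unset or 1) it only prints what it would remove.
remove_rdac_links() {
    root=${1%/}          # "/" becomes "", so paths start with /dev
    shift
    for dir in dev/dsk dev/rdsk dev/osa/dev/dsk dev/osa/dev/rdsk; do
        for cnum in "$@"; do
            for link in "$root/$dir/c${cnum}t"*; do
                # Skip the unexpanded pattern when nothing matches.
                [ -e "$link" ] || [ -L "$link" ] || continue
                if [ "${DRY_RUN:-1}" = 1 ]; then
                    echo "would remove $link"
                else
                    rm -f "$link"
                fi
            done
        done
    done
}

# Example (review the list first, then remove):
#   remove_rdac_links / 69 71
#   DRY_RUN=0; remove_rdac_links / 69 71
# Afterwards run /usr/lib/osa/bin/rdac_disks and
# /usr/lib/osa/bin/hot_add as described in the procedure.
```

Defaulting to a dry run is a deliberate choice here, since removing the
wrong links on a host with many controllers would make matters worse.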
Comments:
devfsadm -C will remove links for devices that are no longer present but
this can compound the problem unless you are prepared for the controller
numbers to change.
----------------------------------------------------------------------------
Implementation Footnote:
i) In case of MANDATORY FINs, Enterprise Services will attempt to
contact all affected customers to recommend implementation of
the FIN.
ii) For CONTROLLED PROACTIVE FINs, Enterprise Services mission critical
support teams will recommend implementation of the FIN (to their
respective accounts), at the convenience of the customer.
iii) For REACTIVE FINs, Enterprise Services will implement the FIN as the
need arises.
----------------------------------------------------------------------------
All released FINs and FCOs can be accessed using your favorite network
browser as follows:
SunWeb Access:
--------------
* Access the top level URL of http://sdpsweb.ebay/FIN_FCO/
* From there, select the appropriate link to query or browse the FIN and
FCO Homepage collections.
SunSolve Online Access:
-----------------------
* Access the SunSolve Online URL at http://sunsolve.Corp/
* From there, select the appropriate link to browse the FIN or FCO index.
Supporting Documents:
---------------------
* Supporting documents for FIN/FCOs can be found on Edist. Edist can be
accessed internally at the following URL: http://edist.corp/.
* From there, follow the hyperlink path of "Enterprise Services
Documentation" and click on "FIN & FCO attachments", then choose the
appropriate folder, FIN or FCO. This will display supporting
directories/files for FINs or FCOs.
Internet Access:
----------------
* Access the top level URL of https://infoserver.Sun.COM
--------------------------------------------------------------------------
General:
--------
* Send questions or comments to [email protected]
---------------------------------------------------------------------------