Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition
Solution Type: Technical Instruction
Solution 1010753.1: Sun Fire[TM] Servers (V280R, V480, V490, V880, V890): How to Replace Disks for Systems With Internal FC-AL Drives Under Solaris[TM] Volume Manager Control.
Previously Published As: 214845
Applies to:
Sun Fire V890 Server
Sun Fire 280R Server
Sun Fire V480 Server
Sun Fire V490 Server
Solaris SPARC Operating System - Version 8.0 and later [Release: 8.0 and later]
All Platforms

To discuss this information further with Oracle experts and industry peers, we encourage you to review, join or start a discussion in the My Oracle Support Community, Oracle Solaris Entrylevel Servers.

Goal

Beginning with the Solaris[TM] 9 Operating System (OS), Solaris[TM] Volume Manager (SVM) software uses a feature called Device-ID (or DevID), which identifies each disk not only by its c#t#d# name, but also by a unique ID generated from the disk's World Wide Name (WWN) or serial number. The SVM software relies on the Solaris OS to supply it with each disk's correct DevID. When a disk fails and is replaced, a specific procedure is required to make sure that the Solaris OS is updated with the new DevID.

Solution

Steps to Follow

To replace a disk, use the luxadm(1M) command to remove it and insert the new disk. This procedure updates the Solaris OS device framework so that the new disk's DevID is inserted and the old disk's DevID is removed.
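Before starting either procedure below, it can be helpful to record exactly what SVM currently has on the failing disk. The following minimal sketch uses the same commands that appear in the procedures; the disk name c1t1d0 and the output file path are purely illustrative, so substitute your own c#t#d# name and a writable location:

   # metastat -p | grep c1t1d0                (lists metadevices built on the disk, such as submirrors or RAID-5 columns)
   # metadb -i | grep c1t1d0                  (lists state database replicas on the disk; note how many there are)
   # prtvtoc /dev/rdsk/c1t1d0s2 > /var/tmp/c1t1d0.vtoc     (saves the partition table for later restoration)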
PROCEDURE FOR REPLACING MIRRORED DISKS

The following set of commands should work in all cases. Follow the exact sequence to ensure a smooth operation. To replace a disk that is controlled by SVM and is part of a mirror, perform the following steps. A consolidated example using hypothetical device names appears after step 12.

1. Run "metadetach" to detach all the submirrors on the failing disk from their respective mirrors:

   # metadetach -f <mirror> <submirror>

   NOTE: The "-f" option is not required if the metadevice is in an "okay" state.

2. Run "metaclear" to remove the detached submirrors:

   # metaclear <submirror>

   You can verify that there are no metadevices left on the disk by running:

   # metastat -p | grep c#t#d#

3. If there are any replicas on this disk, note the number of replicas and remove them:

   # metadb -i          (note the number of replicas on this disk)
   # metadb -d c#t#d#s#

   Verify that there are no replicas left on the disk by running:

   # metadb | grep c#t#d#

4. If there are any open filesystems on this disk that are not under SVM control, or non-mirrored metadevices, unmount them.

5. Run "format" or "prtvtoc/fmthard" to save the disk partition table information:

   # prtvtoc /dev/rdsk/c#t#d#s2 > file

6. Run the 'luxadm' command to remove the failed disk:

   # luxadm remove_device -F /dev/rdsk/c#t#d#s2

   At the prompt, physically remove the disk and continue. The picld daemon notifies the system that the disk has been removed.

7. Initiate the devfsadm cleanup subroutines by entering the following command:

   # /usr/sbin/devfsadm -C -c disk

   The default devfsadm operation is to attempt to load every driver in the system and attach these drivers to all possible device instances. The devfsadm command then creates device special files in the /devices directory and logical links in /dev. With the "-c disk" option, devfsadm only updates disk device files. This saves time and is important on systems that have tape devices attached; rebuilding those tape devices could cause undesirable results on non-Sun hardware. The -C option cleans up the /dev directory and removes any lingering logical links to the device link names. This should remove all the device paths for this particular disk, which can be verified with:

   # ls -ld /dev/dsk/cxtxd*

   This should return no devices.

8. It is now safe to physically replace the disk. Insert a new disk and configure it. Create the necessary entries in the Solaris OS device tree with one of the following commands:

   # devfsadm
   or
   # /usr/sbin/luxadm insert_device <enclosure_name>,sx     (where sx is the slot number)
   or
   # /usr/sbin/luxadm insert_device                         (if the enclosure name is not known)

   Note: In many cases, luxadm insert_device does not require the enclosure name and slot number.

   Use the following to find the slot number:

   # luxadm display <enclosure_name>

   To find the enclosure name:

   # luxadm probe

   Run "ls -ld /dev/dsk/c1t1d*" to verify that the new device paths have been created.

   CAUTION: After inserting a new disk and running devfsadm (or luxadm), the old ssd instance number changes to a new ssd instance number. This change is expected, so ignore it. For example, when the error occurs on the following disk, whose ssd instance is ssd3:

   WARNING: /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000004cfa19920,0 (ssd3):
   Error for Command: read(10)    Error Level: Retryable
   Requested Block: 15392944      Error Block: 15392958

   After inserting a new disk, the ssd instance changes to ssd10 as shown below. This is not a cause for concern, as it is expected.

   picld[287]: [ID 727222 daemon.error] Device DISK0 inserted
   qlc: [ID 686697 kern.info] NOTICE: Qlogic qlc(2): Loop ONLINE
   scsi: [ID 799468 kern.info] ssd10 at fp2: name w21000011c63f0c94,0, bus address ef
   genunix: [ID 936769 kern.info] ssd10 is /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000011c63f0c94,0
   scsi: [ID 365881 kern.info]

9. Run "format" or "prtvtoc/fmthard" to put the desired partition table on the new disk:

   # fmthard -s file /dev/rdsk/c#t#d#s2     ('file' is the prtvtoc output saved in step 5)

10. Use "metainit" and "metattach" to re-create the submirrors and attach them to their mirrors to start the resync:

   # metainit <submirror> 1 1 c#t#d#s#
   # metattach <mirror> <submirror>

   (The metainit arguments shown re-create a simple one-slice submirror; use the layout that matched the original submirror.)

11. If necessary, re-create the same number of replicas that existed previously, using the -c option of the metadb(1M) command:

   # metadb -a -c# c#t#d#s#

12. Be sure to correct the EEPROM entry for the boot-device (only if one of the root disks has been replaced).
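The following consolidated example shows the sequence above end to end. It assumes a purely hypothetical configuration: mirror d10 built from submirror d11 (a single slice, c1t1d0s0, on the failing disk) and a second submirror on a healthy disk, with two state database replicas on c1t1d0s7. None of these names comes from this document; substitute the metadevice and c#t#d# names from your own configuration.

   # metadetach -f d10 d11
   # metaclear d11
   # metastat -p | grep c1t1d0                (should return nothing)
   # metadb -d c1t1d0s7
   # metadb | grep c1t1d0                     (should return nothing)
   # prtvtoc /dev/rdsk/c1t1d0s2 > /var/tmp/c1t1d0.vtoc
   # luxadm remove_device -F /dev/rdsk/c1t1d0s2
     (physically remove the disk when prompted)
   # /usr/sbin/devfsadm -C -c disk
   # ls -ld /dev/dsk/c1t1d*                   (should return no devices)
     (physically insert the replacement disk)
   # devfsadm
   # ls -ld /dev/dsk/c1t1d*                   (new device paths should now exist)
   # fmthard -s /var/tmp/c1t1d0.vtoc /dev/rdsk/c1t1d0s2
   # metainit d11 1 1 c1t1d0s0
   # metattach d10 d11                        (starts the resync)
   # metadb -a -c 2 c1t1d0s7

If the replaced disk was a boot disk, also check the EEPROM boot-device entry as described in step 12.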
PROCEDURE FOR REPLACING A DISK IN A RAID-5 VOLUME

Note: If a disk is used in BOTH a mirror and a RAID-5 volume, do not use the following procedure; instead, follow the instructions for the MIRRORED devices (above). This is because the RAID-5 array, once healed, is treated as a single disk for mirroring purposes.

To replace an SVM-controlled disk that is part of a RAID-5 metadevice, perform the following steps. A consolidated example using hypothetical device names appears after step 10.

1. If there are any open filesystems on this disk that are not under SVM control, or non-mirrored metadevices, unmount them.

2. If there are any replicas on this disk, remove them:

   # metadb -d c#t#d#s#

   Verify that there are no replicas left on the disk by running:

   # metadb | grep c#t#d#

3. Run "format" or "prtvtoc/fmthard" to save the disk partition table information:

   # prtvtoc /dev/rdsk/c#t#d#s2 > file

4. Run the 'luxadm' command to remove the failed disk:

   # luxadm remove_device -F /dev/rdsk/c#t#d#s2

   At the prompt, physically remove the disk and continue. The picld daemon notifies the system that the disk has been removed.

5. Initiate the devfsadm cleanup subroutines by entering the following command:

   # /usr/sbin/devfsadm -C -c disk

   The default devfsadm operation is to attempt to load every driver in the system and attach these drivers to all possible device instances. The devfsadm command then creates device special files in the /devices directory and logical links in /dev. With the "-c disk" option, devfsadm only updates disk device files. This saves time and is important on systems that have tape devices attached; rebuilding those tape devices could cause undesirable results on non-Sun hardware. The -C option cleans up the /dev directory and removes any lingering logical links to the device link names. This should remove all the device paths for this particular disk, which can be verified with:

   # ls -ld /dev/dsk/cxtxd*

   This should return no devices.

6. It is now safe to physically replace the disk. Insert a new disk and configure it. Create the necessary entries in the Solaris OS device tree with one of the following commands:

   # devfsadm
   or
   # /usr/sbin/luxadm insert_device <enclosure_name>,sx     (where sx is the slot number)
   or
   # /usr/sbin/luxadm insert_device                         (if the enclosure name is not known)

   Note: In many cases, luxadm insert_device does not require the enclosure name and slot number.

   Use the following to find the slot number:

   # luxadm display <enclosure_name>

   To find the enclosure name:

   # luxadm probe

   Run "ls -ld /dev/dsk/c1t1d*" to verify that the new device paths have been created.

   CAUTION: After inserting a new disk and running devfsadm (or luxadm), the old ssd instance number changes to a new ssd instance number (for example, from ssd3 to ssd10, as shown in the console output in the mirrored-disk procedure above). This change is expected, so ignore it.

7. Run 'format' or 'prtvtoc/fmthard' to put the desired partition table on the new disk:

   # fmthard -s file /dev/rdsk/c#t#d#s2     ('file' is the prtvtoc output saved in step 3)

8. Run 'metadevadm' on the disk to update the new DevID:

   # metadevadm -u c#t#d#

   Note: Due to BugID 4808079, a disk can show up as "unavailable" in the metastat output.

9. If necessary, re-create any replicas on the new disk:

   # metadb -a c#t#d#s#

10. Run 'metareplace' to enable and resync the new disk:

   # metareplace -e <RAID-5 metadevice> c#t#d#s#
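The following consolidated example shows the RAID-5 sequence end to end. It assumes a purely hypothetical configuration: RAID-5 metadevice d30 with one of its columns on the failing disk as c1t2d0s0, and one state database replica on c1t2d0s7. None of these names comes from this document; substitute the metadevice and c#t#d# names from your own configuration.

   # metadb -d c1t2d0s7
   # metadb | grep c1t2d0                     (should return nothing)
   # prtvtoc /dev/rdsk/c1t2d0s2 > /var/tmp/c1t2d0.vtoc
   # luxadm remove_device -F /dev/rdsk/c1t2d0s2
     (physically remove the disk when prompted)
   # /usr/sbin/devfsadm -C -c disk
   # ls -ld /dev/dsk/c1t2d*                   (should return no devices)
     (physically insert the replacement disk)
   # devfsadm
   # ls -ld /dev/dsk/c1t2d*                   (new device paths should now exist)
   # fmthard -s /var/tmp/c1t2d0.vtoc /dev/rdsk/c1t2d0s2
   # metadevadm -u c1t2d0
   # metadb -a c1t2d0s7
   # metareplace -e d30 c1t2d0s0              (starts the resync of the new column)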
Reference:
The following Infodocs were referenced:
<Document 1006196.1> Synopsis: Solaris[TM] Volume Manager software: Replacing SCSI Disks (Solaris[TM] 9 Operating System and above)

Date: 2010-12-21