Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition
Solution Type: Technical Instruction Sure Solution

1004100.1: Sun[TM] Cluster 3.x: Rolling firmware update on SCSI JBOD disk with Solaris[TM] Volume Manager and root disk
Previously Published As: 205707

Description

This Technical Instruction explains how to perform a rolling disk firmware update with minimum downtime in Sun[TM] Cluster using Solaris[TM] Volume Manager (SVM, also known as SDS, Solstice DiskSuite[TM]). The disk can be a local disk (root/mirror) on a node or an external shared SCSI JBOD disk.

Steps to Follow

There are two different approaches to the firmware update: one for shared SCSI JBOD disks and one for cluster node disks (root/mirror).

The following procedure assumes that:

1) All metadevices (local and in disksets) are in the "Okay" state, all submirrors are attached, and no metadevice is still resyncing.

2) Your /etc/lvm/md.tab is current. Compare the output of "metastat -p" with the entries in /etc/lvm/md.tab. If entries are missing or out of date, run:

    # metastat -p >> /etc/lvm/md.tab

and for the disksets:

    # metastat -s <setname> -p >> /etc/lvm/md.tab

Note: If you run Sun Explorer, all of this information is saved in the output file.

3) Both cluster nodes are cluster members, and this will not change during the procedure.

4) The cluster DID namespace is current, with no mismatches.

5) All cluster DID IDs match the physical disk IDs. Check /var/adm/messages for warnings similar to:

    device id for '/dev/rdsk/c2t8d0' does not match physical disk's id.
    The drive may have been replaced

If this is the case, first identify the cluster DID device that does not match:

    [root]# scdidadm -L
    1        msun0001:/dev/rdsk/c1t0d0      /dev/did/rdsk/d1
    2        msun0001:/dev/rdsk/c1t1d0      /dev/did/rdsk/d2
    3        msun0002:/dev/rdsk/c3t8d0      /dev/did/rdsk/d3
    3        msun0001:/dev/rdsk/c3t8d0      /dev/did/rdsk/d3
    4        msun0001:/dev/rdsk/c3t9d0      /dev/did/rdsk/d4
    4        msun0002:/dev/rdsk/c3t9d0      /dev/did/rdsk/d4
    5        msun0001:/dev/rdsk/c2t8d0      /dev/did/rdsk/d5    <<<<<< wrong ID
    5        msun0002:/dev/rdsk/c2t8d0      /dev/did/rdsk/d5    <<<<<<
    6        msun0001:/dev/rdsk/c2t9d0      /dev/did/rdsk/d6
    6        msun0002:/dev/rdsk/c2t9d0      /dev/did/rdsk/d6
    7        msun0002:/dev/rdsk/c1t0d0      /dev/did/rdsk/d7
    8        msun0002:/dev/rdsk/c1t1d0      /dev/did/rdsk/d8

Now run the following commands on the identified DID device to update the cluster configuration (these commands are safe to run on a production cluster).

Check the current ID:

    [root]# scdidadm -o asciidiskid -l d5
    IBM 8RM838

Update the DID:

    [root]# scdidadm -R d5

Cross-check that the ID is correctly updated:

    [root]# scdidadm -o asciidiskid -l d5
    SEAGATE 3JA97LEV00007503

The ID should match the label on the front of the physical disk. You can use "iostat -En" to check the real serial numbers (and firmware revisions) and "scdidadm -o asciidiskid -l dYX" on each DID for cross-checking.

    ......
    c2t8d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
    Vendor: SEAGATE Product: ST336607L SUN36G Revision: 0507 Serial No: 00007503
    Size: 18.11GB <18110967808 bytes>
    Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
    ......

If no metadevice is offline or in the maintenance state and all DID IDs match the physical IDs, then continue with A) and/or B).

A) How to do a firmware update on a SCSI JBOD cluster shared disk

Because the disk spins down and performs a hard reset during a firmware update, you cannot do this on a disk that is in use. You would lose a mirror half! The second problem is that the "download" routine also checks whether the drive is in use by SVM.
To overcome both problems:

First, offline the submirror that contains the disk for the duration of the update, so that you do not lose the mirror half and the resync afterwards is quick.

Then run the "download" routine from the node that is currently NOT the owner of the diskset containing the disk to update. In other words, if node 1 has the diskset imported, run the firmware "download" from node 2.

    root@msun0002 # scstat -D
    ....
                     Device Group        Primary        Secondary
                     ------------        -------        ---------
    Device group servers:  nfs-set       msun0001       msun0002
    ....

    root@msun0002 # metastat -s nfs-set
    Proxy command to: msun0001
    nfs-set/d300: Mirror
        Submirror 0: nfs-set/d301
          State: Okay
        Submirror 1: nfs-set/d302
          State: Okay
        Pass: 1
        Read option: roundrobin (default)
        Write option: parallel (default)
        Size: 142239915 blocks

    nfs-set/d301: Submirror of nfs-set/d300
        State: Okay
        Size: 142239915 blocks
        Stripe 0: (interlace: 128 blocks)
            Device    Start Block   Dbase   State   Hot Spare
            d3s0      0             No      Okay
            d4s0      0             No      Okay

    nfs-set/d302: Submirror of nfs-set/d300
        State: Okay
        Size: 142239915 blocks
        Stripe 0: (interlace: 128 blocks)
            Device    Start Block   Dbase   State   Hot Spare
            d5s0      0             No      Okay
            d6s0      0             No      Okay

Offline the submirror that contains the disk to be updated. In this example the disk is c2t8d0, which is DID device d5 in the scdidadm listing and therefore part of submirror d302:

    root@msun0002 # metaoffline -s nfs-set d300 d302

Change directory to the firmware patch directory and start the download utility:

    root@msun0002 # cd /var/tmp/116369-11
    root@msun0002 # ./download
    Firmware Download Utility, V4.2
    ************************** WARNING **************************
    NO OTHER ACTIVITY IS ALLOWED DURING FIRMWARE UPGRADE!!!
    No other programs including any volume manager (e.g. Veritas,
    SDS, or Vold) should be running. Other host systems sharing any
    I/O bus with this host must either be offline or disconnected.
    Any interruption (e.g. power loss) during upgrade can result in
    damage to devices being upgraded. Any disk to be upgraded should
    first have its data backed up.
    ***************************************************************
    Searching for devices...
    rmt/0: Mode Sense for default pages failed!
    DISK DEVICES
    Device    Rev    Product
    c1t0d0:   0507   ST336607L -- SUN36G
    c1t1d0:   1804   MAN3367M -- SUN36G
    c2t8d0:   0507   ST336607L -- SUN36G    <<<<<<
    c2t9d0:   0507   ST336607L -- SUN36G
    c3t8d0:   S96H   DDYST3695 -- SUN36G
    c3t9d0:   0507   ST336607L -- SUN36G
    Total Devices: 6
    Enter command: p c2t8d0    <<<<<< NOTE: select ONLY the one disk to update!!!
    NOTICE: Cannot access kernel, kvm_open did not succeed!
    Upgrading devices...
    c2t8d0: Successful download
    Enter command: inq    <<<<<< check that the new firmware is in place
    DISK DEVICES
    Device    Rev    Product    S/N
    ........
    c2t8d0:   0707   ST336607L -- SUN36G    <<<<<<
    ........
    Enter command: q

Now online the submirror again and observe the resync:

    root@msun0002 # metaonline -s nfs-set d300 d302
    Proxy command to: msun0001
    root@msun0002 # metastat -s nfs-set | grep %
    Proxy command to: msun0001
    32 % done
    root@msun0002 #

Repeat with the other disks if necessary, waiting for each resync to finish before offlining the next submirror.

B) How to do a firmware update on a cluster node disk

If you have to update a local disk, first switch all resource groups to the other node, so that nothing is running on the node you are about to take out of the cluster:

    root@msun0002 # scswitch -z -g <resourcegroup> -h msun0001

Then reboot this node into non-cluster mode:

    root@msun0002 # init 0
    ok boot -xs

Once booted, you have to delete the metadb replicas on the disk to be updated, and then detach and clear the metadevices on it:

    root@msun0002 # metadb
            flags           first blk       block count
         a m  p  luo        16              4096            /dev/dsk/c1t0d0s7
         a    p  luo        4112            4096            /dev/dsk/c1t0d0s7
         a    p  luo        8208            4096            /dev/dsk/c1t0d0s7
         a    p  luo        16              4096            /dev/dsk/c1t1d0s7
         a    p  luo        4112            4096            /dev/dsk/c1t1d0s7
         a    p  luo        8208            4096            /dev/dsk/c1t1d0s7
    root@msun0002 # metadb -d /dev/dsk/c1t0d0s7
    root@msun0002 # metadb
            flags           first blk       block count
         a    p  luo        16              4096            /dev/dsk/c1t1d0s7
         a    p  luo        4112            4096            /dev/dsk/c1t1d0s7
         a    p  luo        8208            4096            /dev/dsk/c1t1d0s7

Save your root disk configuration before you start, and save the original md.tab file:
    root@msun0002 # cp /etc/lvm/md.tab /etc/lvm/md.tab.orig
    root@msun0002 # metastat -p > /etc/lvm/md.tab
    root@msun0002 # metastat -p
    d200 -m d201 d202 1
    d201 1 1 c1t0d0s0
    d202 1 1 c1t1d0s0
    d210 -m d211 d212 1
    d211 1 1 c1t0d0s1
    d212 1 1 c1t1d0s1
    d230 -m d231 d232 1
    d231 1 1 c1t0d0s3
    d232 1 1 c1t1d0s3
    d240 -m d241 d242 1
    d241 1 1 c1t0d0s4
    d242 1 1 c1t1d0s4
    d250 -m d251 d252 1
    d251 1 1 c1t0d0s5
    d252 1 1 c1t1d0s5
    d260 -m d261 d262 1
    d261 1 1 c1t0d0s6
    d262 1 1 c1t1d0s6

Detach the submirrors that are on the disk to be updated:

    root@msun0002 # metadetach d200 d201
    ...... repeat this for all other submirrors on this disk ......

The metastat output should now look something like this:

    root@msun0002 # metastat -p
    d200 -m d202 1
    d202 1 1 c1t1d0s0
    d210 -m d212 1
    d212 1 1 c1t1d0s1
    d230 -m d232 1
    d232 1 1 c1t1d0s3
    d240 -m d242 1
    d242 1 1 c1t1d0s4
    d250 -m d252 1
    d252 1 1 c1t1d0s5
    d260 -m d262 1
    d261 1 1 c1t0d0s6
    d262 1 1 c1t1d0s6
    d201 1 1 c1t0d0s0
    d211 1 1 c1t0d0s1
    d231 1 1 c1t0d0s3
    d241 1 1 c1t0d0s4
    d251 1 1 c1t0d0s5

Now save this configuration again. The new md.tab, which lists the detached submirrors as standalone metadevices, is what metainit uses further down to recreate them.

    root@msun0002 # cp /etc/lvm/md.tab /etc/lvm/md.tab.bothmirrors
    root@msun0002 # metastat -p > /etc/lvm/md.tab

Now clear all the detached submirrors:

    root@msun0002 # metaclear d201
    .... repeat for all submirrors on this disk ....

Now you can update the firmware, but for this disk only. Follow the procedure in the patch README, or proceed as in A).

When the update has finished, re-create the state database replicas that were deleted from this disk earlier. In this example there were three replicas on c1t0d0s7:

    root@msun0002 # metadb -a -c 3 /dev/dsk/c1t0d0s7

Then run:

    root@msun0002 # metainit -a

metainit uses the entries in /etc/lvm/md.tab by default and recreates all the missing submirrors. There will be a lot of messages reporting that some metadevices already exist. That is expected and can be ignored.

Attach the submirrors again:

    root@msun0002 # metattach d200 d201
    .... repeat for all submirrors ....

Wait for the resync to finish, then repeat with the other local disk if necessary. Reboot the node into the cluster again and repeat on the other node if necessary.
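The "repeat for all submirrors" steps in B) are easy to get wrong by hand. As a minimal sketch, the list of metadetach commands could be derived from the "metastat -p" output. The function name detach_cmds and the AWK variable are hypothetical and are not part of SVM or the firmware patch. The sketch assumes simple one-slice submirrors of the form "dNNN 1 1 cXtYdZsN", as in the example above, and it only prints commands for review; it does not run them.

```shell
#!/bin/sh
# Hypothetical helper (not part of SVM or the firmware patch): read
# "metastat -p" text on stdin and print, without running them, the
# metadetach commands for every submirror built on the given disk.
# Assumption: on Solaris set AWK=nawk, because old /usr/bin/awk lacks -v.
AWK=${AWK:-awk}

detach_cmds() {
    "$AWK" -v disk="$1" '
        # Mirror line such as "d200 -m d201 d202 1":
        # remember which mirror each submirror belongs to.
        $2 == "-m" {
            for (i = 3; i <= NF; i++)
                if ($i ~ /^d[0-9]+$/) parent[$i] = $1
            next
        }
        # One-slice concat such as "d201 1 1 c1t0d0s0" on the wanted disk.
        $2 == 1 && $3 == 1 && index($4, disk "s") == 1 {
            found[++n] = $1
        }
        END {
            # Print only submirrors that are still attached to a mirror.
            for (i = 1; i <= n; i++)
                if (found[i] in parent)
                    print "metadetach", parent[found[i]], found[i]
        }
    '
}
```

A typical use would be "metastat -p | detach_cmds c1t0d0", after which the printed commands are checked against the metastat output before any of them is run. Submirrors that are already detached are skipped because no mirror line references them.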
Product
Solaris Volume Manager Software
Solstice DiskSuite 4.0
Solstice DiskSuite 3.0
Sun StorageTek 3510 FC Array JBOD
Sun StorageTek D2 Array
Sun StorageTek D1000 Array
Sun Cluster 3.1
Sun Cluster 3.0

Keywords: scsi, jbod, disk, replacement, suncluster, cluster, scdidadm, firmware, download, update

Previously Published As: 88595

Change History
Date: 2007-03-04
User Name: 97961
Action: Approved
Comment: - Converted to STM formatting for better readability - Made simple sentence/grammatical corrections
Version: 4

Date: 2007-03-04
User Name: 97961
Action: Accept
Comment:
Version: 0

Attachments
This solution has no attachment