Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution
1310404.1 : VTL - Failover and failback issue due to timeout
Problem with VTL failback due to timeout.
Created from <SR 3-2558289527>

Applies to:
Sun StorageTek VTL Plus Storage Appliance - Version: 2.0 - Build 1656
Information in this document applies to any platform.

Symptoms
VTL node1 failed over to node2, but node1 did not come Ready for failback.
Node2 panicked, causing a complete outage.
Possible mutual failover situation.

Changes
The issue occurred after applying the VTL Get Well Plan (GWP), but the GWP does not appear to be a cause of the events.

Cause
When both nodes boot simultaneously, a VTL node can start taking over its partner node before it has finished loading its own resources.

Solution
This issue is resolved by introducing a delay in the startup sequence, which allows the self-monitoring module to finish loading resources before the node takes over its partner. To add the delay, modify the "ipstorfm.sh" script in /usr/local/vt/bin by adding the two marked lines, as indicated in the code segment below:

    ...
    # check if ipstorfm is running already
    # if it is, return with an error
    APID=`$IS_BIN/pidof ipstorfm`
    NUM_P=`echo $APID | awk 'BEGIN{} {print NF}'`
    if [ $NUM_P -ne 0 ]
    then
        RET=1
    else
        logger -p daemon.notice Sleeping 500 seconds before starting FM.   # <<< added line >>>
        sleep 500                                                          # <<< added line >>>
        $IS_BIN/ipstorfm $2&
        sleep 1
        APID=`$IS_BIN/pidof ipstorfm`
        NUM_P=`echo $APID | awk 'BEGIN{} {print NF}'`
        if [ $NUM_P -eq 0 ]
        then
            RET=1
        else
            RET=0
        fi
    fi
    ...

NOTE: In a failover configuration, it is also recommended to reboot the nodes one at a time rather than simultaneously, which also avoids this situation.
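
The check-then-delay logic of the ipstorfm.sh segment can be tried in isolation with a sketch like the one below. It is illustrative only and is not part of the VTL software: the function names count_pids and delayed_start are hypothetical, and the delay is passed as a parameter so the logic can be exercised without waiting 500 seconds. It shows how the awk NF idiom turns the PID list into a process count (0 for an empty list), and how the start is refused when a process is already running.

```shell
#!/bin/sh
# Illustrative sketch only. count_pids and delayed_start are hypothetical
# names and do not exist in the VTL software.

# Count the whitespace-separated PIDs in a string, using the same awk idiom
# as ipstorfm.sh: NF is the number of fields, which is 0 for an empty line.
count_pids() {
    echo "$1" | awk '{print NF}'
}

# Start a command in the background after a delay, unless a PID list shows
# that it is already running.
#   $1    PID list of already running instances (may be empty)
#   $2    delay in seconds before starting
#   $3... command to start
delayed_start() {
    running=$1
    delay=$2
    shift 2
    if [ "$(count_pids "$running")" -ne 0 ]; then
        return 1        # already running: report an error, as the script does
    fi
    sleep "$delay"      # allow resource loading to finish before starting
    "$@" &
    return 0
}
```

For example, `delayed_start "" 500 some_daemon` would wait 500 seconds and then start the hypothetical `some_daemon`, while `delayed_start "1234" 500 some_daemon` returns 1 immediately without starting anything.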