Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-72-1310404.1
Update Date: 2011-10-04

Solution Type  Problem Resolution Sure

Solution  1310404.1 :   VTL - Failover and failback issue due to timeout during simultaneous reboots  


Related Items
  • Sun StorageTek VTL Plus Storage Appliance
Related Categories
  • PLA-Support>Sun Systems>TAPE>Virtual Tape>SN-TP: VTL
  • .Old GCS Categories>Sun Microsystems>Storage - Tape>Tape Virtualization


Problem with VTL failback due to timeout.

In this Document
  Symptoms
  Cause
  Solution


Created from <SR 3-2558289527>

Applies to:

Sun StorageTek VTL Plus Storage Appliance - Version: 2.0 - Build 1656 and later   [Release: 2.0 and later ]
Information in this document applies to any platform.

Symptoms

VTL node1 failed over to node2, but node1 did not come Ready for failback.

Node2 panicked, causing a complete outage.

Possible mutual failover situation.

Cause


The problem occurs when both nodes boot simultaneously: a VTL node starts taking over its partner node before it has finished loading its own resources.

Solution

Recommended resolution:
In a failover configuration, reboot the nodes one at a time, not simultaneously. This avoids the situation.
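
The one-at-a-time reboot order can be sketched as a small shell loop. This is only an illustration under stated assumptions: reboot_node and node_is_ready are hypothetical placeholders (they are not VTL commands) that must be replaced with site-specific steps, and POLL_SECS and MAX_POLLS are assumed values.

```shell
# Hypothetical hooks: replace the bodies with site-specific commands.
# Neither function is a VTL command; both are placeholders for illustration.
reboot_node() { echo "reboot $1"; }
node_is_ready() { return 0; }

# Reboot each node in turn and wait until it reports ready before
# touching the next one. POLL_SECS and MAX_POLLS are assumed defaults.
reboot_sequentially() {
    poll_secs=${POLL_SECS:-30}
    max_polls=${MAX_POLLS:-60}
    for node in "$@"; do
        reboot_node "$node" || return 1
        polls=0
        until node_is_ready "$node"; do
            polls=$((polls + 1))
            if [ "$polls" -ge "$max_polls" ]; then
                echo "timeout waiting for $node" >&2
                return 1
            fi
            sleep "$poll_secs"
        done
        echo "ready $node"
    done
}
```

Calling reboot_sequentially node1 node2 only moves on to node2 once node1 has passed the readiness check, so the two nodes never boot at the same time.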

@Possible workaround:
@NOTE: This is not recommended except at the direction of VTL support.

This issue is resolved by introducing a delay during the startup sequence, which allows the self-monitoring module to finish loading resources before taking over the partner.

To add the delay, modify the "ipstorfm.sh" script in /usr/local/vt/bin by adding the two lines marked "added line" in the code segment below:

...
# check if ipstorfm is running already
# if it is, return with an error
APID=`$IS_BIN/pidof ipstorfm`
NUM_P=`echo $APID | awk 'BEGIN{} {print NF}'`
if [ $NUM_P -ne 0 ]
then
      RET=1
else
      logger -p daemon.notice Sleeping 500 seconds before starting FM.     # <<< added line >>>
      sleep 500                                                            # <<< added line >>>
      $IS_BIN/ipstorfm $2&
      sleep 1
      APID=`$IS_BIN/pidof ipstorfm`
      NUM_P=`echo $APID | awk 'BEGIN{} {print NF}'`
      if [ $NUM_P -eq 0 ]
      then
            RET=1
      else
            RET=0
      fi
fi
...
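
The running-process check in the segment above counts the PIDs returned by pidof by letting awk print NF, the number of whitespace-separated fields. A minimal standalone sketch of that idiom follows; count_words is a hypothetical helper name used only for illustration and is not part of ipstorfm.sh.

```shell
# count_words is a hypothetical helper; it mirrors the script's
# `echo $APID | awk '{print NF}'` idiom.
count_words() {
    # $1 is intentionally unquoted, as in the original script, so the
    # shell splits the PID list into separate words before echo joins them.
    echo $1 | awk '{print NF}'
}

count_words ""            # no PIDs found:  prints 0
count_words "4711"        # one instance:   prints 1
count_words "4711 4712"   # two instances:  prints 2
```

A result of 0 means ipstorfm is not running, which is why the script starts it only in the else branch and then repeats the same check to set RET.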

NOTE: The "@" at the start of the lines above marks this section as internal use only; it is not part of the script.
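
After editing the script, a quick check that the file still parses and that both added lines are present might look like the sketch below. verify_delay_patch is a hypothetical helper, not a VTL tool, and the use of sh -n assumes that ipstorfm.sh is a Bourne-compatible shell script.

```shell
# verify_delay_patch is a hypothetical helper, not a VTL tool.
# Usage: verify_delay_patch /usr/local/vt/bin/ipstorfm.sh
verify_delay_patch() {
    script=$1
    # Parse the script without executing it; this catches syntax slips.
    # Assumption: the script is Bourne-compatible.
    sh -n "$script" || { echo "syntax error in $script" >&2; return 1; }
    # Confirm that the added logger line is present.
    grep -q 'Sleeping 500 seconds before starting FM' "$script" ||
        { echo "logger line missing" >&2; return 1; }
    # Confirm that the added sleep line is present.
    grep -q '^[[:space:]]*sleep 500' "$script" ||
        { echo "sleep line missing" >&2; return 1; }
    echo "delay patch present"
}
```

Keeping a copy of the unmodified script (for example with cp -p) before editing is a sensible precaution, so the change can be reverted if VTL support advises it.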

Attachments
This solution has no attachments.
  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.