Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-75-1007046.1
Update Date:2011-05-23
Keywords:

Solution Type  Troubleshooting Sure

Solution  1007046.1 :   Troubleshooting Sun StorageTek[TM], Sun StorEdge[TM], and Sun Storage[TM] Management Communication Faults with Arrays  


Related Items
  • Sun Storage Flexline 280 Array
  • Sun Storage 2540 Array
  • Sun Storage 2510 Array
  • Sun Storage 6140 Array
  • Sun Storage Common Array Manager (CAM)
  • Sun Storage Flexline 210 Array
  • Sun Storage 2530 Array
  • Sun Storage Flexline 380 Array
  • Sun Storage 6540 Array
  • SANtricity Storage Manager
  • Sun Storage 6130 Array
  • Sun Storage Flexline 240 Array
Related Categories
  • GCS>Sun Microsystems>Storage Software>Modular Disk Device Software

PreviouslyPublishedAs
209726


Applies to:

Sun Storage Common Array Manager (CAM) - Version: 5.0 and later   [Release: and later ]
Sun Storage 6130 Array - Version: Not Applicable and later    [Release: N/A and later]
Sun Storage 2540 Array - Version: Not Applicable and later    [Release: N/A and later]
Sun Storage 2530 Array - Version: Not Applicable and later    [Release: N/A and later]
Sun Storage 2510 Array - Version: Not Applicable and later    [Release: N/A and later]
All Platforms

Purpose

To discuss this information further with Oracle experts and industry peers, we encourage you to review, join or start a discussion in the My Oracle Support Community - Storage Disk 6000 and 2000 Series RAID Arrays

The purpose of this document is to assist in the identification and resolution
of communication issues between Sun StorageTek Common Array Manager (CAM), Sun StorageTek SANtricity Storage Manager, and any supported Sun StorEdge[TM], Sun StorageTek[TM], StorageTek[TM], or Sun Storage[TM] array.

Symptoms include:

  • ASR Summary with SCRK:oob Component Name:OutOfBand Id:oob 
  • ASR Summary with ASR:oob
  • ASR Summary with ASR:ib
  • CAM Alert or ASR event for 2510 - 73.12.31 2510.CommunicationLostEvent.oob
  • CAM Alert or ASR event for 2530 - 69.12.31 2530.CommunicationLostEvent.oob
  • CAM Alert or ASR event for 2540 - 70.12.31 2540.CommunicationLostEvent.oob
  • CAM Alert or ASR event for 6130 - 48.12.31 6130.CommunicationLostEvent.oob
  • CAM Alert or ASR event for 6140 - 57.12.31 6140.CommunicationLostEvent.oob
  • CAM Alert or ASR event for 6540 - 63.12.31 6540.CommunicationLostEvent.oob
  • CAM Alert or ASR event for flx380 - 59.12.31 flx380.CommunicationLostEvent.oob
  • CAM Alert or ASR event for flx280 - 72.12.31 flx280.CommunicationLostEvent.oob
  • CAM Alert or ASR event for flx240 - 74.12.31 flx240.CommunicationLostEvent.oob
  • CAM Alert or ASR event for 6580 - 79.12.31 6580.CommunicationLostEvent.oob
  • CAM Alert or ASR event for 6780 - 80.12.31 6780.CommunicationLostEvent.oob
  • CAM Alert or ASR event for 6180 - 90.12.31 6180.CommunicationLostEvent.oob
  • CAM Alert or ASR event for 2510 - 73.12.21 2510.CommunicationLostEvent.ib
  • CAM Alert or ASR event for 2530 - 69.12.21 2530.CommunicationLostEvent.ib
  • CAM Alert or ASR event for 2540 - 70.12.21 2540.CommunicationLostEvent.ib
  • CAM Alert or ASR event for 6130 - 48.12.21 6130.CommunicationLostEvent.ib
  • CAM Alert or ASR event for 6140 - 57.12.21 6140.CommunicationLostEvent.ib
  • CAM Alert or ASR event for 6540 - 63.12.21 6540.CommunicationLostEvent.ib
  • CAM Alert or ASR event for flx380 - 59.12.21 flx380.CommunicationLostEvent.ib
  • CAM Alert or ASR event for flx280 - 72.12.21 flx280.CommunicationLostEvent.ib
  • CAM Alert or ASR event for flx240 - 74.12.21 flx240.CommunicationLostEvent.ib
  • CAM Alert or ASR event for 6580 - 79.12.21 6580.CommunicationLostEvent.ib
  • CAM Alert or ASR event for 6780 - 80.12.21 6780.CommunicationLostEvent.ib
  • CAM Alert or ASR event for 6180 - 90.12.21 6180.CommunicationLostEvent.ib
  • Failure to register an array in SANtricity or CAM
  • Array is listed as Unresponsive in the Enterprise Management window in SANtricity
  • Array is listed as Unresponsive in the Array Summary Page in CAM

Last Review Date

March 22, 2011

Instructions for the Reader

A Troubleshooting Guide is provided to assist in debugging a specific issue. When possible, diagnostic tools are included in the document to assist in troubleshooting.

Troubleshooting Details

Please validate that each troubleshooting step below is true for your environment. Each step will provide instructions via a link to a document, for validating the step and taking corrective action as necessary. The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution. Please do not skip a step.

A. Verify whether the issue is based on an Alert/Alarm, or a problem with Registration/Adding.

  • If the issue is based on an Alert/Alarm received from your management host, or observed in the Array Summary (CAM) or Enterprise Management (SANtricity) window, go to Step B.
  • If the issue is based on a problem with registering/adding the array to the management software, go to Step C.

B.  Verify whether the array is being managed in band or out of band.

For CAM:

In the Array Summary window, the management type is listed in parentheses as In-Band or Out-of-Band.  Alternatively, the alarm code, which has the form XX.12.YY, indicates whether the array is managed in band or out of band.  XX is the array type code, as shown in the event codes listed under Symptoms above.  YY is either 21 (In-Band) or 31 (Out-of-Band).
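
As a rough illustration of the XX.12.YY convention, the following Python sketch splits an alarm code into its array type and management path. The lookup table is copied from the event codes listed under Symptoms; the function name classify_alarm and the returned tuple layout are assumptions made for this example and are not part of CAM.

```python
# Array type prefixes (XX), taken from the event codes listed under Symptoms.
ARRAY_TYPES = {
    "48": "6130", "57": "6140", "59": "flx380", "63": "6540",
    "69": "2530", "70": "2540", "72": "flx280", "73": "2510",
    "74": "flx240", "79": "6580", "80": "6780", "90": "6180",
}

# Management path suffixes (YY) as described in Step B.
MANAGEMENT_PATHS = {"21": "In-Band", "31": "Out-of-Band"}


def classify_alarm(code):
    """Return (array_model, management_path) for a code such as '57.12.31'.

    Hypothetical helper: raises ValueError if the code is not a
    communication-loss code of the form XX.12.YY known to the tables above.
    """
    parts = code.strip().split(".")
    if len(parts) != 3 or parts[1] != "12":
        raise ValueError("not an XX.12.YY communication alarm: %r" % code)
    xx, _, yy = parts
    if xx not in ARRAY_TYPES or yy not in MANAGEMENT_PATHS:
        raise ValueError("unknown array type or path in %r" % code)
    return ARRAY_TYPES[xx], MANAGEMENT_PATHS[yy]
```

For example, classify_alarm("57.12.31") returns ("6140", "Out-of-Band"), which corresponds to the Out-of-Band branch of this step.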

For SANtricity via GUI:

This is displayed in the Enterprise Management window under management connections, as either Out of Band or In Band.

NOTE 1:  There is no easy way to identify whether the array is managed in-band or out-of-band via the CLI for either application.

  • If the array is being managed Out-of-Band, go to Step C.
  • If the array is being managed In-Band, go to Step D.
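
The routing in Steps A and B can be summarized in a small function. This is only a sketch of the decision described above; the function name and the string labels ("alert", "registration", "oob", "ib") are hypothetical and chosen for this example.

```python
def first_validation_step(issue, management=None):
    """Pick the first validation step from Steps A and B.

    issue: "alert" (Alert/Alarm) or "registration" (cannot register/add).
    management: "oob" or "ib"; only consulted when issue is "alert".
    """
    if issue == "registration":
        return "C"  # Step A sends registration problems straight to Step C
    if issue != "alert":
        raise ValueError("issue must be 'alert' or 'registration'")
    if management == "oob":
        return "C"  # Step B: array is managed Out-of-Band
    if management == "ib":
        return "D"  # Step B: array is managed In-Band
    raise ValueError("management must be 'oob' or 'ib' for an alert")
```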

C. Validate that you can communicate with each array controller out of band

Reference: <Document: 1008327.1> Validating Sun StorageTek[TM] 2500, 6000, and Flexline Array Controller Out of Band Communication

  • If the controllers can be communicated with properly, but the array still cannot be registered, continue to Step F.
  • If the controllers can be communicated with properly, but the array still shows up as unresponsive, go to Step E.
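
The referenced document describes the actual validation procedure. As a complementary quick check, the sketch below tests whether a TCP connection can be opened to each controller's management address. The function names are hypothetical, and the management port is deliberately left as a required parameter because this document does not name it.

```python
import socket


def tcp_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


def check_controllers(addresses, port, probe=tcp_reachable):
    """Map each controller address to True/False reachability.

    addresses: management IP addresses of the array controllers.
    port: management TCP port (not given in this document; supply your own).
    probe: callable(host, port) -> bool, replaceable for testing.
    """
    return {addr: probe(addr, port) for addr in addresses}
```

A result in which only one controller is reachable suggests checking the cabling and IP configuration of the other controller before moving on.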

D. Validate that you can communicate with each array controller in band

Reference: <Document: 1021058.1> Validating Sun StorageTek[TM] 2500, 6000, and Flexline Array Controller In Band Proxy Agent Communication

  • If the array and the in-band agent can be communicated with properly, but the array still cannot be registered, go to Step F.
  • If the array and the in-band agent can be communicated with properly, but the array still shows up as unresponsive, go to Step E.

E. Validate array status after restarting CAM or SANtricity services

CAM

Solaris 10 : svcadm restart svc:/system/fmservice:default
Solaris 8,9: /opt/SUNWsefms/sbin/fmservice.sh restart
Linux : /opt/sun/cam/private/fms/sbin/fmservice.sh restart
Windows : Use control panel to restart Sun_STK_FMS

Then check status:

Solaris 10 : svcs svc:/system/fmservice:default
Solaris 8,9: /opt/SUNWsefms/sbin/fmservice.sh status
Linux : /opt/sun/cam/private/fms/sbin/fmservice.sh status
Windows : Use control panel to check status of Sun_STK_FMS

Status should be online.
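
For scripted checks, the commands above can be kept in a lookup table. The sketch below copies the command lines from this step; the platform labels, function names, and the parsing of svcs output (assumed to be the usual STATE, STIME, FMRI columns with a header line) are assumptions made for this example. Windows is omitted because the control panel is used there.

```python
# Restart and status commands for the CAM fault management service,
# copied from Step E. The dictionary keys are labels invented for this sketch.
FMS_COMMANDS = {
    "solaris10": {
        "restart": "svcadm restart svc:/system/fmservice:default",
        "status": "svcs svc:/system/fmservice:default",
    },
    "solaris8_9": {
        "restart": "/opt/SUNWsefms/sbin/fmservice.sh restart",
        "status": "/opt/SUNWsefms/sbin/fmservice.sh status",
    },
    "linux": {
        "restart": "/opt/sun/cam/private/fms/sbin/fmservice.sh restart",
        "status": "/opt/sun/cam/private/fms/sbin/fmservice.sh status",
    },
}


def fms_command(platform, action):
    """Return the command line for 'restart' or 'status' on a platform."""
    try:
        return FMS_COMMANDS[platform][action]
    except KeyError:
        raise ValueError("no command for %r / %r" % (platform, action))


def smf_state(svcs_output):
    """Return the STATE column of the first service line in svcs output.

    Assumes a header line beginning with STATE; returns None if no
    service line is found.
    """
    for line in svcs_output.splitlines():
        tokens = line.split()
        if tokens and tokens[0] != "STATE":
            return tokens[0]
    return None
```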

SANtricity

Close the Enterprise Management window to exit the application,
then launch it again. This restarts the SANtricity client.

  • If the array is still unresponsive, or shows an alert/alarm, go to Step F.
  • If the array alert/alarm is gone, you have corrected the problem and no further action is required.

F. Re-register the array

If possible, remove the array from CAM or SANtricity and register it again. Attempt the registration with each controller's IP address in turn.

  • If you can register the array, continue to Step G.
  • If you cannot register the array, using either IP address, continue to Step H.
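
The retry logic of this step can be sketched as follows. The register argument is a hypothetical callable standing in for whatever registration mechanism is in use (GUI steps performed by hand, or a CLI call); it is not a CAM or SANtricity API.

```python
def register_with_fallback(controller_ips, register):
    """Try each controller IP in turn; return (next_step, ip_used).

    register: hypothetical callable(ip) -> bool, True on successful
    registration of the array through that controller address.
    """
    for ip in controller_ips:
        if register(ip):
            return "G", ip   # registered: continue to Step G
    return "H", None         # no address worked: continue to Step H
```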

G. Validate whether issue is intermittent or not

  • If the array alternates between having a communication issue and communicating normally, check whether your CAM version is 6.4.1 or earlier.  Those releases have issues with long-running array jobs and with the scripting client; these are addressed in 6.5 and later.
  • If the array status alternates between having a communication issue and communicating normally, check your network for the following traits:
    1. Whether your management LAN is a private LAN. A management LAN that is not private leaves the network software on the array controllers subject to attack, and network congestion can cause the poll to fail.
    2. Whether any type of port scanning is taking place on the LAN. Port scanning can cause TCP port connections to max out, which results in a failed poll of the array.
  • If you suspect either of the above issues, and your connection is intermittent, try increasing the polling interval.

CAM

  1. Click General Configuration
  2. Click Health Monitoring
  3. Change Monitoring Frequency
  4. Click Save

By default, CAM polls every five (5) minutes and retries twice; after fifteen (15) minutes without communication, it raises an alarm for loss of communication.
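
The default figures are consistent with one failed poll followed by two failed retries, each one polling interval apart. The sketch below works through that arithmetic; treating the time to alarm as proportional to a tuned polling interval is an assumption of this example and is not stated in this document.

```python
def minutes_until_alarm(polling_interval_min=5, retries=2):
    """Estimate minutes of lost communication before an alarm is raised.

    Assumes the alarm is raised after (1 + retries) polling intervals,
    which reproduces the documented default: 5 minutes x 3 = 15 minutes.
    """
    if polling_interval_min <= 0 or retries < 0:
        raise ValueError("interval must be positive and retries non-negative")
    return polling_interval_min * (1 + retries)
```

Under the same assumption, raising the interval to 10 minutes would delay the alarm to about 30 minutes, which is the trade-off to weigh when tuning.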

SANtricity

You cannot tune the polling interval in SANtricity Storage Manager.

If your issue is not intermittent, or if tuning the polling interval has not helped, continue to Step H.
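
To decide whether an issue is intermittent, it can help to look at a sequence of poll outcomes. The sketch below is one possible classification, assuming that "intermittent" means communication recovered at least once after a failure; the function name and the labels it returns are hypothetical.

```python
def classify_history(poll_results):
    """Classify a chronological list of poll outcomes (True = poll succeeded).

    Returns "ok" if no poll failed, "intermittent" if a success follows
    a failure, and "persistent" if communication never recovered.
    """
    seen_failure = False
    for succeeded in poll_results:
        if not succeeded:
            seen_failure = True
        elif seen_failure:
            return "intermittent"
    return "persistent" if seen_failure else "ok"
```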

H. Collect Data

At this point, if you have validated that each troubleshooting step above is true for your environment, and the issue still exists, further troubleshooting is required.

  • If possible, collect CAM array support data (not available if the array cannot be communicated with).  Reference: <Document: 1002514.1> Collecting Support Data for Arrays Using Sun StorageTek[TM] Common Array Manager
  • If possible, collect SANtricity support data (not available if the array cannot be communicated with).  Reference: <Document: 1014074.1> Collecting Support Data for Arrays Using Sun StorageTek[TM] SANtricity Storage Manager
  • If using CAM, collect CAM host support data.  Reference: <Document: 1021091.1> Collecting Sun StorageTek[TM] Common Array Manager Host Support Data
  • Provide a network map of the management LAN
  • Provide a network map of the in-band management if applicable
  • Provide Polling Interval
  • Indicate whether array is on a public or private LAN
  • Indicate whether array has a Static or DHCP assigned IP address
  • Indicate which of the above steps were attempted
  • Provide host type of management software
  • Provide Array Model
  • Provide Management Software name and version

Then contact Support.
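
The non-file items in the list above can be tracked with a simple record so that nothing is forgotten before the case is opened. This is a sketch only; the class and field names are invented for this example, and the in-band network map is left out because it applies only when in-band management is used.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class SupportCaseInfo:
    """Items from Step H that are stated rather than collected as files."""
    array_model: Optional[str] = None
    management_software: Optional[str] = None    # name and version
    management_host_type: Optional[str] = None
    polling_interval_min: Optional[int] = None
    lan_type: Optional[str] = None               # "public" or "private"
    ip_assignment: Optional[str] = None          # "static" or "dhcp"
    steps_attempted: Optional[str] = None        # e.g. "A, B, C, F, G"
    management_lan_map: Optional[str] = None     # file name or description


def missing_items(info):
    """Return the names of the fields that have not been filled in yet."""
    return [f.name for f in fields(info) if getattr(info, f.name) is None]
```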




Internal Comments
This document contains normalized content and is managed by the Domain
Lead(s) of the respective domains. To notify content owners of a knowledge gap
contained in this document, and/or prior to updating this document, please
add a comment to the document and it will be processed.


Most customer cases will be resolved by following the path of Steps C or D. The remaining few have either a CAM services issue or an intermittent problem on their network.

Due to CR6830106, running several long-running jobs, or using the sscs
scripting client for multiple operations in succession, can cause the
array to stop communicating for a period of time.  Upgrade to release
6.5 or later and re-evaluate your scripts/jobs.

Ensure that customers are running the latest version of Common Array Manager.

SANtricity, CAM, Common Array Manager, oob, out of band, 6140, 6540, flx240, flx210, flx380, 6540, 2540, 25x0, 2500, 6000, 2530, communication, 6180, 6580, 6780, ib, in band, normalized
Previously Published As
91322

Change History
Date: 2008-01-04
User Name: 7058
Action: Approved
Comment: No further edits required.
OK to publish.
Version: 9
Date: 2008-01-04
User Name: 7058
Action: Accept
Comment:
Version: 0
Date: 2008-01-04
User Name: 88109
Action: Approved
Comment: no technical change. link is correct
Version: 0
Date: 2008-01-03
User Name: 71066
Action: Approved
Comment: Link corrected as requested by the Final Review team.
Nicolas
Version: 0
Date: 2008-01-03
User Name: 31620
Action: Rejected


Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.