Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-75-1117584.1
Update Date: 2010-06-19
Keywords:

Solution Type  Troubleshooting Sure

Solution  1117584.1 :   Troubleshooting Sun Storage[TM] 6580/6780 Cache Memory DIMM Faults  


Related Items
  • Sun Storage 6780 Array
  • Sun Storage 6580 Array
  • Sun Storage Common Array Manager (CAM)
Related Categories
  • GCS>Sun Microsystems>Storage Software>Modular Disk Device Software

In this Document
  Purpose
  Last Review Date
  Instructions for the Reader
  Troubleshooting Details


Applies to:

Sun Storage 6580 Array - Version: Not Applicable and later   [Release: NA and later]
Sun Storage 6780 Array - Version: Not Applicable and later   [Release: NA and later]
Sun Storage Common Array Manager (CAM) - Version: 6.2 to 6.5   [Release: 6.2 to 6.5]
Information in this document applies to any platform.

Purpose

This document provides a basic overview of how to troubleshoot faults in the data cache memory DIMMs of the array RAID controller.

Please validate each troubleshooting step below against your environment. Where applicable, a step links to a document with instructions for validating the step and taking corrective action as necessary. The steps are ordered to isolate the issue and identify the proper resolution, so please do not skip a step.

Symptoms:

  • Controller Offline Critical Fault.
  • Messages for Cache Memory DIMM Missing MEL Event 0x1901.
  • Seven Segment Display of 0E+L2+dash+CF+C#+blank (repeats).
  • Seven Segment Display of SE+dF+dash+CF+C#+blank (repeats).
  • Cache DIMM State and Status of Unknown or Missing.

A cache memory DIMM fault can present in any of three ways: a controller in a lock-down or offline state, as indicated by the Controller OFFLINE critical fault; a repeating Seven Segment Display pattern on the controller accompanying the Controller OFFLINE fault; or DIMM Missing events in the array logs.

Last Review Date

June 4, 2010

Instructions for the Reader

A Troubleshooting Guide is provided to assist in debugging a specific issue. When possible, diagnostic tools are included in the document to assist in troubleshooting.

Troubleshooting Details

1) Verify the existence of a Controller Offline critical fault.

Reference <Document: 1021057.1> Verify Sun StorageTek[TM] 2500 and Sun Storage[TM] 6000 Critical Faults via the User Interface.

  • If there is no Controller OFFLINE critical fault, it is unlikely that a data cache memory DIMM has completely failed, but the array logs should still be checked for cache memory messages. Continue to Step 3.
  • If there is a Critical Fault for Controller OFFLINE, continue to Step 2.

2) Verify the Seven Segment display on the array controller module.

The best indicator of a cache memory failure is the Seven Segment Display on the array controller module. This display also serves as the Tray ID indicator for the module and is located at the rear (cable side) of the tray. For more information on the 7-segment display, reference <Document: 1021110.1> Sun Storage[TM] 6180, 6580, and 6780 Array Controller 7-Segment Display.

  • If the display shows either of the following repeating patterns:

0E+L2+dash+CF+C#+blank

OR

SE+dF+dash+CF+C#+blank

then go to Step 5.


Note 1:  All repeating patterns end with a blank display.

Note 2:  The hash (#) symbol stands for a number from 1 through 8 that identifies the faulty DIMM slot.

  • If the display shows a repeating pattern other than those shown above, the problem is not a cache memory problem but some other fault with the controller.  Please reference <Document: 1021113.1> Troubleshooting Sun StorageTek[TM], Sun StorEdge[TM], and Sun Storage[TM] RAID Controller Failures.
  • If the display shows the tray ID (defaults to 99) of the module, OR you are unable to confirm the ID but the controller is OFFLINE, continue to Step 3.
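
The pattern check in Step 2 can be sketched in a few lines of Python. This is only an illustration, not a Sun or Oracle tool: the function name `faulty_dimm_slot` is hypothetical, and it assumes the operator transcribes the display frames using this document's own notation (frames joined by `+`, with `dash` and `blank` spelled out).

```python
import re
from typing import Optional

# The two repeating sequences from Step 2, with the slot digit (# = 1-8)
# captured. The token spelling is an assumption taken from this document.
_CACHE_FAULT_PATTERNS = [
    re.compile(r"^0E\+L2\+dash\+CF\+C([1-8])\+blank$"),
    re.compile(r"^SE\+dF\+dash\+CF\+C([1-8])\+blank$"),
]


def faulty_dimm_slot(sequence: str) -> Optional[int]:
    """Return the DIMM slot (1-8) for a cache DIMM fault pattern, else None."""
    normalized = sequence.replace(" ", "")
    # A seven-segment display renders the letter O and the digit 0 the same
    # way, so accept either spelling of the first frame.
    if normalized.startswith("OE+"):
        normalized = "0" + normalized[1:]
    for pattern in _CACHE_FAULT_PATTERNS:
        match = pattern.match(normalized)
        if match:
            return int(match.group(1))
    return None
```

A return value of `None` means the sequence is not one of the two cache DIMM patterns, which corresponds to the remaining two bullets above (another controller problem, or a plain tray ID).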

3) Verify existence of Cache Memory DIMM Missing messages in array event logs.

The cache DIMMs can report a handful of different messages, but a DIMM will transition to MISSING on a fault.  This step validates that transition using the array logs.

Sun StorageTek Common Array Manager:

Browser:

  1. Expand Storage Arrays in the left menu pane.
  2. Expand your storage array name in the left menu pane.
  3. Expand Troubleshooting in the left menu pane.
  4. Click on Events.
  5. In the right pane, click on the -|-> icon.  Hovering the mouse over it displays Advanced Filter.
  6. Set Event to Log Events.
  7. Set Event Type to Component.
  8. Set Read the last X Kbytes From Log File to 100.
  9. Set String Filter to DIMM.
  10. Click on the Details of any alarm that is shown.
  11. Review the Description Field.
  12. Get the value of the array log event ID from the description.

Example:

Description : Apr 08 21:31:31 6780-array Tray.99.Controller.A.DIMM01: [ID 0x1901] NOTICE: Cache memory dimm is missing

Note:  The filter in Step 9 is case sensitive.


SSCS CLI:

Get the list of events:

sscs list -d <array_name> -t LogEvent -f DIMM event

Get the event details:

sscs list -d <array_name> event <event_id>

Note:  The -f option is case sensitive.

Get the value of the array log event ID from the description:

Example:

Description : Apr 08 21:31:31 6780-array Tray.99.Controller.A.DIMM01: [ID 0x1901] NOTICE: Cache memory dimm is missing
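
When many events need to be reviewed, the Description field can be picked apart programmatically. The following Python sketch assumes every description follows the layout of the example above (`Tray.<n>.Controller.<letter>.DIMM<nn>: [ID 0x....] SEVERITY: message`); the `DimmEvent` type and `parse_description` function are hypothetical names introduced for illustration, not part of CAM.

```python
import re
from typing import NamedTuple, Optional


class DimmEvent(NamedTuple):
    tray: int
    controller: str
    dimm: int
    event_id: int
    message: str


# Layout assumed from the single example line in this document.
_DESCRIPTION_RE = re.compile(
    r"Tray\.(?P<tray>\d+)\.Controller\.(?P<controller>[A-Z])\.DIMM(?P<dimm>\d+):"
    r"\s*\[ID (?P<event_id>0x[0-9A-Fa-f]+)\]\s*\w+:\s*(?P<message>.*)$"
)


def parse_description(description: str) -> Optional[DimmEvent]:
    """Extract tray, controller, DIMM slot, and event ID from a description."""
    match = _DESCRIPTION_RE.search(description)
    if match is None:
        return None  # not a DIMM event in the assumed layout
    return DimmEvent(
        tray=int(match.group("tray")),
        controller=match.group("controller"),
        dimm=int(match.group("dimm")),
        event_id=int(match.group("event_id"), 16),
        message=match.group("message").strip(),
    )
```

For the example description, this yields tray 99, controller A, DIMM slot 1, and an event ID equal to 0x1901, which is the value checked at the end of this step.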


Sun StorageTek SANtricity Storage Manager:

GUI:

  1. Launch SANtricity.
  2. Double Click on your array name to open the Array Management Window.
  3. Click on the Advanced Menu.
  4. Click on the Troubleshooting Sub-Menu.
  5. Click on View Event Log.
  6. Un-Check View Only Critical Events.
  7. Click on the Component Type field header to sort the events.
  8. Look for Cache DIMM in the list of events.
  9. For any Cache DIMM event, highlight it, and check the View Details box.
  10. Get the value of the Event Type field for each DIMM event.

SMcli:

Get the list of events by saving off the event log:

SMcli -n <array_name> -c "save storageArray allEvents file=\"some/file/path/log.txt\";"

Open the saved file in a text viewer to look at the individual events.
Get the value of the Event Type field for each DIMM event.

Example Event

Date/Time: 6/8/10 21:52:00 ET
Sequence Number: 12345
Event Type: 1901
Description:  Cache memory dimm is missing
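
Instead of reading the saved file by eye, the events can be filtered with a short script. The Python sketch below assumes each event in the saved log uses the `Field: value` layout of the example above and that every event begins with a `Date/Time:` line; real log files may carry additional fields or multi-line values that this simple parser does not handle. The function names are hypothetical.

```python
from typing import Dict, Iterator, List


def iter_events(log_text: str) -> Iterator[Dict[str, str]]:
    """Yield one dict per event; a new event starts at each 'Date/Time:' line."""
    current: Dict[str, str] = {}
    for line in log_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or text without a "Field: value" shape
        key = key.strip()
        if key == "Date/Time" and current:
            yield current
            current = {}
        current[key] = value.strip()
    if current:
        yield current


def dimm_missing_events(log_text: str) -> List[Dict[str, str]]:
    """Return only the events whose Event Type is 1901 (cache DIMM missing)."""
    return [e for e in iter_events(log_text) if e.get("Event Type") == "1901"]
```

A typical use would be to read the file written by the SMcli command, for example `dimm_missing_events(open("log.txt").read())`, and review the returned events.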

  • If an event with Event ID 0x1901 exists, go to Step 5.
  • If there are no 0x1901 events, but your array controller is OFFLINE, reference <Document: 1021113.1> Troubleshooting Sun StorageTek[TM], Sun StorEdge[TM], and Sun Storage[TM] RAID Controller Failures.
  • If there are no 0x1901 events, and your array controller is Online/OK, go to Step 4.

4) Verify the state/status of the DIMM slots on your controller.

Follow the instructions below to look at the state and status of your DIMM slots, based on your user environment:

Sun StorageTek Common Array Manager:

Browser:

  1. Expand Storage Arrays in the left menu pane.
  2. Expand your storage array name in the left menu pane.
  3. Expand Troubleshooting in the left menu pane.
  4. Click on FRUs.
  5. In the right display pane, click on Cache Memory DIMMs.

SSCS CLI:

sscs list -d <array_name> -t dimm fru


Sun StorageTek SANtricity Storage Manager:

GUI:

  1. Launch SANtricity.
  2. Double Click on your array name to open the Array Management Window.
  3. In the left pane, click on the icon of the controller whose DIMMs you want to view.
  4. In the right pane, the DIMM status for each slot is listed after the base controller information.

SMcli:

SMcli -n <array_name> -c "show storageArray profile;"

Note 1:  The DIMM information will be listed in the CONTROLLERS section of the resultant SMcli output.
Note 2:  The DIMM information will be Unknown or Unavailable if the controller it resides on is currently OFFLINE.

  • If your DIMMs are Enabled/OK, you have validated your cache memory and no further work is required.
  • If your DIMMs are NOT Enabled/OK, go to Step 6.
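
The branching in Steps 1 through 4 can be summarized as a small decision function. This Python sketch only restates the flow described in this document; the `Display` enum, the `next_action` function, and the returned strings are hypothetical names chosen for illustration.

```python
from enum import Enum
from typing import Optional


class Display(Enum):
    CACHE_FAULT_PATTERN = "0E+L2 or SE+dF pattern with CF+C#"
    OTHER_PATTERN = "some other repeating pattern"
    TRAY_ID_OR_UNKNOWN = "tray ID shown, or display not confirmed"


STEP_5 = "Step 5: open a service call to replace the indicated DIMM"
STEP_6 = "Step 6: open a service call for further research"
CONTROLLER_DOC = "Reference Document 1021113.1 (RAID controller failures)"
VALIDATED = "Cache memory validated; no further work required"


def next_action(controller_offline: bool,
                display: Optional[Display],
                has_0x1901_event: bool,
                dimms_enabled_ok: bool) -> str:
    # Steps 1 and 2: the display is only consulted when the controller
    # has a Controller OFFLINE critical fault.
    if controller_offline:
        if display is Display.CACHE_FAULT_PATTERN:
            return STEP_5
        if display is Display.OTHER_PATTERN:
            return CONTROLLER_DOC
    # Step 3: look for DIMM Missing (0x1901) events in the array logs.
    if has_0x1901_event:
        return STEP_5
    if controller_offline:
        return CONTROLLER_DOC
    # Step 4: controller is Online/OK, so check the DIMM slot state/status.
    return VALIDATED if dimms_enabled_ok else STEP_6
```

For example, an online controller with no 0x1901 events and DIMMs that are not Enabled/OK yields the Step 6 action.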

5) Open a Service Call with Oracle to have the indicated DIMM replaced.

The DIMM slot for the controller is indicated in either the 0x1901 Event ID or by the Seven Segment Display.

Please open a service call with Oracle and provide either:
  • Support Data Collection
Reference <Document: 1002514.1> Collecting Support Data for Arrays Using Sun StorageTek[TM] Common Array Manager.
Reference <Document: 1014074.1> Collecting Support Data for Arrays Using Sun StorageTek[TM] SANtricity Storage Manager.

OR
  • DIMM Slot location.
  • Array Critical Faults.
  • Array Event Log.
  • Seven Segment Display code observed.

6) Open a Service Call with Oracle for further research.

Please provide a Support Data Collection.

Reference <Document: 1002514.1> Collecting Support Data for Arrays Using Sun StorageTek[TM] Common Array Manager.
Reference <Document: 1014074.1> Collecting Support Data for Arrays Using Sun StorageTek[TM] SANtricity Storage Manager.

  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.