Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition
Solution Type Troubleshooting Sure
Solution 1019953.1 : Troubleshooting Sun StorEdge 6320[TM] Raid Controller Problems
PreviouslyPublishedAs 249667

Description
This document helps identify failed or failing RAID controllers in the array from the symptoms they produce.

Symptoms:
Steps to Follow

Please validate that each troubleshooting step below is true for your environment. Each step provides instructions, or a link to a document, for validating the step and taking corrective action as necessary. The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution. Please do not skip a step.

1. Validate the symptom from the symptom list above.
1a. If you received an email from the array, or SSRR opened a Service Request, go to Step 2.
1b. If you see a Device Alert, per the symptom set above, go to Step 5.
1c. If you noticed a fault LED on your RAID controller, or the global fault LED is lit, go to Step 2.
1d. If none of the above applies, skip to Step 9.

2. Validate the ability to log into your 6320.
Go to https://<array_IP>:9443 and log in. See chapter 2 of the Sun StorEdge 6320 System 1.2 Reference and Service Manual for details on logging into the array.
If you have trouble logging into the service processor on the 6320, refer to Solution 249670: Troubleshooting Sun StorEdge 6320[TM] Loss Of management Access Faults.
If you can log in, continue to Step 3.

3. Validate the "Overall Health Status" box in the Configuration Service page.
3a. If the status shows "Error", go to Step 4.
3b. If the status shows "Ok", go to Step 5.

4. Validate the existence of a StorADE alarm against the RAID controller by logging into the Storage Automated Diagnostic Environment (StorADE). Use the following URL: https://system_ip_address:7443
From the "Home" window that comes up, check the "Device Health Summary" to see if there are alerts listed. To verify that an alarm is related to a RAID controller issue, select the "Alerts" link in the "Device Health Summary" and search for RAID controller alarms. Selecting the alarm link displays details about the alarm.
4a. If there is an alarm, go to Step 5.
4b. If there is no alarm, go to Step 9.
4c. If there is no alarm AND an LED is lit on the RAID controller or the Global Fault Indicator, go to Step 5.

5. Validate the RAID controller status from a Detailed FRU Report.
a) Log in to StorADE (see Step 4 for details) and select the "Reports" tab.
b) In the "Reports" window, select "General Reports" and then "Fru Reports".
c) In the "Fru Reports" window, under "Select report to display or Email", select the "Display" link in the "Detailed Fru Report" row.
5a. If the status is ready-disabled, go to Step 9.
5b. If the status is ready-enabled, go to Step 6.

6. Validate LED and/or alarm existence against a controller in the ready-enabled state.
6a. If there is an alarm AND a fault LED lit for the RAID controller, go to Step 9.
6b. If there is an alarm and no fault LED lit for the RAID controller, go to Step 8.
6c. If there is an LED lit and NO alarm for the RAID controller, go to Step 9.
6d. If there are no alarms or LEDs lit, go to Step 9.

8. Validate the RAID controller firmware version against minimums.
a) Log into Configuration Services.
b) Click the Administration tab.
c) Click the General link.
d) The RAID controller firmware version is shown in the lower portion of this screen for all configured arrays.
8a. If the RAID controller firmware is below the version listed for T3+/6120 arrays in Solution 200077: Minimum supported releases for the Sun StorageTek T3+, 6120, 6320 and 6920, clear the alarm for the controller in StorADE and contact Sun Support to schedule an array upgrade for your unit.
8b. If the RAID controller firmware is at or above the version listed in Solution 200077: Minimum supported releases for the Sun StorageTek T3+, 6120, 6320 and 6920, go to Step 9.

9. At this point, if you have validated that each troubleshooting step above is true for your environment and the issue still exists, further troubleshooting is required. Please open a Service Request with Sun Microsystems.
Please include:
* StorADE alarm text, if available
* Statement of the symptoms you see that pertain to the RAID controller
* Solution Extract. Refer to Solution 230665: Sun StorEdge[TM] 6320:How To:How to collect 6320 extractor output using StorADE command line or GUI
* Status of the RAID controller as shown in the Detailed FRU Report in Step 5
* Email text received from the 6320 storage system

Product
Sun StorageTek 6320 System
Sun StorageTek 6120/6320 Controller Firmware 3.2

Internal Comments
This is a continuation of the steps above, covering troubleshooting that is not possible through the customer user interface. If you have not performed Steps 1-9 above, please do so first with the customer, then continue to Step 10 below.

10. Validate message entries for a ready-disabled controller.
Review the <extractor>/Arrays/<array>/filesystem/syslog file, or the <extractor>/Sp/messages/var_adm/messages.array file for the array in question, and check whether there were any disk media errors prior to the controller fault. Reference <Document: 1019954.1> : Troubleshooting Sun StorEdge 6320[TM] Disk Faults
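The log review in Step 10 can be partly scripted. The following is a minimal sketch, not a supported tool: it assumes the log is in chronological order, and both regular expressions are placeholders that must be replaced with the actual message text found in your array's syslog. The function name is hypothetical.

```python
import re
from typing import Iterable, List, Pattern

# Both patterns are placeholders (assumptions): replace them with the exact
# message text seen in your array's syslog before relying on the result.
MEDIA_ERROR = re.compile(r"media error", re.IGNORECASE)
CONTROLLER_FAULT = re.compile(r"controller.*disabled", re.IGNORECASE)


def media_errors_before_fault(
    lines: Iterable[str],
    media_pattern: Pattern[str] = MEDIA_ERROR,
    fault_pattern: Pattern[str] = CONTROLLER_FAULT,
) -> List[str]:
    """Return media-error lines logged before the first controller-fault line.

    Assumes the input is in chronological order. Returns an empty list when
    no controller-fault line is found, because "prior to the controller
    fault" is undefined in that case.
    """
    seen: List[str] = []
    for line in lines:
        if fault_pattern.search(line):
            # First controller fault reached: report what preceded it.
            return seen
        if media_pattern.search(line):
            seen.append(line.rstrip("\n"))
    return []
```

An open file object for the extracted syslog can be passed directly as `lines`. A non-empty result suggests following the disk fault document referenced in Step 10 before treating the controller as the failed part.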
Arrays may go down unexpectedly and lose Host Connectivity after 994 days of Continuous Operation.
If the array does not have the symptoms of 237605, continue to Step 12.
If the array does have the symptoms of 237605, follow the resolution path provided by the Sun Alert.

12. Validate that the controller can boot successfully.
a) Enable the controller.
b) Check the status of the controller in the fru stat output. It should go to "ready enabled".
c) Wait 5 minutes to ensure that the array controller is stable and stays in an optimal state.
If the controller fails to reach a "ready enabled" state, replace the controller.
If the controller fails to reach a "ready enabled" state and the controller has already been replaced, continue to Step 13.
If the controller reaches "ready enabled" but does not stay that way for at least 5 minutes, continue to Step 13.

13. At this point, if you have validated that each troubleshooting step above is true for your environment and the issue still exists, further troubleshooting is required. Please escalate to your next level of support with the following information:
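Returning to Step 12: the check there (reach "ready enabled", then hold that state for 5 minutes) could be automated along the following lines. This is a minimal sketch under stated assumptions: `get_status` is a hypothetical callable supplied by the caller that returns the controller's current status string, and the boot timeout and polling interval are illustrative values that do not come from this document.

```python
import time
from typing import Callable

STABLE_SECONDS = 300.0  # Step 12c: remain "ready enabled" for 5 minutes


def _is_ready(status: str) -> bool:
    # Accept both spellings used in this document:
    # "ready enabled" and "ready-enabled".
    return status.strip().lower().replace("-", " ") == "ready enabled"


def check_controller_stability(
    get_status: Callable[[], str],         # hypothetical status source
    boot_timeout_seconds: float = 600.0,   # assumption, not from this document
    poll_seconds: float = 10.0,            # assumption, not from this document
    stable_seconds: float = STABLE_SECONDS,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return "never-ready", "unstable", or "stable"."""
    # Phase 1: wait for the controller to reach "ready enabled".
    deadline = clock() + boot_timeout_seconds
    while not _is_ready(get_status()):
        if clock() >= deadline:
            return "never-ready"
        sleep(poll_seconds)

    # Phase 2: the state must hold for the whole stability window.
    stable_until = clock() + stable_seconds
    while clock() < stable_until:
        sleep(poll_seconds)
        if not _is_ready(get_status()):
            return "unstable"
    return "stable"
```

In terms of Step 12, "never-ready" corresponds to replacing the controller (or continuing to Step 13 if it has already been replaced), and "unstable" corresponds to continuing to Step 13. The `clock` and `sleep` parameters exist only so the logic can be exercised without waiting in real time.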
(s) of the respective domains.
To notify content owners of a knowledge gap contained in this document, and/or prior to updating this document, please contact the domain engineers that are managing this document via the “Document Feedback” alias(es) listed below:
[email protected]
The Knowledge Work Queue for this article is KNO-STO-MIDRANGE_DISK

6320,controller,reset,failed,RAID,read-disabled,normalized

Attachments
This solution has no attachment