Asset ID: 1-75-1005508.1
Update Date: 2011-05-27
Solution Type: Troubleshooting Sure
Solution 1005508.1: Analyzing Internal LSI RAID Disk Failures
Related Items
- Sun Fire X4600 M2 Server
- Sun Fire X4200 M2 Server
- Sun Ultra 20 Workstation
- Sun Fire X4100 M2 Server
- Sun Ultra 40 M2 Workstation
- Sun Fire X4640 Server
- Sun Netra X4200 Server
- Sun Fire X4270 Server
- Sun Netra X4450 Server
- Sun Ultra 27 Workstation
- Sun Fire X4140 Server
- Sun Fire X4100 Server
- Sun Fire X2100 M2 Server
- Sun Ultra 40 Workstation
- Sun Fire X4600 Server
- Sun Fire V20z Server
- Sun Fire X4200 Server
- Sun Fire X4240 Server
- Sun Fire X4150 Server
- Sun Fire X4170 Server
- Sun Ultra 20 M2 Workstation
- Sun Fire X4275 Server
- Sun Fire X4450 Server
- Sun Netra X4200 M2 Server
- Sun Netra X4270 Server
- Sun Fire V40z Server
- Sun Fire X2200 M2 Server
- Sun Ultra 24 Workstation
- Sun Netra X4250 Server
- Sun Fire X4540 Server
- Sun Fire X4440 Server
- Sun Fire X4250 Server
Related Categories
- GCS>Sun Microsystems>Servers>x64 Servers
Previously Published As 207637
Applies to:
Sun Fire V40z Server - Version: Not Applicable to Not Applicable [Release: N/A to N/A]
Sun Fire X4100 M2 Server - Version: Not Applicable to Not Applicable [Release: N/A to N/A]
Sun Fire X4100 Server - Version: Not Applicable to Not Applicable [Release: N/A to N/A]
Sun Fire X2100 M2 Server - Version: Not Applicable to Not Applicable [Release: N/A to N/A]
Sun Fire X4170 Server - Version: Not Applicable to Not Applicable [Release: N/A to N/A]
All Platforms
Purpose
Purpose/Scope:
This document addresses failures of internal LSI MPT RAID disks under the Solaris, Red Hat, SuSE/Novell and Windows operating systems.
The LSI storage controller may be embedded in the platform or provided as an optional PCI card.
The 6 Gigabit SAS RAID, Sun StorageTek Intel/Adaptec RAID, and embedded Intel and NVIDIA RAID controllers are not discussed in this document.
Symptoms:
- Disk service LED illuminated
- Disk errors in system messages files
- Disk errors on console
- Disk SMART errors during the boot process
Last Review Date
March 1, 2011
Instructions for the Reader
A Troubleshooting Guide is provided to assist in debugging a specific issue. When possible, diagnostic tools are included in the document to assist in troubleshooting.
Troubleshooting Details
Steps to Follow:
Please validate that each troubleshooting step below is true for your environment. Each step provides instructions, or a link to a document, for validating the step and taking corrective action as necessary. The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution. Please do not skip a step.
Step 1.
Verify a supported platform disk and part number:
The following
link references a support document that assists in the identification
of a disk part number. In addition, the document provides the public
web location of the Oracle SunSolve Handbook to confirm the disk in
question is a supported disk for your platform:
1010055.1 Identifying Sun Supported Platform Disks
Disks that are not listed in a platform's documentation are deemed unsupported. They have not been tested, so their properties are unknown and they may produce unknown errors.
Even if an unsupported disk appears to work correctly, it is recommended that you always use supported disks for contracted platforms.
Step 2. Verify whether the disk is a member of a RAID array:
The following links reference support documents that assist in identifying whether your Solaris, Linux or Windows operating environment is installed on a RAID array. The Windows instructions are given inline:
Solaris:
1017961.1 How to Identify if a Solaris[TM] Operating
Environment is Installed on a Hardware RAID Controller
Linux:
1013003.1 How to Identify if a Linux Operating
Environment is Installed on a Hardware RAID Controller
Windows:
Click on the
following:
Right Click on "My Computer" and select "Properties".
Select the "Hardware" tab from the window that appears.
Click on "Device Manager".
Click on "Disk Drives". Installed disk(s) are listed.
If the drive(s) listed are displayed with the disk name LSI, then your platform's drives are under the control of an LSI RAID device.
If, however, the drive(s) listed display the name(s) Fujitsu, Hitachi or Seagate, then your platform is not under the control of a RAID device and is therefore JBOD only (Just a Bunch Of Disks).
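The Device Manager check above can be sketched as a small script. This is a minimal illustration only: the function name, the constants, and the sample disk names in the usage note are hypothetical, and the vendor names are simply those listed in this document, not a complete list.

```python
# Hypothetical helper: classifies a disk name as shown in Device Manager.
# The marker and vendor lists are taken from the guidance in this document
# and are assumptions, not an exhaustive or official list.
RAID_MARKERS = ("LSI",)
JBOD_VENDORS = ("FUJITSU", "HITACHI", "SEAGATE")


def classify_disk(name):
    """Return 'raid', 'jbod', or 'unknown' for one disk name."""
    # Compare whole words, case-insensitively, to avoid partial matches.
    tokens = name.upper().split()
    if any(marker in tokens for marker in RAID_MARKERS):
        return "raid"
    if any(vendor in tokens for vendor in JBOD_VENDORS):
        return "jbod"
    return "unknown"
```

For example, a name beginning with "LSI" would be classified as "raid", while a name beginning with "SEAGATE" would be classified as "jbod". Any other name returns "unknown" and should be checked manually.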
Troubleshooting steps differ for platforms that are installed under the control of a RAID management device. This is because disks under RAID control are hidden from the operating environment and are referenced as a pseudo or meta-device.
Step 3. Verify
RAID status:
If the fault is persistent, we must now identify which disk has failed.
The following link references a support document that assists in identifying the current RAID status of a configured array.
This document details checking RAID status in BIOS and within the Solaris operating environment:
1013107.1 How to Identify BIOS and Solaris Hardware RAID Status
Linux and Windows
platforms will need to check RAID status from within BIOS using the
above document, unless they install additional monitoring software
from the distributed Tools and Drivers CD-ROM.
The package
name is MSM-IR (also known as MegaRAID Storage Manager).
If this package is unavailable to the user via CD-ROM, it can be downloaded from Oracle Support. Click on the appropriate platform, then select "Software Downloads" from the Downloads link on the right menu bar.
Step 4. Verify the disk is online, has not been going offline, and has no physical disk hardware problem:
If the disk fault
is not persistent and the RAID controller does not report a
"Degraded" RAID, we must now attempt to identify
intermittent failures.
The following links reference support
documents that assist in identifying the online/offline status of
directly attached platform disks. These documents also discuss the
location of your operating system error logs and the format in which
disk errors should appear:
Solaris:
1005530.1 How to Check for Solaris[TM] x64 Disk Errors
and Online/Offline Status
Linux:
1002936.1 How to Check for Linux Platform Disk Errors and
Online/Offline Status
Windows:
1011590.1 How to check for Windows platform disk errors
and online/offline status
Disks that are not directly attached to the platform (for example, those installed in an external storage array) are not discussed in this document.
Storage array disks may have different properties when connected behind an external controller, which changes the error syntax and the tools used for collection and configuration.
Step 5. Verify
disk firmware revision and known applicable issues:
The following link references a support document that assists in identifying the disk model number and firmware revision, so that you can check for known issues and, if applicable, patch updates:
1008396.1 How to Identify Optical and Hard Disk Firmware Revisions
for Checking of Known Issues
Patches and firmware updates are often available for disks under multiple operating systems.
Checking for known issues and applying updates can reduce downtime.
Step 6. Run
information gathering programs and raise an Oracle service request:
The following
links reference support documents that assist in the gathering of
information from your Solaris, Red Hat, Novell/SuSE and Windows
platforms using their own information gathering tools.
Solaris:
1018748.1 How to Run Sun[TM] Explorer and Forward the
Data to a Sun Engineer
Novell/SuSE Enterprise Linux:
1010057.1 How to gather information on SuSE Linux Enterprise Systems
Red Hat Enterprise Linux:
1010058.1 How to Gather Information on Red Hat Enterprise Linux Systems
Windows msinfo32:
Click on Start and select Run.
Type "msinfo32" in the text box that appears.
Select the File menu and then select Export.
Provide a file name and send this file to Sun.
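The export can also be scripted rather than done through the menus. The sketch below is an assumption-laden illustration, not part of the original procedure: it relies on the `/nfo` and `/report` command-line switches of msinfo32, and the function name and output path are hypothetical.

```python
import subprocess
import sys


def build_msinfo32_command(output_path, as_text=False):
    """Build the msinfo32 command line for a non-interactive export.

    /nfo writes a System Information (.nfo) file; /report writes plain text.
    """
    switch = "/report" if as_text else "/nfo"
    return ["msinfo32", switch, output_path]


if __name__ == "__main__" and sys.platform == "win32":
    # Hypothetical output location; choose any writable path.
    subprocess.run(build_msinfo32_command(r"C:\Temp\sysinfo.nfo"), check=True)
```

The command is only executed on Windows. On other platforms the script does nothing, so the helper can still be imported to inspect the command it would run.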
Gathering this information is necessary if the resolution steps above did not resolve your issue and Sun needs to be engaged to continue diagnosis for you. Information gathering programs collect operating system parameters and configuration information from your platform.
At this point, if
you have validated that each troubleshooting step above is true for
your environment, and the issue still exists, further troubleshooting
is required. For additional support contact Oracle Support.
Previously Published As 91626