Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition
Solution Type Technical Instruction Sure
Solution 1004799.1 : Decoding Sense Codes for A1000/A3x000 Products
PreviouslyPublishedAs 206662

Description

Users may reference the sense code listing below. This is a copy of the raidcode.txt file that is included with Raid Manager 6.22.

RAID ERROR CODE DESCRIPTIONS

SENSE KEYS

The possible Sense Keys returned by the RAID controller in the sense data, on receipt of a Request Sense command, are shown below. The Sense Key is returned in byte 2 (zero-referenced) of the Request Sense data. The Sense Key may be thought of as a summary code for the error. More detailed information about the error is provided by the FRU and ASC/ASCQ codes described in the next sections.

(0x00)-No Sense
The controller has no errors to report at this time.

(0x01)-Recovered Error
The controller detected the error, but was able to recover from it.

(0x02)-Not Ready
The controller is in the process of finishing initialization, and will not allow hosts access to user data until it is ready.

(0x03)-Media Error
A drive attached to the controller detected a media error on itself.

(0x04)-Hardware Error
This Sense Key is typically returned by the controller on most unrecoverable errors.

(0x05)-Illegal Request
A command was issued to the controller that is not allowed (for example, access to a non-existent logical unit).

(0x06)-Unit Attention
The controller is informing the host of an action it took to remedy an exception condition (for example, the controller marked a drive Failed, because the drive could no longer be accessed).

(0x0B)-Aborted Command
The controller could not finish the requested operation. However, in the typical scenario, it will have taken some action to ensure that the error condition would not occur again. Therefore, the next time this same command is received, the same error condition should not occur.

(0x0E)-Miscompare
A failed Verify operation, or a Verify with Parity Check operation failure, will return a Sense Key of Miscompare.
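
The Sense Key list above can be expressed as a small lookup. The following is an illustrative Python sketch and is not part of raidcode.txt. The function name is hypothetical, and masking byte 2 with 0x0F assumes the SCSI-2 fixed-format sense layout, in which the upper bits of byte 2 carry other flags.

```python
# Sense Key names exactly as listed in the SENSE KEYS section.
SENSE_KEYS = {
    0x00: "No Sense",
    0x01: "Recovered Error",
    0x02: "Not Ready",
    0x03: "Media Error",
    0x04: "Hardware Error",
    0x05: "Illegal Request",
    0x06: "Unit Attention",
    0x0B: "Aborted Command",
    0x0E: "Miscompare",
}


def sense_key_name(sense_data: bytes) -> str:
    """Return the Sense Key name for a Request Sense buffer (hypothetical helper)."""
    if len(sense_data) < 3:
        raise ValueError("sense data too short to contain byte 2")
    # Assumption: the key occupies the low nibble of byte 2 (SCSI-2 fixed format).
    key = sense_data[2] & 0x0F
    return SENSE_KEYS.get(key, f"Unlisted Sense Key 0x{key:02X}")
```

A key that does not appear in the list is reported as unlisted rather than guessed at.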
FIELD REPLACEABLE UNITS (FRU) CODE DEFINITIONS

Each time an error is detected, the controller will put the Field Replaceable Unit (FRU) code of the failed component in the sense data (byte 14 (zero-referenced) in the sense data for the first error and bytes 26-33 (zero-referenced) for additional errors). To provide meaningful information for troubleshooting, the FRU codes have been grouped. The defined FRU groups are listed below.

FRU Code     Description
0x01         Host Channel Group
0x02         Controller Drive Interface Group
0x03         Controller Buffer Group
0x04         Controller ASIC Group
0x05         Controller Other Group
0x06         Subsystem Group
0x07         Not Used
0x08         Sub-enclosure Group
0x09-0x0F    Reserved
0x10-0xFF    Drive Groups

(0x01)-Host Channel Group
This group consists of the host SCSI bus, its SCSI interface chip, and all initiators and other targets connected to the bus.

(0x02)-Controller Drive Interface Group
This group consists of the SCSI interface chips on the controller which connect to the drive buses.

(0x03)-Controller Buffer Group
This group consists of the controller logic used to implement the on-board data buffer.

(0x04)-Controller Array ASIC
This group consists of the ASICs on the controller associated with the RAID functions.

(0x05)-Controller Other Group
This group consists of all controller-related hardware not associated with another group.

(0x06)-Subsystem Group
This group consists of subsystem components that are monitored by the RAID controller, such as power supplies, fans, thermal sensors, and AC power monitors.

(0x08)-Sub-Enclosure Group
This group consists of the devices such as power supplies, environmental monitor, and other subsystem components in the sub-enclosure.

(0x10-0xFF)-Drive Group
This group consists of a drive (embedded controller, drive electronics, and Head Disk Assembly), its power supply, and the SCSI cable that connects it to the controller; or supporting sub-enclosure environmental electronics.
An FRU code denoting a drive contains the channel number (1-relative) in the upper nibble, and the drive's SCSI ID in the lower nibble. For example, a drive on the third channel, SCSI ID 2 would be denoted by an FRU code of 0x32.
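
The FRU grouping and the nibble encoding for drives can be sketched as follows. This is illustrative Python and not part of raidcode.txt. The names FRU_GROUPS and describe_fru are hypothetical. The sketch treats every code in 0x10-0xFF as a channel/SCSI ID pair, although the text notes that this range may also denote supporting sub-enclosure environmental electronics.

```python
# FRU group codes as listed in the FRU Code table.
FRU_GROUPS = {
    0x01: "Host Channel Group",
    0x02: "Controller Drive Interface Group",
    0x03: "Controller Buffer Group",
    0x04: "Controller ASIC Group",
    0x05: "Controller Other Group",
    0x06: "Subsystem Group",
    0x07: "Not Used",
    0x08: "Sub-enclosure Group",
}


def describe_fru(code: int) -> str:
    """Describe a one-byte FRU code (hypothetical helper)."""
    if not 0x00 <= code <= 0xFF:
        raise ValueError("FRU code must fit in one byte")
    if code >= 0x10:
        channel = code >> 4    # upper nibble: 1-relative channel number
        scsi_id = code & 0x0F  # lower nibble: drive SCSI ID
        return f"Drive Group: channel {channel}, SCSI ID {scsi_id}"
    if code >= 0x09:
        return "Reserved"
    # 0x00 is not in the table, so it falls through to the default.
    return FRU_GROUPS.get(code, "Not listed")
```

With the example from the text, describe_fru(0x32) yields channel 3, SCSI ID 2.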
ADDITIONAL SENSE CODES AND QUALIFIERS

This section lists the Additional Sense Code (ASC) and Additional Sense Code Qualifier (ASCQ) values returned by the RAID controller in the sense data. The ASC and ASCQ provide detailed information about the specific error.

SCSI-2 defined codes are used whenever possible. Array-specific error codes are used when necessary, and are assigned SCSI-2 vendor unique codes 0x80 to 0xFF.

The most probable Sense Keys (listed below for reference) returned for each error are also listed in the table. Sense Keys of 6 in parentheses indicate that 6 (Unit Attention) would be the nominal Sense Key reported; however, the actual value would be that set in the "Sense Key for Vendor-unique Conditions" field in the User-configurable options of the NVSRAM.

ASCs and ASCQs are normally returned in bytes 12 and 13 (zero-referenced) of the sense data. On multiple errors (defined as errors that occurred on the same command, not necessarily as errors that occurred simultaneously), there may be additional ASCs and ASCQs in the ASC/ASCQ stack, which are bytes 22-25 (zero-referenced) of the sense data. In most cases, the first error detected is stored in bytes 12 and 13 of the sense data; subsequent errors are stored in the ASC/ASCQ stack.

The following section lists all possible ASC/ASCQ combinations returned by the controller.
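
The byte offsets given so far (Sense Key in byte 2, ASC and ASCQ in bytes 12 and 13, first FRU code in byte 14, ASC/ASCQ stack in bytes 22-25, additional FRU codes in bytes 26-33) can be gathered into one parsing sketch. This is illustrative Python and not part of raidcode.txt. The class and function names are hypothetical, and the low-nibble mask on byte 2 assumes the SCSI-2 fixed-format layout. The stack and the additional FRU bytes are returned raw, because the text does not state how they are subdivided.

```python
from dataclasses import dataclass


@dataclass
class RaidSense:
    sense_key: int
    asc: int
    ascq: int
    fru: int
    asc_ascq_stack: bytes  # bytes 22-25, left uninterpreted
    additional_fru: bytes  # bytes 26-33, left uninterpreted


def parse_sense(data: bytes) -> RaidSense:
    """Extract the fields described above from a Request Sense buffer."""
    if len(data) < 34:
        raise ValueError("need at least 34 bytes to reach byte 33")
    return RaidSense(
        sense_key=data[2] & 0x0F,  # assumption: low nibble of byte 2
        asc=data[12],
        ascq=data[13],
        fru=data[14],
        asc_ascq_stack=bytes(data[22:26]),
        additional_fru=bytes(data[26:34]),
    )
```

All offsets are zero-referenced, matching the convention used throughout this listing.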
ASC  ASCQ  Sense Key
00   00    0
No Additional Sense Information
The controller has no errors to report for the requesting host and addressed logical unit combination.

ASC  ASCQ  Sense Key
04   01    2
Logical Unit In Process Of Becoming Ready
The controller is executing its initialization functions on the addressed logical unit. This includes drive spin-up and validation of the drive and logical unit configuration information. This error is normally returned on commands following the initial Inquiry command after a power-up/reset.

ASC  ASCQ  Sense Key
04   02    2
Logical Unit Not Ready, Initializing Command Required
The controller is configured to wait for a Start/Stop Unit command before spinning up the drives, but the command has not yet been received.

ASC  ASCQ  Sense Key
04   04    2
Logical Unit Not Ready, Format In Progress
The controller previously received a Format Unit command from an initiator, and is executing that command on this logical unit. Other commands cannot be sent to this logical unit until the Format Unit completes.
ASC  ASCQ  Sense Key
04   81    02
Firmware Versions Incompatible
The versions of firmware on the redundant controllers are incompatible/inconsistent. This is probably because you replaced a failed controller with a new controller that does not have the same version of firmware. Controllers with an incompatible version of firmware may cause unexpected results. Therefore, you must download new firmware as soon as possible. Use the Recovery Guru/Health Check in the Recovery Application to obtain instructions on how to download firmware to make the versions consistent.

ASC  ASCQ  Sense Key
04   A1    2
Quiescence Is In Progress or Has Been Achieved

ASC  ASCQ  Sense Key
0C   00    4,(6)
Unrecoverable Write Error
If this error is reported during normal operation, the controller has detected an error on a write operation to a drive, but was unable to recover from the error. The drive that failed the write operation will be marked Failed.

Drive Marked Offline Due To Internal Recovery Procedure
An error has occurred during interrupted write processing causing the LUN to transition to the Dead state. Drives in the drive group that did not experience the read error will transition to the Offline state (0x0B) and log this error.
ASC  ASCQ  Sense Key
3F   BD    (6)
Drive Has Incorrect Critical Parameters Set
The controller was unable to query the drive for its current critical mode page settings, or was unable to change these to the correct setting. Currently, this indicates the Qerr bit is set incorrectly on the drive specified in the FRU field of the Request Sense data.

ASC  ASCQ  Sense Key
3F   C3    (6)
Channel Failure
The controller failed a channel, and will not access drives on this channel any more. The FRU Group Qualifier (byte 26) in the sense data will indicate the 1-relative channel number of the failed channel. This condition is typically caused by a drive ignoring SCSI protocol on one of the controller's destination channels. The controller typically fails a channel if it issued a reset on a channel, and it continued to see drives ignore the SCSI Bus Reset on this channel.

ASC  ASCQ  Sense Key
3F   C7    (6)
Non-Media Component Failure
(1) A subsystem component other than a drive or controller has failed (for example, fan, power supply, battery) or (2) An over-temperature condition has occurred (some RAID modules contain a temperature sensor). The fans, power supplies, and battery are usually located in the controller module tray. The FRU codes will indicate the faulty component. The user should replace the component indicated.

ASC  ASCQ  Sense Key
3F   C8    (6)
AC Power Fail
The Uninterruptible Power Source (UPS) has indicated that AC power is no longer present and the UPS has switched to standby power. While there is no immediate cause for concern, users should save their work frequently, in case the battery is suddenly depleted.
ASC  ASCQ  Sense Key
3F   C9    (6)
Standby Power Depletion Imminent
The Uninterruptible Power Source (UPS) has indicated that its standby power source is nearing depletion. The host should take actions to stop IO activity to the controller. Normally, the controller will change from a write-back caching mode to a write-through mode. The user should not change again to write-back mode until full AC power has been restored.

ASC  ASCQ  Sense Key
3F   CA    (6)
Standby Power Source Not At Full Capacity
The Uninterruptible Power Source (UPS) has indicated that its standby power source is not at full capacity. To prevent loss of data in the event of the failure of AC power, the user should not activate write-back caching mode until full UPS power has been restored.

ASC  ASCQ  Sense Key
3F   CB    (6)
AC Power Has Been Restored
The Uninterruptible Power Source (UPS) has indicated that AC power is now being used to supply power to the controller.

ASC  ASCQ  Sense Key
3F   D0    (6)
Write-Back Cache Battery Discharged
The controller has detected that its battery is no longer charged. If a power failure were to occur, any dirty user data in cache will be lost. To prevent the loss of any user data, the user should either: (1) replace this controller with another, or (2) turn off write-back cache.

ASC  ASCQ  Sense Key
3F   D1    (6)
Write-Back Cache Battery Charged
The controller has detected that its battery is now fully charged, and will be capable of holding up the cache contents in the event of a power failure. The user may switch to write-back mode, if desired.

ASC  ASCQ  Sense Key
3F   D8    (6)
Battery Reached Expiration
The controller has failed the battery because the battery has reached its expiration date. You should replace the battery as soon as possible.

ASC  ASCQ  Sense Key
3F   D9    (6)
Battery Near Expiration
The controller has detected that the battery is nearing its expiration date. You should replace the battery as soon as possible.
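
The UPS and battery conditions above (3F C8 through 3F D9) each come with a recommendation, several of which concern write-back caching. The following illustrative Python sketch, which is not part of raidcode.txt, collects them into one table. The names PowerEvent, POWER_EVENTS and power_event are hypothetical. The write_back_ok field is True or False only where the text gives explicit caching guidance, and None otherwise.

```python
from typing import NamedTuple, Optional


class PowerEvent(NamedTuple):
    title: str
    write_back_ok: Optional[bool]  # None: the text gives no caching guidance
    action: str


# Keys are (ASC, ASCQ); wording is condensed from the descriptions above.
POWER_EVENTS = {
    (0x3F, 0xC8): PowerEvent("AC Power Fail", None,
                             "Save work frequently; UPS is on standby power."),
    (0x3F, 0xC9): PowerEvent("Standby Power Depletion Imminent", False,
                             "Stop IO activity to the controller."),
    (0x3F, 0xCA): PowerEvent("Standby Power Source Not At Full Capacity", False,
                             "Wait until full UPS power has been restored."),
    (0x3F, 0xCB): PowerEvent("AC Power Has Been Restored", None,
                             "No action stated."),
    (0x3F, 0xD0): PowerEvent("Write-Back Cache Battery Discharged", False,
                             "Replace the controller or turn off write-back cache."),
    (0x3F, 0xD1): PowerEvent("Write-Back Cache Battery Charged", True,
                             "Write-back mode may be enabled if desired."),
    (0x3F, 0xD8): PowerEvent("Battery Reached Expiration", None,
                             "Replace the battery as soon as possible."),
    (0x3F, 0xD9): PowerEvent("Battery Near Expiration", None,
                             "Replace the battery as soon as possible."),
}


def power_event(asc: int, ascq: int) -> Optional[PowerEvent]:
    """Look up a UPS or battery condition; returns None for other codes."""
    return POWER_EVENTS.get((asc, ascq))
```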
ASC  ASCQ  Sense Key
3F   E0    (6)
Logical Unit Failure
The controller has placed the logical unit in a "Dead" state. User data and/or parity can no longer be maintained to ensure availability. The most likely cause is the failure of a single drive in non-redundant configurations or a second drive in a configuration protected by one drive. The data on the logical unit is no longer accessible.

ASC  ASCQ  Sense Key
3F   EB    (6)
LUN Marked Dead Due To Media Error Failure
An error has occurred during interrupted write processing during Start of Day causing the LUN to transition to the Dead state.
ASC  ASCQ  Sense Key
40   NN    4,(6)
Diagnostic Failure On Component NN (0x80 - 0xFF)
The controller has detected the failure of an internal controller component. This failure may have been detected during operation as well as during an on-board diagnostic routine. The values of NN supported in this release are listed as follows:

  80 - Processor RAM
  81 - RAID buffer
  82 - NVSRAM
  83 - RAID Parity Assist (RPA) chip
  84 - Battery-backed NVSRAM or clock failure
  91 - Diagnostic self test failed non-data transfer components test (most likely controller cache holdup battery discharge)
  92 - Diagnostic self test failed data transfer components test
  93 - Diagnostic self test failed drive Read/Write Buffer data turnaround test
  94 - Diagnostic self test failed drive Inquiry access test
  95 - Diagnostic self test failed drive Read/Write data turnaround test
  96 - Diagnostic self test failed drive self test

In a dual controller environment, the user should place this controller offline (hold in reset) (unless the error indicates controller battery failure, in which case the user should wait for the batteries to recharge). In single controller environments, the user should not use this subsystem until the controller has been replaced.
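
Because the ASCQ of a Diagnostic Failure carries the component number NN, decoding it is a table lookup. The following is an illustrative Python sketch and not part of raidcode.txt. The names DIAG_COMPONENTS and describe_diag_failure are hypothetical, and the descriptions are condensed from the list above.

```python
# NN values listed for ASC 0x40 in this release.
DIAG_COMPONENTS = {
    0x80: "Processor RAM",
    0x81: "RAID buffer",
    0x82: "NVSRAM",
    0x83: "RAID Parity Assist (RPA) chip",
    0x84: "Battery-backed NVSRAM or clock failure",
    0x91: "Self test failed: non-data transfer components test",
    0x92: "Self test failed: data transfer components test",
    0x93: "Self test failed: drive Read/Write Buffer data turnaround test",
    0x94: "Self test failed: drive Inquiry access test",
    0x95: "Self test failed: drive Read/Write data turnaround test",
    0x96: "Self test failed: drive self test",
}


def describe_diag_failure(asc: int, ascq: int) -> str:
    """Name the failed component for ASC 0x40 (hypothetical helper)."""
    if asc != 0x40:
        raise ValueError("not a Diagnostic Failure (ASC 0x40) code")
    if not 0x80 <= ascq <= 0xFF:
        raise ValueError("component number NN must be in 0x80-0xFF")
    # Values in range but absent from the list are reported, not guessed.
    return DIAG_COMPONENTS.get(
        ascq, f"Component 0x{ascq:02X} (not listed in this release)"
    )
```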
ASC  ASCQ  Sense Key
43   00    4
Message Error
The controller attempted to send a message to the host, but the host responded with a Reject message.

ASC  ASCQ  Sense Key
44   00    4,B
Internal Target Failure
The controller has detected a hardware or software condition that does not allow the requested command to be completed. If the Sense Key is 0x04 indicating a Hardware Failure, the controller has detected what it believes is a fatal hardware or software failure and it is unlikely that just a retry of the command would be successful. If the Sense Key is 0x0B indicating an Aborted Command, the controller has detected what it believes is a temporary software failure that is likely to be recovered if retried.
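
The retry guidance for Internal Target Failure depends on the Sense Key alone. The following illustrative Python sketch, which is not part of raidcode.txt, captures that rule. The function name is hypothetical, and how many retries a host should attempt is outside what the text specifies.

```python
HARDWARE_ERROR = 0x04
ABORTED_COMMAND = 0x0B


def retry_is_worthwhile(sense_key: int, asc: int, ascq: int) -> bool:
    """Apply the 44 00 guidance: retry on Aborted Command, not on Hardware Error."""
    if (asc, ascq) != (0x44, 0x00):
        raise ValueError("rule applies only to ASC/ASCQ 44 00")
    if sense_key == ABORTED_COMMAND:
        return True   # temporary software failure, likely to recover on retry
    if sense_key == HARDWARE_ERROR:
        return False  # believed fatal; a plain retry is unlikely to succeed
    raise ValueError("44 00 is listed only with Sense Keys 4 and B")
```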
ASC  ASCQ  Sense Key
45   00    4
Selection Time-out On A Destination Bus
A drive did not respond to selection within a selection time-out period. Possible reasons for this error include drive failure, channel failure, or the possibility of an incomplete hot-swap holding the whole channel in reset.

ASC  ASCQ  Sense Key
47   00    1,B
SCSI Parity Error
The controller detected a parity error on the host SCSI bus or one of the drive SCSI buses.

ASC  ASCQ  Sense Key
48   00    1,B
Initiator Detected Error Message Received
The controller received an Initiator Detected Error Message from the host during the operation.

ASC  ASCQ  Sense Key
49   00    B
Invalid Message Error
The controller received a message from the host that is not supported or was out of context when received.

ASC  ASCQ  Sense Key
49   80    B
Drive Reported Reservation Conflict
A drive returned a status of Reservation Conflict.

ASC  ASCQ  Sense Key
4B   00    1,4
Data Phase Error
The controller encountered an error while transferring data to/from the initiator or to/from one of the drives.

ASC  ASCQ  Sense Key
4E   00    B
Overlapped Commands Attempted
The controller received a tagged command while it had an untagged command pending from the same initiator, or it received an untagged command while it had a tagged command(s) pending from the same initiator.

ASC  ASCQ  Sense Key
5D   80    6
Drive
Sun StorageTek A3500 Array