Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition
Solution Type: Technical Instruction Sure Solution

1010810.1: Sun Storage 33x0/351x Arrays: How to resolve "DRAM parity error detected"
Previously Published As: 214938

Applies to:
Sun Storage 3510 FC Array - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 3511 SATA Array - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 3320 SCSI Array - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 3310 Array - Version Not Applicable to Not Applicable [Release N/A]
All Platforms

Goal

Description
Memory parity errors on the Sun Storage 3310/3320/3510/3511 Array result in the following message being logged in the Event log, which can be viewed with the software CLI "sccli> show event" command:

[0104] #4287: StorEdge Array SN#xxxxxxx Controller ALERT: DRAM parity error detected

However, in a dual-controller array this message does not say which controller logged it, so the wrong controller might be replaced. Take care when determining the affected controller: the messages in the Event log reflect each controller's functional role at the time the event was logged, and the roles might have changed since then. For example, the customer or an engineer may have manually failed the other controller, a "Controller Unrecoverable Error" event may have occurred on the other controller, or the array may have been power-cycled.

This document describes steps to help determine the affected controller in cases where the controller roles have not changed.

Fix
Steps to Follow:
DRAM parity errors are usually harmless if they are single-bit errors. Replace the controller only if you observe many occurrences of "DRAM parity error detected" in the event log. Unfortunately, when the event log is viewed with sccli, the error shows up only as the "Controller ALERT: DRAM parity error detected" message, which in a dual-controller configuration does not indicate which controller generated the error. The onsite engineer may erroneously replace the primary controller, only for the errors to continue after the replacement. In that case, it was the secondary controller that generated the error at the time the event was logged.

To find out which controller is generating the error, view the event log through the 3310/3320/3510/3511 telnet/serial interface. Tip or telnet into the Sun Storage 3310, 3320, 3510, or 3511 Array. From the Main Menu:

1) Choose "view and edit Event logs". You should see something similar to the following for the same DRAM parity error:

[0104] Controller SDRAM ECC Single-bit Error Detected

The "S" in the bottom line indicates that this error was generated by the secondary controller, so the secondary controller must be replaced and NOT the primary controller. If there is a "P" instead of an "S", the primary controller reported the error and should be replaced.

This is an accurate indicator of which controller requires replacement only if the roles of the controllers have not changed. If the controller roles have changed, additional fault isolation steps are required to identify the controller generating the errors.
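The advice to replace a controller only after many occurrences of the alert can be applied to a saved copy of the sccli event log. The following is a minimal Python sketch, not part of the array management software. It assumes the log has been saved as plain text with one event per line. The function names are hypothetical, and the threshold of 5 is an arbitrary placeholder, since this document does not define how many occurrences count as "many".

```python
# Illustrative sketch only: assumes the sccli event log was saved as plain
# text, one event per line. Function names and the threshold are hypothetical.

ALERT_TEXT = "DRAM parity error detected"  # message text quoted in this document


def count_dram_parity_events(event_log_text: str) -> int:
    """Return the number of log lines that contain the DRAM parity alert."""
    return sum(1 for line in event_log_text.splitlines() if ALERT_TEXT in line)


def exceeds_threshold(event_log_text: str, threshold: int = 5) -> bool:
    """Return True when the alert count reaches the assumed threshold."""
    return count_dram_parity_events(event_log_text) >= threshold


if __name__ == "__main__":
    # The sample repeats the single event line quoted in this document.
    line = ("[0104] #4287: StorEdge Array SN#xxxxxxx "
            "Controller ALERT: DRAM parity error detected")
    sample = "\n".join([line, line])
    print(count_dram_parity_events(sample))
    print(exceeds_threshold(sample))
```

A count of this kind only shows how often the alert recurs. It says nothing about which controller logged it, so the "P" or "S" indicator in the telnet/serial event log still has to be checked before any replacement.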
Additional Information:
The same error messages can also be seen using the Diagnostic Reporter and the sscs software, if they are configured. These components are also part of the array management software.

### From Diagnostic Reporter email:
User Name: [email protected] Date: 2010-07-06 Action: Currency & Update
Attachments: This solution has no attachment