Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-72-1390664.1
Update Date: 2012-01-02

Solution Type  Problem Resolution Sure

Solution 1390664.1: Exadata Storage/Cell Node Hung and Rebooted Due to Temporary IO Stall Caused by Drive Medium Errors


Related Items
  • Exadata Database Machine V2
Related Categories
  • PLA-Support>Sun Systems>x64>Engineered Systems HW>SN-x64: EXADATA
  • .Old GCS Categories>Sun Microsystems>Specialized Systems>Database Systems

In this Document
  Symptoms
  Cause
  Solution
  References


Created from <SR 3-5117237331>

Applies to:

Exadata Database Machine V2 - Version: Not Applicable to Not Applicable - Release: N/A to N/A
Information in this document applies to any platform.

Symptoms

An Exadata storage (cell) node hangs completely because of a temporary IO stall caused by drive medium errors, and the system is then rebooted by a forced power cycle.

- The command "cellcli -e list alerthistory" lists the following alert:
63 2011-12-29T01:42:10-02:00 info "IO hang detected on CD_09_dm02cel13. Power cycle forced."

- The output of "ipmitool sel list" contains the following event (with the message "OEM record c0") at the time of the failure:
214 | 12/29/2011 | 01:42:10 | OEM record c0 | 004301 | 97cd9c999865
215 | 12/29/2011 | 01:42:11 | System Boot Initiated | System Restart | Asserted

- The file "$CELLTRACE/ms-odl.trc" shows the following entries (time-stamped after the cell node reboot):
[2011-12-29T09:50:04.539-02:00] [ossmgmt] [NOTIFICATION] [] [common.hwadapter.HardwareImpl] [tid: 15] [ecid: 180.128.211.98:90252:1325159256252:5,0] Adding alert: time: 1325130130000 msg: OEM Record :: IO hang detected. Force Power cycle. Detail: 43 0 - 97 cd - 9c 99 - 98 65
[2011-12-29T09:50:04.549-02:00] [ossmgmt] [NOTIFICATION] [] [ms.hwadapter.MSHardwareImpl] [tid: 15] [ecid: 180.128.211.98:90252:1325159256252:5,0] IO hang detected on CD_09_dm02cel13. Power cycle forced. Detail: 43 0 - 97 cd - 9c 99 - 98 65
[2011-12-29T09:50:04.550-02:00] [ossmgmt] [NOTIFICATION] [] [ms.core.MSAlertHistory] [tid: 15] [ecid: 180.128.211.98:90252:1325159256252:5,0] AlertHistory 63 created. Severity: info. Message: IO hang detected on CD_09_dm02cel13. Power cycle forced.
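
The three symptoms can be cross-checked against each other. The Python sketch below rests on two assumptions that this note does not state: that the "time:" value in the "Adding alert" trace entry is milliseconds since the Unix epoch, and that the cell's local UTC offset is the -02:00 shown in the alert. The function names are hypothetical and are not part of any Exadata tool. Under those assumptions, 1325130130000 decodes to 2011-12-29T01:42:10-02:00, which is the time of the alert rather than the post-reboot time at which the trace entry was written.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: the "time:" field of an "Adding alert" entry is epoch milliseconds.
ALERT_TIME_RE = re.compile(r"Adding alert: time: (\d+)")


def alert_time(trace_line, utc_offset_hours):
    """Decode the alert time embedded in an ms-odl.trc line, or return None."""
    match = ALERT_TIME_RE.search(trace_line)
    if match is None:
        return None
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(int(match.group(1)) / 1000, tz)


def forced_power_cycles(sel_text):
    """Yield (date, time) for each 'OEM record c0' SEL entry that is
    immediately followed by a 'System Boot Initiated' entry."""
    rows = [
        [field.strip() for field in line.split("|")]
        for line in sel_text.splitlines()
        if "|" in line
    ]
    for current, following in zip(rows, rows[1:]):
        if len(current) < 4 or len(following) < 4:
            continue  # skip rows that do not have the expected columns
        if current[3] == "OEM record c0" and following[3] == "System Boot Initiated":
            yield current[1], current[2]
```

Applied to the two SEL lines shown above, forced_power_cycles() yields ("12/29/2011", "01:42:10"), the same second as the decoded alert time.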


Cause

This issue is caused by the following unpublished bug:
  Bug 12626126: STBH:IO HANG DETECTED AND POWER CYCLE FORCED

Solution

This issue is fixed in the following patch:
  Patch 13517481 : EXADATA COLLECTION OF ONE OFFS FOR RELEASES OLDER THAN 11.2.2.4.2

Install <Patch 13517481> by following the installation instructions in the patch README.
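
Because the patch title restricts it to releases older than 11.2.2.4.2, it may help to compare the cell's software version against that threshold before turning to the README. The following Python sketch is only an illustration: the function name is hypothetical, and it assumes that versions are dotted numeric strings in which only the first five components matter. The patch README remains the authority on applicability.

```python
def is_older_than(version, threshold="11.2.2.4.2"):
    """Return True if 'version' sorts before 'threshold', comparing
    numerically over as many components as the threshold has."""
    limit = [int(part) for part in threshold.split(".")]
    # Ignore trailing components (for example a build suffix) beyond the threshold's length.
    parts = [int(part) for part in version.split(".")[:len(limit)]]
    # Pad short versions with zeros so "11.2.2.4" compares as "11.2.2.4.0".
    parts += [0] * (len(limit) - len(parts))
    return parts < limit
```

For example, a made-up input such as "11.2.2.3.5" compares as older than the threshold, while "11.2.2.4.2" itself does not.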

Details about the inclusion of the fix for bug 12626126 are documented in the following alert note:
ALERT - FLASH CARDS OFFLINE AFTER 6 MONTHS OF UPTIME (Doc ID 1386617.1)

References

<NOTE:1386617.1> - ALERT - FLASH CARDS OFFLINE AFTER 6 MONTHS OF UPTIME

Attachments
This solution has no attachment
  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.