Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution

1447496.1: Exadata: Cell Rebooted SCSI error: return code & kernel: end_request: I/O error
Created from <SR 3-5525182881>
Applies to:
Exadata Database Machine V2 - Version: Not Applicable
Information in this document applies to any platform.

Symptoms
Errors like the following are reported.

Alert History:
    info "IO hang detected on CD_01_dm51cel06. Power cycle forced."

ASM Log:
    Wed Mar 28 14:03:20 2012
    WARNING: Disk in group 2 mode 0x7f is now being offlined
    ORA-27603: Cell storage I/O error, I/O failed on disk at offset 8392704 for data length 4096
    ORA-27626: Exadata error: 201 (Generic I/O error)
    WARNING: Read Failed. group:5 disk:52 AU:2 offset:4096 size:4096
    WARNING: cache failed reading from group fn=4 blk=1 count=1 from disk= kfkist=0x20 status=0x02 file=kfc.c line=11366

System log:
    kernel: sd 0:2:6:0: SCSI error: return code = 0x00040000
    dm51cel06 kernel: end_request: I/O error, dev sdg, sector 2006351888
    kernel: sd 0:2:6:0: SCSI error: return code = 0x00040000
    disk LSI MR9261-8i 2.12 /dev/sdac

Cause
The sequence of events is:
1. A disk in one slot failed.
2. I/O to a disk in another slot then timed out.

This caused the power cycle, because I/Os should never hang on other devices for more than 30 seconds while one bad disk is causing trouble. If an outstanding I/O hangs on a disk for more than 95 seconds, the storage server is rebooted.

Prior to image 11.2.3.1.0 there was no mechanism to cancel an I/O on a griddisk other than rebooting the server. To avoid the risk of hanging the entire database, only the one storage cell is rebooted. The reboot usually provides quiet time for the background disk media scan to start on the offending disk and fix the bad sectors.

Solution
The fix is included in 11.2.3.1.0 (Patch 13536739).

References
<BUG:13922277> - CELL NODE REBOOTED - WITH ERRORS IN ASM LOG, MESSAGE & CELL LOGS
<BUG:12592457> - FENCEMASTER: OSS_IOCTL_FENCE_ENTITY

Attachments
This solution has no attachment.
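
When checking whether a cell shows the same signature, it can help to decode the `SCSI error: return code = 0x00040000` lines from the system log in the Symptoms. The following is a minimal sketch, not an Oracle-supplied tool. It assumes the standard Linux SCSI layout of the 32-bit result (driver byte, host byte, message byte, status byte, from most to least significant) and the kernel's host-byte names. The function names `decode_scsi_result` and `parse_scsi_errors` are hypothetical helpers introduced only for illustration.

```python
import re

# Host-byte codes as named in the Linux kernel SCSI layer.
# Only the low-numbered, long-standing codes are listed here.
HOST_BYTE_NAMES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x06: "DID_PARITY",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
    0x09: "DID_BAD_INTR",
}

# Matches lines such as:
#   kernel: sd 0:2:6:0: SCSI error: return code = 0x00040000
SCSI_ERROR_RE = re.compile(
    r"sd (\d+:\d+:\d+:\d+): SCSI error: return code = (0x[0-9a-fA-F]+)"
)


def decode_scsi_result(result):
    """Split a 32-bit SCSI result into its four one-byte fields."""
    return {
        "status_byte": result & 0xFF,
        "msg_byte": (result >> 8) & 0xFF,
        "host_byte": (result >> 16) & 0xFF,
        "driver_byte": (result >> 24) & 0xFF,
    }


def parse_scsi_errors(log_text):
    """Return one decoded record per 'SCSI error' line found in log_text."""
    records = []
    for match in SCSI_ERROR_RE.finditer(log_text):
        fields = decode_scsi_result(int(match.group(2), 16))
        fields["device"] = match.group(1)  # host:channel:id:lun
        fields["host_name"] = HOST_BYTE_NAMES.get(fields["host_byte"], "unknown")
        records.append(fields)
    return records


if __name__ == "__main__":
    sample = (
        "kernel: sd 0:2:6:0: SCSI error: return code = 0x00040000\n"
        "dm51cel06 kernel: end_request: I/O error, dev sdg, sector 2006351888\n"
        "kernel: sd 0:2:6:0: SCSI error: return code = 0x00040000\n"
    )
    for record in parse_scsi_errors(sample):
        print(record["device"], hex(record["host_byte"]), record["host_name"])
```

Under these assumptions, the return code in this note has a host byte of 0x04 with all other bytes zero, which indicates that the error was reported by the host adapter layer rather than as a SCSI status from the disk itself.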