Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution
1013668.1: SL8500 - HBT Drive Communication Errors
Previously Published As: 219289
Oracle Confidential (PARTNER). Do not distribute to customers.
Reason: Confidential for Partners and Oracle Support personnel
Applies to:
Sun StorageTek SL8500 Modular Library System
All Platforms
Checked for relevance on 25-May-2011.

Symptoms
HBT card hang
Event error 1301
Event error 3954
Drive communications error

Changes
NA

Cause
NA

Solution
Resolution
If drive communication errors are reported, recommend:
- Check the status of the drive via the system detail screens in SLC. Example of normal status:
    Health State : ok
    Device State : Ready
    Access State : online
    Drive State : empty
    Drive needs cleaning : false
    Host Activity : false
- Analyze the library event log for the sequence of errors noted below on multiple drives.

If the error occurs on just a single drive, a reboot can be triggered via the SLC diagnostic menu screens. If drive communication errors exist on all drives and the HBT card appears not to be responding, contact Level 2 Tape Hardware Support to pull the engineering logs from the HBT card for follow-up.

Note: The HBT card will reset in approximately 60 seconds and will not dismount any drives currently loaded. It should not be necessary to reset the HBC card; if the HBC is reset, it will generate a full library reset including all the HandBots, which would be disruptive to host operations.

Log examples below; engineering is evaluating an HBT reset driven automatically from the HBC when an HBT hang is detected.

2006-08-17T20:17:45.808, 1.2.2.1.2, root, hli1, queryDrive18566301, error, 1301, "Device, response time-out", request=
2006-08-17T20:17:45.913, 1.0.0.1.0, root, hli1, queryDrive18566301, error, 3954, "Failure, in send output: ", Data=
2006-08-17T20:17:58.738, 1.1.-2.1.4, root, hli1, queryDrive18564201, error, 1301, "Device, response time-out", request=
2006-08-17T20:17:58.830, 1.0.0.1.0, root, hli1, queryDrive18564201, error, 3954, "Failure, in send output: ", Data=
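
The check for the 1301/3954 sequence in the library event log entries above can be sketched in Python. This is only an illustrative sketch, not an Oracle or SLC tool. It assumes the log is comma-separated exactly as in the examples, that a 1301 time-out and a 3954 send failure belong together when they share the same request name (for example queryDrive18566301), and that the second field is the address of the device that timed out. The field names and the function names parse_event, drives_with_timeout_sequence and triage are hypothetical.

```python
import csv

# Result codes taken from the Symptoms section of this solution.
TIMEOUT_CODE = "1301"    # "Device, response time-out"
SEND_FAIL_CODE = "3954"  # "Failure, in send output: "

# The four library event log entries quoted in this solution.
SAMPLE_LOG = """\
2006-08-17T20:17:45.808, 1.2.2.1.2, root, hli1, queryDrive18566301, error, 1301, "Device, response time-out", request=
2006-08-17T20:17:45.913, 1.0.0.1.0, root, hli1, queryDrive18566301, error, 3954, "Failure, in send output: ", Data=
2006-08-17T20:17:58.738, 1.1.-2.1.4, root, hli1, queryDrive18564201, error, 1301, "Device, response time-out", request=
2006-08-17T20:17:58.830, 1.0.0.1.0, root, hli1, queryDrive18564201, error, 3954, "Failure, in send output: ", Data=
"""


def parse_event(line):
    """Split one event log line into fields; the field names are assumptions."""
    # skipinitialspace lets the quoted text field keep its embedded comma.
    fields = next(csv.reader([line], skipinitialspace=True), [])
    if len(fields) < 8:
        return None  # not an event entry in the assumed layout
    return {
        "address": fields[1],
        "request": fields[4],
        "severity": fields[5],
        "code": fields[6],
        "text": fields[7],
    }


def drives_with_timeout_sequence(lines):
    """Return sorted addresses whose request logged 1301 and also 3954."""
    timed_out = {}       # request name -> address of the device that timed out
    send_failed = set()  # request names that also logged a send failure
    for line in lines:
        event = parse_event(line)
        if event is None or event["severity"] != "error":
            continue
        if event["code"] == TIMEOUT_CODE:
            timed_out[event["request"]] = event["address"]
        elif event["code"] == SEND_FAIL_CODE:
            send_failed.add(event["request"])
    return sorted({addr for req, addr in timed_out.items() if req in send_failed})


def triage(drives):
    """Map the number of affected drives to the cases named in this solution."""
    if not drives:
        return "sequence not found"
    if len(drives) == 1:
        return "single drive"
    return "multiple drives"


if __name__ == "__main__":
    affected = drives_with_timeout_sequence(SAMPLE_LOG.splitlines())
    print(affected, triage(affected))
```

Run against the four sample entries, the sketch reports two addresses (1.1.-2.1.4 and 1.2.2.1.2) and the "multiple drives" case. A "single drive" result would correspond to the case where a reboot can be triggered from the SLC diagnostic menu screens. The sketch only counts drives; whether the HBT card is actually hung still has to be judged as described in the resolution.
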

2006-08-17 18:58:06 ACSLH[0]: 2378 N Co_ProcessResponses.C 1 1308 ACS: 1; LMU error: Co_4400:st_parse_error: Error: 1001 - Drive error: Drive is not communicating Request: Dismount, forced rewind and unload Volser: I01240, media domain: L, media type: 1 Source: Drive 1,3,1,13 Destination: Cell 1,3,14,18,0
2006-08-17 18:58:06 ACSSA[0]: 2468 E sa_demux.c 1 278 drive 1, 3, 1,13 reported a Unit Attention.
2006-08-17 18:58:06 DISMOUNT[0]: 546 N cl_log_lh_er.c 1 99 dm_lh_lib_fail: LH error type = LH_ERR_TRANSPORT_FAILURE .
2006-08-17 20:30:44 ACSSA[0]: 1431 N sa_demux.c 1 278 drive 1, 3, 1, 5: Library error, Transport failure .
2006-08-17 20:33:12 ACSMT[0]: 429 N mt_timeout.c 1 135 mt_timeout: mid:42023 Mount timeout after 4920 seconds.
2006-08-17 20:33:12 ACSSA[0]: 1435 W sa_demux.c 1 278 Unable to handle unusual status or event. See related messages.
2006-08-17 20:36:13 ACSMON process[0]: 126 N mon_drv_examine.c 1 506 mon_lsm_examine:st_req_error: Timed out waiting for message

Additional Information
Other problems can also cause drive communication errors, for example a drive powered off or a drive SNO.

The HBT drive communication problem is under investigation by Library Engineering and exists in all current library code levels through FRS_3.08.

One customer site that experienced the HBT hang also experienced problems varying the ACS online after bouncing ACSLS. ACSLS requires status from at least one CAP and one drive before the ACS will vary online.

3.08, HBT, SL8500, Code

Previously Published As STKKB78617

Attachments
This solution has no attachment