Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-71-1018761.1
Update Date:2009-12-03
Keywords:

Solution Type  Technical Instruction Sure

Solution  1018761.1 :   Sun StorEdge[TM] 9900 series: ECC/LRC pinned tracks and how to recover from them without data loss


Related Items
  • Sun Storage 9970 System
  • Sun Storage 9990 System
  • Sun Storage 9910 System
  • Sun Storage 9960 System
  • Sun Storage 9985 System
  • Sun Storage 9980 System
Related Categories
  • GCS>Sun Microsystems>Storage - Disk>Datacenter Disk

PreviouslyPublishedAs
230488


Description
Sun StorEdge[TM] 9900 (SE9900) series arrays have sophisticated mechanisms to ensure that data integrity is maintained at all times. These mechanisms are documented in the Theory of Operation Section of the SE9900 series Maintenance Manuals; see the references on the Microcode CD-ROM.

Data integrity codes are appended to the data being transferred at each component within the subsystem. The ECC/LRC (error correction) mechanisms ensure that a fault does not cause the data to be changed without the user's knowledge. In addition to the ECC/LRC mechanisms, all write data is mirrored in cache. For reference, see chapter 3.3 of the Theory of Operation Section.

Taken together, these data protection mechanisms provide very strong safeguards against silent data corruption, so data loss or corruption is extremely unlikely to occur on SE9900 arrays. When problems do occur, the SE9900 arrays are able to detect and recover from hardware and logic faults.

What does a pinned track mean?

There are several possible causes and recovery paths for pinned track scenarios, which are documented in chapter 4 of the Trouble Shooting Section of the SE9900 series Maintenance Manuals. To determine which data recovery procedure to use, first identify which of the two error types has occurred. There are two kinds of pinned track SIMs, which can be distinguished by the first three characters of the reference code in the SIM:

  1. Reference code EF4X-YY: "Unable to write a track to a PDEV". This SIM is triggered when the data destaging process to a PDEV is unsuccessful due to a drive failure. It is usually resolved by replacing the failed hardware.

  2. Reference code FF4X-YY: "Unable to process a track to or from Cache". This SIM is triggered when data becomes trapped in the cache. There are many causes for this type of pinned track, and it is the type this document focuses on.

This article focuses on how to recover from the FF4X-YY "Unable to process a track to or from Cache" type of pinned track on Solaris[TM] 7 or higher.
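
As an illustration only, the distinction between the two SIM types described above can be expressed as a small Python helper. The function name and the returned labels are hypothetical and are not part of any SE9900 tool; the sketch only inspects the first three characters of the reference code, as described in this article.

```python
def classify_pinned_track_sim(reference_code):
    """Classify a pinned track SIM by the first three characters of its reference code.

    Hypothetical helper for illustration; the labels are not official terms.
    """
    prefix = reference_code.strip().upper()[:3]
    if prefix == "EF4":
        # EF4X-YY: unable to write a track to a PDEV (drive failure)
        return "unable to write a track to a PDEV"
    if prefix == "FF4":
        # FF4X-YY: unable to process a track to or from cache
        return "unable to process a track to or from cache"
    return "not one of the two pinned track SIM types"


# The placeholders X and YY are kept exactly as written in this article.
for code in ("EF4X-YY", "FF4X-YY"):
    print(code, "->", classify_pinned_track_sim(code))
```

Only the FF4 case is covered by the recovery procedure in this article; the EF4 case is normally handled by replacing the failed hardware.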



Steps to Follow


Certain types of transient or permanent hardware faults, and/or microcode bugs, may cause a mismatch between a cache slot's ECC/LRC data on either side of the mirrored cache. The customer data itself matches, or the array would not permit it to be read. Since the array has detected an ECC/LRC mismatch, it reports this error condition and "pins" the slot in cache, and the recovery method is left up to the customer.

The possibility of data corruption is very close to zero in these circumstances because data in both sides of cache matches.

Display and collect "Pinned Track" information

To display "Pinned Track" information, refer to chapter 3.18 of the SVP Section of the SE9900 series Maintenance Manuals.

The [Detail] button displays the information, whereas the [Archive] button creates an archive at c:\dkc200\others\pin\pin.lzh (SE9910, SE9960, SE9970, SE9980) or c:\dkc200\others\pin\pin.tgz (SE9985, SE9990). This archive is collected as part of a normal or detail autodump, or it can be downloaded individually.

If the pinned track occurred on a LUSE (LUN Size Expansion) volume, the LDEV reported in the SIM and the LDEV shown in the "Pinned Track" display might not match, because the "Pinned Track" display reports the LDEV and PDEV within the LUSE.

The Recovery Procedure

Most of the time, the recovery procedures in chapter 4 of the Trouble Shooting Section are sufficient, and those procedures should be followed first. Depending on the volume type in use, refer to the documented recovery flowcharts listed below.

Recovery flowcharts

Columns: Volume Type | 9900 | 9900V | NSC | USP

  HRC, HORC & HODM: TRBL06-250; TRBL06-270; TRBL06-270
  HMRCF, HOMRCF: TRBL09-10; TRBL08-10
  OPEN: TRBL07-110; TRBL07-100
  True Copy: Appendix A True Copy User and Reference Guide
  HUR: NA; Chapter 9.6 Universal Replicator User and Reference Guide

However, the Erasing procedure, which does not require the Pin Track Tool, calls for a destructive format procedure to "recover" a pinned track under Solaris[TM]. In cases in which the pinned data can still be read from cache, a non-destructive method will clear the pin with no data loss. File systems residing on the LDEV that has the pinned track must be unmounted, and the disk must be completely quiesced (i.e. nothing should attempt to access the disk during the recovery).

Deviating from the documented procedure, run format > analyze > refresh. This reads the data and then writes it back.

Recovery flowcharts for Solaris that do not require the Pin Track Tool

Columns: 9900 | 9900V | NSC | USP

  TRBL07-720 ~ 740; TRBL07-830 ~ 850; TRBL07-750 ~ 770

To refresh a range of LBAs, convert the LBA information provided by the Pinned Track display from hexadecimal to decimal. Based on the emulation type, calculate the stop LBA by adding the number of blocks in one track to the start LBA. If you choose a size of less than one track, the pinned track might not be cleared.
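
The LBA arithmetic described above can be sketched in Python as follows. This is a minimal illustration, not part of the documented procedure: the function name and the sample LBA value 0x1F2C0 are hypothetical, and the stop LBA is computed literally as the start LBA plus the number of blocks in one track, as stated above. The number of blocks per track depends on the emulation type and has to be taken from the track size table in this article.

```python
def refresh_range(start_lba_hex, blocks_per_track):
    """Return (start, stop) in decimal for a one-track refresh range.

    start_lba_hex    -- LBA as shown by the Pinned Track display (hexadecimal string)
    blocks_per_track -- blocks in one track for the emulation type in use
    """
    start = int(start_lba_hex, 16)   # hexadecimal to decimal; a "0x" prefix is accepted
    stop = start + blocks_per_track  # stop LBA = start LBA + blocks in one track
    return start, stop


# Hypothetical example: pinned track reported at LBA 0x1F2C0 with 96 blocks per track.
start, stop = refresh_range("0x1F2C0", 96)
print("start LBA (decimal):", start)  # 127680
print("stop LBA (decimal):", stop)    # 127776
```

The decimal start and stop values are the ones to supply when limiting the refresh to this range.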

Track size (cache slot size) to block translation

Columns: Emulation Type | 9900 | 9900V | NSC | USP

  OPEN-3/8/9/E/K/L: 48 KB = 96 Blocks
  OPEN-V: NA; 64 KB = 128 Blocks; 256 KB = 512 Blocks
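
The block counts in the table above follow from dividing the track size by the block size. The short Python sketch below reproduces that arithmetic, assuming 512-byte blocks (an assumption implied by 48 KB = 96 blocks, not stated explicitly in this article); the function name is hypothetical.

```python
BLOCK_SIZE_BYTES = 512  # assumption: implied by "48 KB = 96 Blocks"


def track_size_to_blocks(track_size_kb):
    """Convert a track (cache slot) size in KB to a number of blocks."""
    return track_size_kb * 1024 // BLOCK_SIZE_BYTES


for size_kb in (48, 64, 256):
    print(size_kb, "KB =", track_size_to_blocks(size_kb), "blocks")
```

Under this assumption, a 256 KB track corresponds to 512 blocks.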

Glossary

  HRC: Hitachi Remote Copy or True Copy for z/OS
  HORC: Hitachi Open Remote Copy or True Copy
  HODM: Hitachi Online Data Migration
  HMRCF: Hitachi Multiple Raid Coupling Feature or Shadow Image for z/OS
  HOMRCF: Hitachi Open Multiple Raid Coupling Feature or Shadow Image
  HUR: Hitachi Universal Replicator
  9900: SE9910 or SE9960
  9900V: SE9970 or SE9980
  NSC: SE9985
  USP: SE9990



Product
Sun StorageTek 9990 System
Sun StorageTek 9985 System
Sun StorageTek 9980 System
Sun StorageTek 9970 System
Sun StorageTek 9960 System
Sun StorageTek 9910 System
Sun StorageTek 9900V Series Array

Internal Comments
The following is strictly for the use of Sun employees:

References are available at https://channelone.hds.com/indexmain.cfm. You
will need a registered account on the HDS partner website in order to
browse the documents below.

  1. HDS34177: Pin Track recovery for Sun Solaris
  2. HDS15325: Pinned Track Recovery for True Copy Volumes
  3. HDS5088: Recovery Procedures for Pinned Track Data on MVS, OPEN Systems and Shadow Image
  4. HDS17533: LUSE: LDEV on PIN track display does not have a corresponding SIM
  5. HDS48349: PIN archive function (RAID500/450/400)

se9990, se9985, se9980, se9970, se9960, se9910, se9900, hds, pinned tracks, pinned data, pinned slot
Previously Published As
50772

Change History
Date: 2007-06-14
User Name: 90779
Action: Update Canceled
Comment: *** Restored Published Content *** EOL Voygaer
Version: 0

Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.