Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-72-1435182.1
Update Date:2012-08-29

Solution Type  Problem Resolution Sure

Solution  1435182.1 :   Pillar Axiom: SnapLUNDataLost Event with CloneStorageFilling and CloneStorageFull Administrator Actions  


Related Items
  • Pillar Axiom 600 Storage System
Related Categories
  • PLA-Support>Sun Systems>DISK>Pillar Axiom>SN-DK: Ax600


SnapLUN Data Lost Events, CloneStorageFilling administrator actions, CloneStorageFull administrator actions.  Cause and resolution

In this Document
Symptoms
Cause
Solution


Created from <SR 3-5434857275>

Applies to:

Pillar Axiom 600 Storage System - Version Not Applicable to Not Applicable [Release N/A]
Information in this document applies to any platform.

Symptoms

A SnapLUN Data Lost event is generated.  Depending on configuration, this may result in an automatic callhome and/or administrator email alert. 

The SnapLUN Data Lost event is preceded by an administrator action indicating CloneStorageFilling, followed by another administrator action indicating CloneStorageFull.

The administrator action indicating CloneStorageFull will typically have one option:  to delete the SnapLUN.

Cause

The Axiom SnapLUN, also referred to as a Clone LUN, is a virtual point-in-time image of the source LUN, using partial block snapshot technology.

As changes occur to the source LUN, any block of data that is changed since the snapshot was taken must be stored in the repository space for the SnapLUN in order to maintain the data integrity of that snapshot image.

If the SnapLUN (Clone) is made active, and one or more SAN hosts write to the Clone, additional space in the repository is required to maintain the original image, plus any changes to the source, plus any changes to the Clone itself.

If the repository space allocated is not sufficient, the SnapLUN will no longer be a point-in-time consistent view of the original LUN.

The SnapLUN will be marked for deletion, since data integrity can no longer be guaranteed.   

Older releases automatically deleted the SnapLUN.  Current releases generate an administrator action to help notify the administrator that the allocated repository space may not be sufficient for the rate of data change on the source LUN or writes to the Clone.   



Solution

These events are configuration issues, where the amount of repository space configured may not be large enough.

It is not necessary to take any action as a result of the SnapLUN Data Lost event as long as the loss of the Clone data is not an issue for the applications using the Clone. 

If retention of the Clone data is desired, the size of the repository will need to be increased.   There is no standard recommendation: the amount of repository space required depends on the number of data blocks changed on the source LUN after the Clone is created, the number of Clones, and any writes to the Clone.

The Clone LUN space may be modified at any time on the Modify LUN:  Quality of Service tab. 

The space is allocated from the available free system capacity, not from the source LUN. 

The Axiom generates an administrator action indicating CloneStorageFilling as the repository for a LUN reaches 80% utilization.
That administrator action provides the name of the Clone, the name of the Source LUN, the current repository space used, the free repository capacity, and the maximum repository space allocated. 

An information level event, CloneStorageThresholdReached, is also generated at the 80% threshold with the same information. 

A warning level event, CloneStorageFull, is generated when the configured repository space is full.  If email alerts are configured, this event may also be used to detect LUNs that may need more repository space configured.
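
The threshold behavior described above can be sketched as follows. This is only an illustration, not Axiom code: the function name classify_repository and the rule that "full" means no free capacity remains are assumptions. Only the 80% threshold and the event names come from this document.

```python
FILLING_THRESHOLD = 0.80  # 80% utilization, as stated in this document


def classify_repository(used: float, free: float, maximum: float) -> str:
    """Return the alert state expected for a Clone repository.

    Assumption: the repository is treated as full when no free
    capacity remains. The Axiom's internal test is not documented here.
    All three values must be in the same unit.
    """
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    if free <= 0:
        return "CloneStorageFull"
    if used / maximum >= FILLING_THRESHOLD:
        return "CloneStorageFilling"
    return "OK"


if __name__ == "__main__":
    # Values taken from the CloneStorageFull example in this document.
    print(classify_repository(used=100.486, free=0.0, maximum=100.488))
```

Passing the free capacity separately, instead of deriving it from used and maximum, mirrors the fact that the administrator action reports all three values.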

There is no way to recover the SnapLUN once it has become inconsistent and is marked for deletion. The Administrator Action is provided to warn the storage administrator of the deletion and indicate the reason. 
 
Note:  Releases 04.05.01 and 05.02.01 contain a fix for a software issue that may cause Clone space full events when the repository is not full, if there are multiple repositories.   If this issue is encountered, the used Clone space will typically be less than half of the allocated space.
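
The rule of thumb in the note above can be written as a simple check. This is a hedged sketch: the function name may_be_spurious_full is hypothetical, and because the note says "typically", the result is a hint for further investigation and not a diagnosis. The check does not look at the installed release.

```python
def may_be_spurious_full(used: float, allocated: float) -> bool:
    """Return True when a Clone space full event looks suspicious.

    Heuristic from the note: if the known software issue is hit, the
    used Clone space is typically less than half of the allocated space.
    Both values must be in the same unit.
    """
    if allocated <= 0:
        raise ValueError("allocated must be positive")
    return used < allocated / 2


if __name__ == "__main__":
    # The example in this document is a genuinely full repository.
    print(may_be_spurious_full(used=100.486, allocated=100.488))
```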


SnapLUN and Clone LUN are the same.  Some documents, menus, and messages use the term Clone, others use the term SnapLUN.

Although the ARU notice indicates the named SnapLUN will be deleted, current releases do not automatically delete the SnapLUN.

Instead, they create an Administrator Action that informs the storage administrator that the SnapLUN has become inconsistent and provides the single option to delete the SnapLUN.

The current ARU:

=ASR Alarm=
Automatic Service Request (ASR) Alarm
Generated: 2012-03-09 15:06:50
Severity : 2
Device : A002439BPX   Axiom System Serial Number
Eventcode: SnapLUNDataLost

Event num: SnapLUNDataLost

SnapLUN Data Lost
-----------------------------------
Hostname: A002439BPX
Product Type: AX600     Axiom System type
Summary:SnapLUN Data Lost
Description:See Additional Information

Additional Information: Data was lost while copying data to a SnapLUN. The SnapLUN will be deleted.   

event-parameter
config-properties
property name=callhomeLogLocation value=file://chprod02/callhome/callhome/A002439BPX/Call_HomeXML/ID1a39972b-d21d-b211-94b9-001b21217e48.chsh.xml    You may use this location to view the callhome header for more detailed information.
property name=associationGuid value=ID41303032-3433-3942-8479-081a210c15a2
property name=OriginatingNode value=0x2008000b08045e52  
property name=targetVlunGuid value=ID005b6d42-d21d-b211-a7ae-525e04080b00
property name=RawEventContents value=525E04080B0008200502060001000000155B6D42D21D04A38279081A210C02A241303032343339428479081A210C15A2005B6D42D21DB211A7AE525E04080B00005B6D42D21DB211A7AE525E04080B
property name=VolumeFqn value=/CLDWDB01_82_02   This is the name of the SnapLUN or Clone, not the name of the source LUN.
property name=ReportingControlUnitWWN value=2008000b08045e52
property name=VolumeSuid value=0xa3041dd2426d5b15
property name=metaDataVlunGuid value=ID41303032-3433-3942-8479-081a210c15a2
property name=VlunType value=PrimaryVlun
property name=VolumeId value=ID41303032-3433-3942-155b-6d42d21d04a3
property name=vlunSuid value=0xa2020c211a087982
property name=BsEventNumber value=0x0000000000060205
property name=VolumeName value=CLDWDB01_82_02  This is the name of the SnapLUN or Clone, not the name of the source LUN
sw-config
config-properties
property name=Pilot1IPAddress value=192.168.223.23
property name=Slammer PROM AX600 value=2062-00003-040000-029400
property name=Slammer Software AX600 value=2060-00003-040220-013000  The Slammer Software is the Axiom code version. e.g. 04.02.20.
property name=CompatibilityMatrixVersion value=2090-00001-999999-999999
property name=Brick Disk Drive Firmware 2052-00022 value=2052-00022-02
property name=Pilot Software value=2073-00001-040220-013000

---
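
For anyone processing such notices with a script, the property lines and the release number can be extracted as sketched below. The helper names parse_properties and decode_release are hypothetical. The decoding assumes the third dash-separated field of the Slammer Software value holds the release as three two-digit groups, which matches the 04.02.20 example in the notice but is not otherwise confirmed here.

```python
import re

# Matches "property name=<name> value=<token>". Names may contain spaces;
# anything after the value token is treated as an annotation and ignored.
_PROPERTY = re.compile(r"^property name=(.+?) value=(\S+)")


def parse_properties(notice: str) -> dict[str, str]:
    """Collect the name/value pairs from the property lines of a notice."""
    props = {}
    for line in notice.splitlines():
        match = _PROPERTY.match(line.strip())
        if match:
            props[match.group(1)] = match.group(2)
    return props


def decode_release(part_number: str) -> str:
    """Turn a value such as 2060-00003-040220-013000 into 04.02.20."""
    fields = part_number.split("-")
    if len(fields) < 3 or len(fields[2]) != 6:
        raise ValueError(f"unexpected format: {part_number!r}")
    release = fields[2]
    return ".".join(release[i:i + 2] for i in range(0, 6, 2))


if __name__ == "__main__":
    sample = (
        "property name=VolumeName value=CLDWDB01_82_02  annotation text\n"
        "property name=Slammer Software AX600 value=2060-00003-040220-013000\n"
    )
    props = parse_properties(sample)
    print(props["VolumeName"], decode_release(props["Slammer Software AX600"]))
```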

More information is in the callhome header, which can be viewed by running /home/cs/tools/bin/showxml on the callhome 01-02 (one of two) file. This provides the same view of the callhome header as the chsh.xml file in a set of scanned logs.
The ARU notice only names the SnapLUN:  CLDWDB01_82_02.

The callhome header file will indicate the initiating Event at the top of the file and the time of the event:

<InitiatingEvent>
<Event>
<EventID>ID1a39972b-d21d-b211-94b9-001b21217e48</EventID>
<EventType>SnapLUNDataLost</EventType>

<Timestamp>2012-03-09T14:06:50-08:00</Timestamp>   Time 14:06:50 is offset from GMT by -8 hours

<Name>VolumeName</Name>
<Value>CLDWDB01_82_02</Value>   This is the name of the SnapLUN, not the source LUN.
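
When correlating the event time with logs kept in GMT, the timestamp can be converted as sketched below. The function name to_utc is hypothetical. The sketch relies on the standard library parsing the offset that is already part of the timestamp string.

```python
from datetime import datetime, timezone


def to_utc(timestamp: str) -> datetime:
    """Convert an ISO 8601 timestamp with an offset to UTC (GMT)."""
    local = datetime.fromisoformat(timestamp)
    if local.tzinfo is None:
        raise ValueError("timestamp carries no offset from GMT")
    return local.astimezone(timezone.utc)


if __name__ == "__main__":
    # Timestamp from the InitiatingEvent excerpt: 14:06:50 at GMT-8.
    print(to_utc("2012-03-09T14:06:50-08:00").isoformat())
```

For the timestamp shown above, 14:06:50 at an offset of -8 hours corresponds to 22:06:50 GMT on the same day.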

In the callhome header file, search for the <SnapLUN><Name> entry that contains the SnapLUN name to locate the Source LUN.

<SnapLUN>
<Name>CLDWDB01_82_02</Name>
<SourceFQN>/CRSAN_DWDB01_82</SourceFQN>

<RootSourceLUNFQN>/CRSAN_DWDB01_82</RootSourceLUNFQN> 

The Source LUN is CRSAN_DWDB01_82. This is the LUN that should be modified to increase Clone Space, if the administrator chooses to do so.  The RootSourceLUNFQN differs from the SourceFQN only if the SourceFQN is itself a Clone or SnapLUN.

The <CurrentCapacity>1001</CurrentCapacity> indicates the size of the SnapLUN.  Other size values in this section of the callhome header are expressed in blocks and are difficult to work with.
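
The lookup from SnapLUN name to Source LUN can also be scripted. The sketch below is illustrative only: the function name find_source_lun and the <Header> wrapper element in the sample are hypothetical, since this document shows only a fragment of the callhome header. A real header file may nest these elements differently or use XML namespaces, in which case the tag names would need adjusting.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple

# Hypothetical, minimal stand-in for a callhome header file.
SAMPLE = """<Header>
  <SnapLUN>
    <Name>CLDWDB01_82_02</Name>
    <SourceFQN>/CRSAN_DWDB01_82</SourceFQN>
    <RootSourceLUNFQN>/CRSAN_DWDB01_82</RootSourceLUNFQN>
  </SnapLUN>
</Header>"""


def find_source_lun(header_xml: str, snaplun_name: str) -> Optional[Tuple[str, str]]:
    """Return (SourceFQN, RootSourceLUNFQN) for the named SnapLUN, or None."""
    root = ET.fromstring(header_xml)
    for snap in root.iter("SnapLUN"):
        if snap.findtext("Name") == snaplun_name:
            return snap.findtext("SourceFQN"), snap.findtext("RootSourceLUNFQN")
    return None


if __name__ == "__main__":
    print(find_source_lun(SAMPLE, "CLDWDB01_82_02"))
```

If the two returned values differ, the source is itself a Clone or SnapLUN, as described above.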

The Administrator Action details for the CloneStorageFull will identify the Source LUN, amount of repository space allocated by the customer, and the number of Clones of the Source LUN.

<AdministratorActionDetails>
<AdministratorActionID>ID0139972b-d21d-b211-94b9-001b21217e48</AdministratorActionID>
<AdministratorActionFQN>/CloneStorageFull/CRSAN_DWDB01_82</AdministratorActionFQN>  The Source LUN
<AdministratorActionType>CloneStorageFull</AdministratorActionType>
<CreationDate>2012-03-09T14:06:50-08:00</CreationDate>
<Name>CloneLUNStorageCurrentCapacity</Name>
<Value>100.486</Value>    Space currently in use by this SnapLUN
<Name>CloneLUNStorageFreeCapacity</Name>
<Value>0.000</Value>   Amount of free space in the Clone Space configured by the storage administrator
<Name>CloneLUNStorageMaximumCapacity</Name>
<Value>100.488</Value>    The total Clone Space configured by the storage administrator 
<Name>CloneLUNStorageUsedCapacity</Name>
<Value>100.486</Value>     The amount of space currently in use. 

The CloneLUNStorageUsedCapacity would only be different from the CloneLUNStorageCurrentCapacity if this is a thin provisioned Clone space.

<Name>LUNFQN</Name>
<Value>/CRSAN_DWDB01_82</Value>   The source LUN.  The administrator may wish to increase the Clone Space for this LUN name
<Name>TotalNumberOfCloneLUNs</Name>
<Value>1</Value>   The source LUN only has one Clone, or SnapLUN. 
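
The Name and Value pairs above can be collected into a dictionary to compute how full the repository is. This is a hedged sketch: the helper names parse_action_details and repository_utilization are hypothetical, and a regular expression is used because only a flat fragment of the XML is shown in this document, so the full structure is not assumed.

```python
import re

# A <Name> element followed by its <Value> element; trailing annotations
# after </Value> are skipped.
_PAIR = re.compile(r"<Name>([^<]+)</Name>\s*<Value>([^<]*)</Value>")


def parse_action_details(text: str) -> dict[str, str]:
    """Map each Name to its Value in an Administrator Action fragment."""
    return dict(_PAIR.findall(text))


def repository_utilization(details: dict[str, str]) -> float:
    """Return used capacity as a fraction of the maximum capacity."""
    used = float(details["CloneLUNStorageUsedCapacity"])
    maximum = float(details["CloneLUNStorageMaximumCapacity"])
    if maximum <= 0:
        raise ValueError("maximum capacity must be positive")
    return used / maximum


if __name__ == "__main__":
    fragment = """
    <Name>CloneLUNStorageMaximumCapacity</Name>
    <Value>100.488</Value>
    <Name>CloneLUNStorageUsedCapacity</Name>
    <Value>100.486</Value>
    <Name>LUNFQN</Name>
    <Value>/CRSAN_DWDB01_82</Value>
    """
    details = parse_action_details(fragment)
    print(details["LUNFQN"], f"{repository_utilization(details):.1%}")
```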


 



Attachments
This solution has no attachment
  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.