Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-72-1392210.1
Update Date:2012-08-29

Solution Type  Problem Resolution Sure

Solution  1392210.1 :   Pillar Axiom: NAS File Systems May Experience Journal Loss and Possible Data Loss After a Dual Slammer CU Failure  


Related Items
  • Pillar Axiom 300 Storage System
  • Pillar Axiom 600 Storage System
  • Pillar Axiom 500 Storage System
Related Categories
  • PLA-Support>Sun Systems>DISK>Pillar Axiom>SN-DK: Ax600


Axiom NAS file systems maintain journal information in the Battery Backed Memory on their buddy CU.

In this Document
Symptoms
 NAS File Systems may experience journal loss and possible data loss after a dual Slammer CU failure
 Synopsis
 Affected Hardware and Software Versions
Changes
Cause
 Problem
Solution
 Solution/Workaround


Applies to:

Pillar Axiom 500 Storage System - Version Not Applicable to Not Applicable [Release N/A]
Pillar Axiom 300 Storage System - Version Not Applicable to Not Applicable [Release N/A]
Pillar Axiom 600 Storage System - Version Not Applicable to Not Applicable [Release N/A]
Information in this document applies to any platform.

Symptoms

NAS File Systems may experience journal loss and possible data loss after a dual Slammer CU failure

Synopsis

In the unlikely event that both NAS Slammer Control Units fail in succession, the subsequent filesystem journal replay operations could possibly be interrupted, causing filesystem inconsistency.

Affected Hardware and Software Versions

This issue affects all NAS and NAS/SAN Axioms running AxiomONE releases 04.00.xx, 04.01.xx, or 04.02.xx (through 04.02.10). It does not affect SAN-only systems or any AxiomONE release prior to Release 4.0.

Hardware                 Software
Ax300/Ax500/Ax600 NAS    04.00.xx - All versions
                         04.01.xx - All versions
                         04.02.00 to 04.02.10
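
The affected ranges above can be expressed as a simple version check. The following is a minimal sketch only: the function names `parse_release` and `is_affected` are hypothetical, and it assumes the release is available as a three-part string such as "04.02.10". It is not an Oracle or Pillar tool, and it does not replace confirmation by Oracle Support.

```python
import re
from typing import Tuple

# Bounds taken from the advisory: 04.00.xx through 04.02.10 are affected;
# 04.02.11 and above contain the fix.
FIRST_AFFECTED = (4, 0, 0)
LAST_AFFECTED = (4, 2, 10)


def parse_release(release: str) -> Tuple[int, int, int]:
    """Parse a release string such as '04.02.10' into (4, 2, 10).

    The three-part dotted format is an assumption made for this sketch.
    """
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", release.strip())
    if match is None:
        raise ValueError(f"unrecognized release string: {release!r}")
    major, minor, build = (int(part) for part in match.groups())
    return major, minor, build


def is_affected(release: str, has_nas: bool) -> bool:
    """Return True if a system falls inside the affected range.

    SAN-only systems (has_nas=False) are not affected, per the advisory.
    """
    if not has_nas:
        return False
    return FIRST_AFFECTED <= parse_release(release) <= LAST_AFFECTED


if __name__ == "__main__":
    for rel in ("03.05.00", "04.00.03", "04.02.10", "04.02.11"):
        print(rel, "affected" if is_affected(rel, has_nas=True) else "not affected")
```

Tuple comparison keeps the range test to one line and treats any release at or above 04.02.11 as fixed, which matches the Solution section.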

Changes

N/A

Cause

Problem

A software defect has been discovered that may lead to NAS file system corruption after both Slammer Control Units fail.

Specifically, when both Slammer CUs fail in succession, the system attempts data recovery on both CUs. By design, the first CU failure transitions all filesystems into conservative mode, and the journals are then replayed to disk. In very rare conditions, unrelated activities can interrupt the journal replay and leave the journal inconsistent, which could in turn lead to filesystem inconsistency.
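
As a general illustration of why an interrupted journal replay is dangerous, the toy model below replays a two-entry journal fully and then partially. It is a generic sketch under stated assumptions, not the Axiom journal format or the actual defect: the `replay` function, the `interrupt_after` parameter, and the rename example are all hypothetical.

```python
def replay(journal, disk, interrupt_after=None):
    """Apply journal entries to disk in order.

    If interrupt_after is set, stop once that many entries have been
    applied, simulating a replay that is cut short. Returns the number
    of entries applied.
    """
    applied = 0
    for key, value in journal:
        if interrupt_after is not None and applied >= interrupt_after:
            break
        disk[key] = value
        applied += 1
    return applied


# Hypothetical example: one logical rename recorded as two journal entries.
journal = [
    ("dir/new_name", "inode 7"),  # step 1: create the new directory entry
    ("dir/old_name", None),       # step 2: clear the old directory entry
]
initial = {"dir/old_name": "inode 7"}

# Complete replay: only the new name refers to the inode.
full = dict(initial)
replay(journal, full)

# Interrupted replay: both names refer to the same inode.
partial = dict(initial)
replay(journal, partial, interrupt_after=1)

if __name__ == "__main__":
    print("full   :", full)
    print("partial:", partial)
```

In the interrupted case the on-disk state reflects only half of the logical operation, which is the kind of inconsistency a complete replay is meant to prevent.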

IMPORTANT!

Contact Oracle Support immediately if this issue is encountered.

Solution

Solution/Workaround

The fix for this issue is included in AxiomONE Release 04.02.11 and above. An upgrade to the currently recommended software level (04.02.11 or higher) will prevent this issue and is highly recommended.

Axiom NAS customers below AxiomONE Release 04.02.11 are urged to contact the Oracle Technical Support Center to open a Service Request and schedule an upgrade for remediation.

Common Questions

Question: How much risk is there?
Answer: Although this issue is triggered only when certain conditions are present, it is Pillar's position that any risk to data is too much. It is therefore strongly recommended that you take appropriate steps to remove that risk as soon as possible.

Question: How long have you known about this? Why didn’t we hear about this before?
Answer: This issue was discovered recently and a solution has been provided. Pillar is proactively issuing this alert following confirmation of the risk.

Question: Is the fix non-disruptive?
Answer: Yes. Those customers already on AxiomONE release 4.0, 4.1, or 4.2 may upgrade non-disruptively.  Contact the Support Center for your specific upgrade options.


Attachments
This solution has no attachment
  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.