Asset ID: 1-72-1389997.1
Update Date: 2012-08-27
Solution Type: Problem Resolution Sure Solution
1389997.1: Pillar Axiom: Adding new SATA-2 or Fibre Channel V2 Bricks to an Existing System May Fail
Related Items
- Pillar Axiom 300 Storage System
- Pillar Axiom 600 Storage System
- Pillar Axiom 500 Storage System
Related Categories
- PLA-Support>Sun Systems>DISK>Pillar Axiom>SN-DK: Ax600
A combination of code and manufacturing issues prevents SATA V2 or FC V2 Bricks from being added successfully to an Axiom.
Applies to:
Pillar Axiom 500 Storage System - Version Not Applicable to Not Applicable [Release N/A]
Pillar Axiom 600 Storage System - Version Not Applicable to Not Applicable [Release N/A]
Pillar Axiom 300 Storage System - Version Not Applicable to Not Applicable [Release N/A]
Information in this document applies to any platform.
Releases 04.03.01 through 04.03.02
Symptoms
Any attempt to add SATA V2 or FC V2 Bricks from manufacturing to an Ax300, Ax500, or Ax600 at release 04.03.01 through 04.03.02 will result in repeated Slammer CU panics in the ConMan component, leading to CU Failure and possibly to Emergency System Shutdown.
The error causes a buffer overflow that corrupts ConMan memory structures, typically resulting in a SEGV for the dumper.
Scanlogs tdsextract skips the affected records in the tds logs and reports entries such as:
WARN: Unprintable string parameter -- skipped the record 559677831 (CM_COD) ReconcileCodTask.cpp:391
WARN: Unprintable string parameter -- skipped the record 559677831 (CM_COD) ReconcileCodTask.cpp:391
WARN: Unprintable string parameter -- skipped the record 559677831 (CM_COD) ReconcileCodTask.cpp:391
WARN: Unprintable string parameter -- skipped the record 559677831 (CM_COD) ReconcileCodTask.cpp:391
WARN: Unprintable string parameter -- skipped the record 559677831 (CM_COD) ReconcileCodTask.cpp:391
WARN: Unprintable string parameter -- skipped the record 559677831 (CM_COD) ReconcileCodTask.cpp:391
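As a rough aid for reviewing tdsextract output, the following Python sketch counts warnings of the form shown above. The regular expression is derived only from the sample entries in this document, and the function name `count_skipped` is hypothetical; it is not part of Scanlogs or any Pillar tool.

```python
import re
from collections import Counter

# Pattern inferred from the sample entries above (an assumption, not a
# documented log format): record number, component, source file, line.
SKIP_RE = re.compile(
    r"WARN: Unprintable string parameter -- skipped the record (\d+) "
    r"\((\w+)\) (\S+):(\d+)"
)

def count_skipped(lines):
    """Count skipped-record warnings per (component, source file, line)."""
    counts = Counter()
    for line in lines:
        match = SKIP_RE.search(line)
        if match:
            _record, component, source, lineno = match.groups()
            counts[(component, source, int(lineno))] += 1
    return counts

# The six identical entries shown in this document.
sample = [
    "WARN: Unprintable string parameter -- skipped the record 559677831 "
    "(CM_COD) ReconcileCodTask.cpp:391"
] * 6

for key, total in count_skipped(sample).items():
    print(key, total)
```

Repeated warnings from the CM_COD component at the same source location are consistent with the symptom described here, but they should be treated as a hint for further investigation rather than a confirmed diagnosis.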
Changes
Any attempt to add new Bricks to the affected systems at the affected release.
Cause
Reference: Pillar Defect 64667
A change in the manufacturing process resulted in the COD information being written at the wrong physical offset on the COD LUNs of newly manufactured SATA V2 and FC V2 Bricks.
Release 04.03.00 changed the tracing in the ConMan component in order to determine why live additions of factory-fresh Bricks produced varying results and Administrator Actions. When ConMan discovered valid COD information at the wrong offset, this tracing overflowed its buffers and corrupted the memory structures just past the correct memory locations, resulting in the repeated panics.
On releases R1.x through R4.x, new Bricks from Manufacturing have the two COD LUNs created, with a valid COD signature and a System Serial Number of 9999999999. When such a Brick is added to an Axiom, this System Serial Number and the initialized but vacant COD information should cause the Brick to be added automatically to the Axiom storage pool, without generating any Administrator Actions or requiring any operations by the installer. R5.x and higher releases will generate an Administrator Alert asking the administrator to assign the new Brick to a Storage Domain.
New Bricks from Manufacturing had been generating Administrator Actions for Foreign Brick and/or Unknown Brick as they were added to running Axioms. Release 04.03.00 added the tracing code that resulted in the buffer overflows and the resulting software panics.
- Foreign Brick Administrator Actions should only be generated when the Axiom detects a new Brick, and notes that the System Serial Number is not 9999999999. The AA is generated to warn that accepting the Brick into the Axiom will delete any and all data that may be on the Brick. The assumption is that this Brick has been taken from another Axiom or has been removed from this Axiom and the COD generation is outdated.
- Unknown Brick Administrator Actions should be generated only when the Axiom detects a new Brick but the COD signature is invalid. The assumption is that this Brick has an unknown history and may or may not contain user data that will be deleted if it is accepted.
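The intended behavior described above for R1.x through R4.x can be summarized as a small decision function. This is a minimal sketch of the rules as stated in this document, not the ConMan implementation; the names `classify_new_brick`, `BrickDisposition`, and `FACTORY_SERIAL` are hypothetical.

```python
from enum import Enum

# System Serial Number written by Manufacturing, per this document.
FACTORY_SERIAL = "9999999999"

class BrickDisposition(Enum):
    AUTO_ADD = "added to the storage pool, no Administrator Action"
    FOREIGN_BRICK = "Foreign Brick Administrator Action"
    UNKNOWN_BRICK = "Unknown Brick Administrator Action"

def classify_new_brick(signature_valid: bool, system_serial: str) -> BrickDisposition:
    """Return the expected handling of a newly detected Brick (R1.x to R4.x)."""
    if not signature_valid:
        # Invalid COD signature: unknown history, may contain user data.
        return BrickDisposition.UNKNOWN_BRICK
    if system_serial == FACTORY_SERIAL:
        # Factory-fresh Brick with initialized but vacant COD information.
        return BrickDisposition.AUTO_ADD
    # Valid signature but a non-factory serial number: the Brick is assumed
    # to come from another Axiom or to carry an outdated COD generation.
    return BrickDisposition.FOREIGN_BRICK

print(classify_new_brick(True, "9999999999").name)
print(classify_new_brick(True, "1234567890").name)
print(classify_new_brick(False, "9999999999").name)
```

The serial number `1234567890` is an invented example value. The sketch does not model R5.x and higher, where a new Brick raises an Administrator Alert for Storage Domain assignment.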
Solution
IMPORTANT!
Contact Pillar Data Systems Customer Support immediately if this issue is encountered.
Solution/Workaround
The fix for this issue is included in AxiomONE Release 4.3 (04.03.03 and above). An upgrade to 04.03.03 or the current recommended patch level above 04.03.03 will prevent this issue and is highly recommended.
Axiom customers below AxiomONE release 04.03.03 are urged to contact the Pillar World Wide Customer Support Center to open a Service Request and schedule an upgrade for remediation.
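For sites that track release levels in scripts, the affected range stated in this document can be checked with a short comparison. This is a sketch that assumes release strings use the dotted `NN.NN.NN` form shown in this document; the helper names `parse_release` and `is_affected` are hypothetical, and the result is no substitute for confirmation by Customer Support.

```python
# Affected range as stated in this document: 04.03.01 through 04.03.02.
AFFECTED_FIRST = (4, 3, 1)
AFFECTED_LAST = (4, 3, 2)

def parse_release(release: str) -> tuple:
    """Convert a dotted release string such as '04.03.01' to a tuple of ints."""
    return tuple(int(part) for part in release.split("."))

def is_affected(release: str) -> bool:
    """True if the release falls inside the affected range stated here."""
    return AFFECTED_FIRST <= parse_release(release) <= AFFECTED_LAST

for release in ("04.03.01", "04.03.02", "04.03.03"):
    print(release, is_affected(release))
```

Tuples of integers compare element by element, so `04.03.03` and later releases fall outside the range without any string manipulation.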
Common Questions
Question: How much risk is there?
Answer: Although this issue is triggered only when certain conditions exist, it is Pillar's position that any risk to data is too much. It is therefore strongly recommended that you take appropriate steps to remove that risk as soon as possible.
Question: How long have you known about this? Why didn’t we hear about this before?
Answer: This issue was discovered during internal testing very recently. Pillar is proactively issuing this notice following confirmation of risk.
Question: Is the fix non-disruptive?
Answer: Yes. Those customers already on AxiomONE release 4.0, 4.1, 4.2, or 4.3 may upgrade non-disruptively.
Attachments
This solution has no attachment