Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-77-1019331.1
Update Date: 2012-07-25
Keywords:

Solution Type  Sun Alert Sure

Solution  1019331.1 :   Controller Firmware for SE6130, ST6140 and ST6540 on Solaris may not Failover Array LUNs  


Related Items
  • Sun Storage 6540 Array
  • Sun Storage 6130 Array
  • Sun Storage 6140 Array
Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: Sun Alert
  • .Old GCS Categories>Sun Microsystems>Sun Alert>Release Phase>Resolved

PreviouslyPublishedAs
238545


Bug Id
SUNBUG: 6585914

Date of Resolved Release
06-Jun-2008

***Checked for relevance on 25-Jul-2012***

1. Impact

In the event of a controller RPA Memory Fault, the array controllers do not log out
of the Fibre Channel Storage Area Network (SAN) as expected.

This prevents the scsi_vhci driver from issuing a failover request, so I/O
is retried continuously instead of being failed.

2. Contributing Factors

This issue may occur in the following releases:
  • Sun StorEdge 6130 Array (on Solaris) without array firmware 06.19.25.16
  • Sun StorageTek 6140 Array (on Solaris) without array firmware 06.19.25.16
  • Sun StorageTek 6540 Array (on Solaris) without array firmware 06.19.25.16
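
The firmware condition above can be checked with a short script. The following is a minimal sketch, not a Sun-provided tool: it assumes the firmware level is a dotted numeric string that can be compared field by field, and the names parse_firmware and is_affected are hypothetical. How the installed firmware string is obtained from the array is not covered here.

```python
from typing import Tuple


def parse_firmware(version: str) -> Tuple[int, ...]:
    """Split a dotted firmware string such as '06.19.25.16' into integers."""
    return tuple(int(part) for part in version.strip().split("."))


# Firmware level named in this Sun Alert as containing the fix.
FIXED_FIRMWARE = parse_firmware("06.19.25.16")


def is_affected(installed: str) -> bool:
    """Return True when the installed firmware predates the fixed level.

    Assumption: levels order field by field, left to right.
    """
    return parse_firmware(installed) < FIXED_FIRMWARE
```

For example, is_affected("06.19.25.10") is True, while is_affected("06.19.25.16") is False.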

3. Symptoms

This is an example of an RPA Memory Fault:

Event Type : 6540.ProblemEvent.REC_RPA_ERR_CTL
Severity : 0

---- Sample Description ----
An RPA memory parity error was detected on controller {0}

---- Probable Cause ----
An RPA memory error has been reported on a controller.

---- Recommended Action ----
Replace the controller.


As a result, the array controller is held in reset, with
the controller tray ID showing "88". The array logs then start
to fill up with target resets:

Date/Time: Thu May 22 20:44:02 MSD 2008
Sequence number: 10330
Event type: 1202
Event category: Error
Priority: Informational
Description: Fibre channel - TGT reset received
Event specific codes: 0/0/0
Component type: Controller
Component location: Controller in slot B
Logged by: Controller in slot B

The target resets are the result of the "Retryable" errors from the host(s).
The Solaris /var/adm/messages file will fill up with messages similar to:

May 22 20:02:33 myhost scsi: [ID 107833 kern.warning] WARNING:
/scsi_vhci/ssd@g600a0b80001111110000121212121212 (ssd54):
May 22 20:02:33 myhost Error for Command: read(10)
Error Level: Retryable
May 22 20:02:33 myhost scsi: [ID 107833 kern.notice] Requested
Block: 113135664 Error Block: 113135664
May 22 20:02:33 myhost scsi: [ID 107833 kern.notice] Vendor:
STK Serial Number:
May 22 20:02:33 myhost scsi: [ID 107833 kern.notice] Sense Key:
Not Ready
May 22 20:02:33 myhost scsi: [ID 107833 kern.notice] ASC: 0x4
(<vendor unique code 0x4>), ASCQ: 0x1, FRU: 0x0

The log will not contain the "initiating failover" messages that would
normally be expected for an array controller fault.
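
The host-side symptom described above (Retryable errors with a Not Ready sense key and no failover message) can be screened for with a small script. This is a minimal sketch under stated assumptions: the search strings are taken from the sample messages in this article, the exact wording of the failover message may differ on a given system, and the function names are hypothetical. A match is only a hint and should be confirmed against the array event log.

```python
import re

# Patterns taken from the sample /var/adm/messages output in this article.
# \s* also spans a line break, so wrapped log lines still match.
RETRYABLE = re.compile(r"Error Level:\s*Retryable")
NOT_READY = re.compile(r"Sense Key:\s*Not Ready")
# Assumption: a normal failover logs text containing "initiating failover".
FAILOVER = re.compile(r"initiating failover", re.IGNORECASE)


def summarize(log_text: str) -> dict:
    """Count occurrences of each pattern in the supplied log text."""
    return {
        "retryable": len(RETRYABLE.findall(log_text)),
        "not_ready": len(NOT_READY.findall(log_text)),
        "failover": len(FAILOVER.findall(log_text)),
    }


def matches_symptom(log_text: str) -> bool:
    """True when retryable Not Ready errors appear with no failover message."""
    counts = summarize(log_text)
    return (
        counts["retryable"] > 0
        and counts["not_ready"] > 0
        and counts["failover"] == 0
    )
```

On a Solaris host the function could be fed the contents of /var/adm/messages, for example matches_symptom(open("/var/adm/messages").read()).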

4. Workaround

The only workaround is to remove the array connection to the SAN for the
faulted controller, or to offline those ports connected to the faulted
controller by using a switch management interface for the SAN.

5. Resolution

This issue is addressed in the following releases:
  • Sun StorEdge 6130 Array (on Solaris) with firmware 06.19.25.16 or later
  • Sun StorageTek 6140 Array (on Solaris) with firmware 06.19.25.16 or later
  • Sun StorageTek 6540 Array (on Solaris) with firmware 06.19.25.16 or later
The above firmware is provided by Common Array Manager 6.0.0 or later
releases available at:

http://www.sun.com/download/index.jsp?tab=2


and specifically:

http://www.sun.com/download/products.xml?id=470d094a

Modification History:

25-Jul-2012: Maintenance check for relevance/currency, no change in content

Product
Sun StorageTek 6130 Array
Sun StorageTek 6140 Array
Sun StorageTek 6540 Array

Attachments
This solution has no attachment
  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.