Document Audience: | INTERNAL |
Document ID: | I1144-1 |
Title: | Clustered Netra D130 array causes various SCSI errors under load if LVD SCSI Disk Drives are installed to replace failed single ended SCSI disk drives in a cross cabled configuration. |
Copyright Notice: | Copyright © 2005 Sun Microsystems, Inc. All Rights Reserved |
Update Date: | 2005-03-29 |
------------------------------------------------------------
- Sun Proprietary/Confidential: Internal Use Only -
------------------------------------------------------------------------
*** Sun Confidential: Internal Use and Authorized VARs Only ***
________________________________________________________________________
This message including any attachments is confidential information
of Sun Microsystems, Inc. Disclosure, copying or distribution is
prohibited without permission of Sun. If you are not the intended
recipient, please reply to the sender and then delete this message.
________________________________________________________________________
FIELD INFORMATION NOTICE
(For Authorized Distribution by Sun Service)
FIN #: I1144-1
Synopsis: Clustered Netra D130 array causes various SCSI errors under load if LVD SCSI Disk Drives are installed to replace failed single ended SCSI disk drives in a cross cabled configuration.
Create Date: Jan/12/05
SunAlert: No
Top FIN/FCO Report: No
Products Reference: Sun StorEdge st D130 Array
Product Category: Storage / Diag-Doc-Service
Product Affected:
Systems Affected:
-----------------
Mkt_ID   Platform   Model   Description   Serial Number
------   --------   -----   -----------   -------------
-        N14        ALL     Netra t1405   -
-        N15        ALL     Netra t1400   -
X-Options Affected:
-------------------
Mkt_ID   Platform   Model   Description                   Serial Number
------   --------   -----   -----------                   -------------
-        st D130    ALL     Netra st D130 Array           -
X1032A   -          -       OPT INT PCI 10/100BASET NIC   -
Parts Affected:
----------------------
Part Number   Description                      Model
-----------   -----------                      -----
390-0069-03   DRV SEA 36GB 10K 1-in SCSI3      -
390-0109-05   DRV SEA 36GB 10K 1-in SCSI4      -
390-0156-03   DRV FJ 36GB 10K1 SCSI4-T485-61   -
References:
ESC: 1-3909668
Issue Description:
Installing LVD SCSI disk drives as replacements for failed single-ended
SCSI disk drives in a cross-cabled configuration with clustered Netra
D130 arrays causes various SCSI errors under load.
The affected configuration is a SunCluster 3.0 cluster in which the two
host nodes are Netra t 1400 servers connected to a pair of Netra D130
arrays. A Single-Ended Ultra/Wide SCSI/FastEthernet adapter (SunSwift
PCI) is installed in each host. The hosts are cabled to the D130 arrays
using four 0.8 meter SCSI cables in a cross-connected fashion.
The onboard SCSI connector of the Node 0 Netra t 1400 is connected to
the SCSI in port of the Array 1 Netra D130. The SunSwift HBA of this
host is connected to the SCSI out port of the Array 2 Netra D130.
The onboard SCSI connector of the Node 1 Netra t 1400 is connected to
the SCSI in port of the Array 2 Netra D130. The SunSwift HBA of this
host is connected to the SCSI out port of the Array 1 Netra D130.
Diagrams are shown below:
                +-------------+
Netra t 1400    |           |-|-----+
Node 0          | =           |     |
                +-|-----------+     |
                  +---------+       |
                +-----------|-+     |
Netra st D130   |           = |     |
Array 1         |           =-|--+  |
                +-------------+  |  |
                                 |  |
                                 |  |
                +-------------+  |  |
Netra st D130   |        +--= |  |  |
Array 2         |        |  =-|--)--+
                +--------|----+  |
                         |       |
                  +------+       |
                +-|-----------+  |
Netra t 1400    | |         |-|--+
Node 1          | =           |
                +-------------+
There are several factors contributing to the observed issue.
. In this configuration, given the 1 meter internal bus length of
the D130 array, the 0.8 meter SCSI cables used to connect the
hosts, and the 5 total targets on the SCSI bus, the overall
SCSI bus length exceeds the 1.5 meter maximum bus length for this
configuration. This introduces potential SCSI signal
degradation and thus SCSI errors.
. The onboard SCSI adapter on one host is cross-connected to the
SunSwift HBA on the second host, which is not a typical arrangement.
. The LVD SCSI disk drives appear to be more sensitive to the
extended bus length / cross-cabled configuration than single-ended
SCSI disks and thus generate the SCSI errors.
All of these factors combined to produce the SCSI errors observed onsite.
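The bus-length arithmetic behind the first factor can be sketched as
follows. This is a rough estimate only. It assumes that each shared bus
consists of one D130 internal bus plus the two 0.8 meter host cables
attached to its SCSI in and SCSI out ports, and it ignores any wiring
inside the host nodes. The function and constant names are illustrative
and do not come from any Sun tool.

```python
# Rough bus-length estimate for one shared SCSI bus in the cross-cabled
# configuration. The lengths and the limit are the figures quoted in this
# FIN; wiring inside the host nodes is ignored (an assumption).

D130_INTERNAL_BUS_M = 1.0   # internal bus length of one D130 array
HOST_CABLE_M = 0.8          # length of each external SCSI cable
CABLES_PER_BUS = 2          # assumed: one cable on SCSI in, one on SCSI out
MAX_BUS_M = 1.5             # maximum bus length quoted for this configuration


def estimated_bus_length_m(cables=CABLES_PER_BUS):
    """Return the estimated end-to-end length of one shared bus in meters."""
    return D130_INTERNAL_BUS_M + cables * HOST_CABLE_M


def exceeds_limit(length_m, limit_m=MAX_BUS_M):
    """Return True when the estimated length is over the quoted maximum."""
    return length_m > limit_m


if __name__ == "__main__":
    length = estimated_bus_length_m()
    print(f"estimated bus length: {length:.1f} m (limit {MAX_BUS_M:.1f} m)")
    print("over limit" if exceeds_limit(length) else "within limit")
```

Under these assumptions the estimate is about 2.6 meters per bus. Even
with a single 0.8 meter cable attached, the estimate of 1.8 meters would
already be over the 1.5 meter figure.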
The following are examples of the SCSI errors seen on the D130 array.
SCSI Errors reported against the SunSwift HBA:
Sep 20 03:16:47 DS2cable0 SCSI: [ID 107833 kern.warning] WARNING:
/pci@1f,4000/pci@5/SUNW,isptwo@4 (isp0):
Sep 20 03:16:47 DS2cable0 Interrupt bit still set after 10 seconds.
Card or firmware failure.
Sep 20 03:16:51 DS2cable0 SCSI: [ID 107833 kern.warning] WARNING:
/pci@1f,4000/pci@5/SUNW,isptwo@4/sd@a,0 (sd9):
Sep 20 03:16:51 DS2cable0 SCSI transport failed: reason 'reset':
retrying command.
SCSI Errors reported against the onboard SCSI adapter:
Sep 20 03:27:38 DS2cable0 SCSI: [ID 107833 kern.warning] WARNING:
/pci@1f,4000/scsi@3,1 (glm1):
Sep 20 03:27:38 DS2cable0 Resetting SCSI bus, Message-In was expected
from (11,0)
Sep 20 03:27:38 DS2cable0 genunix: [ID 408822 kern.info] NOTICE: glm1:
fault detected in device; service still available
Sep 20 03:27:38 DS2cable0 genunix: [ID 611667 kern.info] NOTICE: glm1:
Resetting SCSI bus, Message-In was expected from (11,0)
Sep 20 03:27:38 DS2cable0 SCSI: [ID 107833 kern.warning] WARNING:
/pci@1f,4000/scsi@3,1 (glm1):
Sep 20 03:27:38 DS2cable0 Target 11 reducing sync. transfer rate
Sep 20 03:27:38 DS2cable0 glm: [ID 923092 kern.warning] WARNING:
ID[SUNWpd.glm.sync_wide_backoff.6014]
Sep 20 03:27:38 DS2cable0 SCSI: [ID 107833 kern.warning] WARNING:
/pci@1f,4000/scsi@3,1 (glm1):
Sep 20 03:27:38 DS2cable0 got SCSI bus reset
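A minimal sketch of how a messages file could be scanned for the error
signatures shown above. The message fragments are copied from the sample
entries in this FIN; the labels, function name and sample list are
illustrative assumptions, and the list of fragments is not exhaustive.

```python
import re
from collections import Counter

# Message fragments copied from the sample log entries in this FIN.
# The dictionary keys are illustrative labels, not driver terminology.
SCSI_ERROR_PATTERNS = {
    "interrupt_bit_stuck": re.compile(r"Interrupt bit still set"),
    "transport_failed": re.compile(r"SCSI transport failed"),
    "resetting_bus": re.compile(r"Resetting SCSI bus"),
    "sync_rate_reduced": re.compile(r"reducing sync\. transfer rate"),
    "got_bus_reset": re.compile(r"got SCSI bus reset"),
}


def count_scsi_errors(lines):
    """Count how many lines contain each known message fragment."""
    counts = Counter()
    for line in lines:
        for label, pattern in SCSI_ERROR_PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


# A few lines taken from the sample output above, used as demo input.
SAMPLE_LINES = [
    "Sep 20 03:16:51 DS2cable0 SCSI transport failed: reason 'reset':",
    "Sep 20 03:27:38 DS2cable0 Resetting SCSI bus, Message-In was expected",
    "Sep 20 03:27:38 DS2cable0 Target 11 reducing sync. transfer rate",
    "Sep 20 03:27:38 DS2cable0 got SCSI bus reset",
]

if __name__ == "__main__":
    for label, count in sorted(count_scsi_errors(SAMPLE_LINES).items()):
        print(f"{label}: {count}")
```

Because count_scsi_errors() accepts any iterable of lines, an open file
object for a saved copy of the system messages file can be passed to it
directly.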
To check whether LVD drives are installed, issue an 'iostat -En' command
from an operating system prompt, review the output, and look for any of
the three Vendor/Product combinations listed below:
Vendor: FUJITSU Product: MAP3367NC
Vendor: SEAGATE Product: ST336605LC
Vendor: SEAGATE Product: ST336607LC
See the sample iostat -En output below:
# iostat -En
sd10 Soft Errors: 0 Hard Errors: 2 Transport Errors: 0
Vendor: FUJITSU Product: MAP3367NC SUN18G Revision: 0804 Serial No: 05P32232
Size: 18.11GB <18110967808 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 2 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
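The check described above can be sketched as a small parser over saved
'iostat -En' output. This is an illustration under stated assumptions:
it expects the layout of the sample output in this FIN, and it compares
the first whitespace-delimited token of the Product field exactly
against the three product names listed above. Product strings reported
by other drive firmware may be formatted differently. All names in the
code are hypothetical.

```python
import re

# Product names listed in this FIN as the ones to look for.
LVD_PRODUCTS = {"MAP3367NC", "ST336605LC", "ST336607LC"}

# A device block starts with "<device> Soft Errors: ...".
DEVICE_RE = re.compile(r"^(\S+)\s+Soft Errors:")
# Capture the vendor and the first token of the product field.
PRODUCT_RE = re.compile(r"Vendor:\s*(\S+)\s+Product:\s*(\S+)")


def find_lvd_drives(iostat_output):
    """Return (device, vendor, product) tuples for listed LVD products."""
    matches = []
    device = None
    for line in iostat_output.splitlines():
        line = line.strip()
        m = DEVICE_RE.match(line)
        if m:
            device = m.group(1)
            continue
        m = PRODUCT_RE.search(line)
        if m and device is not None and m.group(2) in LVD_PRODUCTS:
            matches.append((device, m.group(1), m.group(2)))
    return matches


# The sample output shown above, used as demo input.
SAMPLE = """\
sd10 Soft Errors: 0 Hard Errors: 2 Transport Errors: 0
Vendor: FUJITSU Product: MAP3367NC SUN18G Revision: 0804 Serial No: 05P32232
Size: 18.11GB <18110967808 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 2 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
"""

if __name__ == "__main__":
    for device, vendor, product in find_lvd_drives(SAMPLE):
        print(f"{device}: {vendor} {product}")
```

Any device reported by this sketch still needs to be confirmed against
the drive part number before a replacement is ordered.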
Under normal conditions, LVD drives can be used as a one-for-one
replacement for single-ended SCSI disks. In the configuration detailed
above, however, the LVD disks, the extended SCSI bus length of the
cluster configuration, and the use of differing HBAs in a
cross-connected fashion together contribute to the SCSI signal
degradation that leads to the SCSI errors.
Replacing the existing LVD disks in affected Netra D130 arrays with
single-ended SCSI disks resolves the SCSI errors.
Implementation:
---
| | MANDATORY (Fully Proactive)
---
---
| | CONTROLLED PROACTIVE (per Sun Geo Plan)
---
---
| X | REACTIVE (As Required)
---
Corrective Action:
The following recommendation is provided as a guideline for authorized
Sun Services Field Representatives who may encounter the above
mentioned issue.
When replacing failed drives in the D130 arrays, use only single-ended
SCSI drives as replacements. The following procedure must be followed
to ensure that an appropriate drive is shipped as a replacement.
1. NOTE that the FE/SSE needs to contact the RSL directly
(the Parts Call Center will not be able to do this for them).
Upon doing so, DO NOT advise the RSL to open the box. Having
RSLs open boxes will introduce risk to the part, as they do not
have any ESD or parts handling training. Instead, the RSL
should check the drive base part number that is noted on the
external box label:
TOP LEVEL FRU: 540-4689
Different versions of bare drives that go into this
same top level FRU part number are noted on the label:
* 390-0050 Single Ended SCSI (OK)
* 390-0051 Single Ended SCSI (OK)
* 390-0052 Single Ended SCSI (OK)
* 390-0069 LVD SCSI (not acceptable)
* 390-0109 LVD SCSI (not acceptable)
* 390-0156 LVD SCSI (not acceptable)
2. Make sure FE/SSEs are aware that having the RSLs manually check
the boxes may slightly increase the order processing time.
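The label check in step 1 can be expressed as a simple lookup. The
sketch below is illustrative only: the part number sets are copied from
the list above, while the function names and the handling of a trailing
dash-level suffix (for example "-03") are assumptions.

```python
# Base drive part numbers from the list in this FIN.
SINGLE_ENDED_PARTS = {"390-0050", "390-0051", "390-0052"}
LVD_PARTS = {"390-0069", "390-0109", "390-0156"}


def base_part_number(part_number):
    """Drop any suffix after the second field: '390-0069-03' -> '390-0069'."""
    fields = part_number.strip().split("-")
    return "-".join(fields[:2])


def classify_drive(part_number):
    """Return 'ok', 'not acceptable', or 'unknown' for a drive part number."""
    base = base_part_number(part_number)
    if base in SINGLE_ENDED_PARTS:
        return "ok"
    if base in LVD_PARTS:
        return "not acceptable"
    return "unknown"


if __name__ == "__main__":
    for label in ("390-0050", "390-0069-03", "390-0156-03"):
        print(label, "->", classify_drive(label))
```

A part number that appears in neither list is reported as "unknown"
rather than assumed to be acceptable.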
Comments:
Acronyms used for this FIN:
LVD - Low Voltage Differential
============================================================================
NOTE: FIN Tracking Instructions for Radiance/SPWeb:
--------------------------------------------------
If a Radiance case involves the application of a FIN to solve a customer
issue, please complete the following steps in Radiance/SPWeb prior to
closing the case:
o Select "Field Information Notice" in the REFERENCE TYPE field.
o Enter FIN ID number in the REFERENCE ID field.
For example: I1111-1.
If possible, include additional details in the REFERENCE SUMMARY field
(e.g., implementation complete, customer declined).
--------------------------------------------------------------------------
Implementation Notes:
--------------------
In case of "Mandatory" FINs, Sun Services will attempt to contact
all known customers to recommend proactive implementation.
For "Controlled Proactive" FINs, Sun Services mission critical
support teams will initiate proactive implementation efforts for
their respective accounts as required.
For "Reactive" FINs, Sun Services and partners will implement
the necessary corrective actions as the need arises.
Billing Information:
-------------------
Warranty: On-Site Labor Rates are based on specified Warranty deliverables
for the affected product.
Contract: On-Site Labor Rates are based on the type of service contract.
Non Contract: On-Site implementation by Sun is available based on On-Site
Labor Rates defined in the Price List.
--------------------------------------------------------------------------
All FIN documents are accessible via Internal SunSolve. Type "sunsolve"
in a browser and follow the prompts to Search Collections.
For questions on this document, please email:
[email protected]
The FIN and FCO homepage is available at:
http://sdpsweb.central/FIN_FCO/index.html
For more information on how to submit a FIN, go to:
http://pronto.central/fin.html
To access the Service Partner Exchange, use:
https://spe.sun.com
--------------------------------------------------------------------------