Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition
Solution Type: Troubleshooting Sure
Solution 1008193.1: Sun Storage 351x Arrays: Troubleshooting the Cabling
Previously Published As: 211252 Troubleshooting Sun Storage[TM] 351x Cabling.

Applies to:
Sun Storage 3511 SATA Array - Version Not Applicable and later
Sun Storage 3510 FC Array - Version Not Applicable and later
All Platforms

Purpose

Description
- Path degraded/offline
- Loop offline
- SCSI time-outs
- Drives and/or Controllers missing or failed
- No access to LUNs
- LED unlit

Sample Warnings from /var/adm/messages:

  Aug 4 23:25:09 edb mpxio: [ID 779286 kern.info] /scsi_vhci/ssd@g600c0ff000000000002eb07e180f0300 (ssd6) multipath status: degraded, path /pci@9,600000/SUNW,qlc@1/fp@0,0 (fp2) to target address: 226000c0ffb02eb0,0 is offline
  Aug 4 23:25:09 edb mpxio: [ID 779286 kern.info] /scsi_vhci/ssd@g600c0ff000000000002eb07e180f0300 (ssd6) multipath status: failed, path /pci@8,700000/SUNW,qlc@4/fp@0,0 (fp3) to target address: 216000c0ff802eb0,0 is offline

Timeouts and SCSI transport errors:

  Nov 8 10:33:20 lccdb4 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,700000/SUNW,qlc@1,1/fp@0,0/ssd@w256000c0ffc83b5c,2 (ssd2):
  Nov 8 10:33:20 lccdb4 SCSI transport failed: reason 'timeout': retrying command
  Nov 8 13:57:54 lccdb4 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,700000/SUNW,qlc@1,1/fp@0,0/ssd@w256000c0ffc83b5c,0 (ssd4):
  Nov 8 13:57:54 lccdb4 SCSI transport failed: reason 'timeout': retrying command
  Nov 8 15:08:49 lccdb4 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,700000/SUNW,qlc@1,1/fp@0,0/ssd@w256000c0ffc83b5c,2 (ssd2):
  Nov 8 15:08:49 lccdb4 SCSI transport failed: reason 'timeout': giving up

Loop Offline:

  qlc: [ID 686697 kern.info] NOTICE: Qlogic qlc(2): Loop OFFLINE
  fctl: [ID 517869 kern.warning] WARNING: 162=>fp(2)::OFFLINE timeout
  scsi: [ID 107833 kern.warning] WARNING: /pci@8,700000/SUNW,qlc@5,1/fp@0,0/ssd@w226000c0ff901ef4,0 (ssd12): transport rejected (-2)
  scsi: [ID 243001 kern.info] /pci@8,700000/SUNW,qlc@5,1/fp@0,0 (fcp2): offlining lun=0 (trace=0), target=a6 (trace=2800004)

From array event log:

  Wed Feb 18 10:13:31 2004 [0111] #4: StorEdge Array SN#8002658 Controller ALERT: redundant controller failure detected

Troubleshooting Steps
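Before working through the steps, it can help to count how often each of the sample warnings above appears in a saved copy of /var/adm/messages. The following is only a minimal sketch: the names SYMPTOMS, classify and summarize are hypothetical, the patterns are derived solely from the sample messages in this document, and it assumes each message sits on a single line. Real messages may differ by driver version.

```python
import re

# Patterns derived from the sample warnings in this document (an assumption:
# other driver versions may word these messages differently).
SYMPTOMS = {
    "path_offline": re.compile(r"multipath status: (degraded|failed), path \S+ .* is offline"),
    "scsi_timeout": re.compile(r"SCSI transport failed: reason 'timeout'"),
    "loop_offline": re.compile(r"Loop OFFLINE"),
    "transport_rejected": re.compile(r"transport rejected"),
    "lun_offlined": re.compile(r"offlining lun=\d+"),
}


def classify(line):
    """Return the names of all symptom patterns found in one log line."""
    return [name for name, pattern in SYMPTOMS.items() if pattern.search(line)]


def summarize(lines):
    """Count how many lines match each symptom pattern."""
    counts = {}
    for line in lines:
        for name in classify(line):
            counts[name] = counts.get(name, 0) + 1
    return counts
```

To use it, read the saved log into a list of lines and pass that list to summarize(). A non-empty result only indicates that the symptoms are present; it does not identify the failing cable or SFP.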
Steps to Follow:

NOTE: This is a subset of Document: "Troubleshooting Sun Storage[TM] 33x0/351x Hardware."

1. Confirm supported cabling configuration is used:
   http://download.oracle.com/docs/cd/E19236-01/816-7300-21/appb_jbod.html#pgfId-1000275
   http://download.oracle.com/docs/cd/E19236-01/816-7300-21/appb_jbod.html#pgfId-1000285

   Note: 3511 FC does not support the use of JBOD arrays:
   http://download.oracle.com/docs/cd/E19236-01/816-7300-21/appb_jbod.html#pgfId-1000092
   http://download.oracle.com/docs/cd/E19236-01/816-7300-21/ch04_cable.html#20126

   For configurations with SunCluster, see SunCluster 3.x With Sun Storage 3510 or 3511

2. Confirm that the appropriate ports are in use:
   Dual controller
For Sun Storage 3511 SATA arrays, channels 0 and 1 automatically configure their ports to match the transfer speed and communication method of each connection. Channels 4 and 5 support only a 2-Gbit transfer rate. If you connect two servers to channel 0 or to channel 1, use host filtering if you want to control host access to logical drives.
3. Confirm the connected SFPs are green (if visual inspection of the cable and SFP connection is possible).
   SFP link status:
   Solid green - Active, good FC connection
   Off - Empty or failed FC connection

4. Verify the correct cable is used.
   https://support.oracle.com/handbook_private/Systems/3510_R/components.html#Cables
   https://support.oracle.com/handbook_private/Systems/3511/components.html#Cables

5. Verify channel/port connection type and speed by issuing a show channels command from the sccli prompt:

   3510 example:

   sccli> show channels
   Ch  Type     Media  Speed  Width   PID / SID
   --------------------------------------------
   0   Host     FC(L)  2G     Serial  40 / 41
   1   Host     FC(L)  2G     Serial  43 / 42
   2   DRV+RCC  FC(L)  2G     Serial  14 / 15
   3   DRV+RCC  FC(L)  2G     Serial  14 / 15
   4   Host     FC(L)  2G     Serial  44 / 45
   5   Host     FC(L)  2G     Serial  47 / 46
   6   Host     LAN    N/A    Serial  NA / NA

   In the above 3510 output, the Speed column shows that hosts are attached to channels 0, 1, 4 and 5 at 2-Gbit transfer speeds. There is no host connected to channel 6 (speed=N/A).

   Note - Speed values are displayed for the primary controller only. Therefore, if a user maps one LUN to the primary controller and another LUN to a secondary controller, only the established connection to the primary controller is displayed. As a result, if a primary ID is not mapped to a channel and a secondary ID is mapped, "Async" displays in the Speed field.

   Note - For FC or SATA, a speed value of Async may mean no link or link down if it is on a channel with a PID.

6. Confirm SFP connection by using the sccli show bypass command.

   The show bypass sfp command displays the bypass status of all small form-factor pluggable (SFP) transceivers on a specified loop.

   Note - Loop A and Loop B refer to the redundant FC loops that each device is connected to. The SES device in the top slot of the chassis is connected to Loop A, which is the first drive channel. The bottom SES device is connected to Loop B, which is the second drive channel.
   sccli> show bypass sfp ses-channel 2 loop loopa
   PORT ENCL-ID ENCL-TYPE LOOP   BYP-STATUS    ATTRIBUTES
   ---- ------- --------- ----   ----------    SH
   1    0       RAID      LOOP-A Not-Installed --
   R    0       RAID      LOOP-A Not-Installed --
   4    0       RAID      LOOP-A Not-Installed --

   Note: The L and the R ports as shown above, or any designated drive channel, must not have an unused SFP installed. An unused SFP shows a BYP-STATUS of Bypassed and ATTRIBUTES of -H. LOOP-A refers to the controller in the top slot and LOOP-B is the controller in the bottom slot.

   If a device is bypassed, the Attributes returned values include S, F, or H.
   * An S means the device was bypassed due to a Sun Storage CLI command.
   * An F means a drive fault caused the bypass.
   * An H means the device was bypassed due to a hardware problem (no signal was present).

   The following example, run on channel 2, shows the bypass information for a Sun Storage 3511 SATA array on loop A:

   sccli> show bypass sfp ses-channel 2 loop loopa
   PORT ENCL-ID ENCL-TYPE LOOP   BYP-STATUS    ATTRIBUTES
   ---- ------- --------- ----   ----------    SH--------
   0L   0       RAID      LOOP-A Unbypassed    --
   0R   0       RAID      LOOP-A Unbypassed    --
   1L   0       RAID      LOOP-A Not-Installed --
   1R   0       RAID      LOOP-A Not-Installed --
   2    0       RAID      LOOP-A Bypassed      -H
   3    0       RAID      LOOP-A Not-Installed --
   4    0       RAID      LOOP-A Not-Installed --
   5    0       RAID      LOOP-A Bypassed      -H
   AL   1       JBOD      LOOP-A Unbypassed    --
   AR   1       JBOD      LOOP-A Unbypassed    --
   BL   1       JBOD      LOOP-A Unbypassed    --
   BR   1       JBOD      LOOP-A Bypassed      -H

   The Port returned values indicate the type of device, FC or SATA, that is attached to the loop.

   On a Sun Storage 3510 FC RAID IOM board, from left to right, there are six ports: channel 0, channel 1, channel 2(3) Left, channel 2(3) Right, channel 4 and channel 5. Valid values for the Sun Storage 3510 FC RAID IOM board include 0, 1, 4, 5, L and R.

   On a Sun Storage 3510 FC JBOD IOM board, from left to right, there are two ports: Left and Right. Valid values for port include L and R.
   On a Sun Storage 3511 SATA RAID IOM board, from left to right, there are eight ports: channel 0 left, channel 0 right, channel 1 left, channel 1 right, channel 2, channel 3, channel 4 and channel 5. Valid values for the Sun Storage 3511 SATA RAID IOM board include 0L, 0R, 1L, 1R, 2, 3, 4 and 5.

   On a Sun Storage 3511 SATA JBOD IOM board, from left to right, there are four ports: loop A left, loop A right, loop B left and loop B right. Valid port values for the Sun Storage 3511 SATA JBOD IOM include AL, AR, BL and BR.

   More Examples:

   Connection detected on Channel 0 and 1 (host ports) to a Server HBA. The second Left (L) Channel is connected to a JBOD:

   sccli> show bypass sfp ses-channel 2 loop a
   PORT ENCL-ID ENCL-TYPE LOOP   BYP-STATUS    ATTRIBUTES
   ---- ------- --------- ----   ----------    SH--------
   0    0       RAID      LOOP-A Unbypassed    --
   1    0       RAID      LOOP-A Unbypassed    --
   L    0       RAID      LOOP-A Unbypassed    --
   R    0       RAID      LOOP-A Bypassed      -H
   4    0       RAID      LOOP-A Bypassed      -H
   5    0       RAID      LOOP-A Bypassed      -H
   L    1       JBOD      LOOP-A Unbypassed    --
   R    1       JBOD      LOOP-A Not-Installed --

   With cables removed, the connections show as hardware (-H) bypassed:

   sccli> show bypass sfp ses-channel 2 loop a
   PORT ENCL-ID ENCL-TYPE LOOP   BYP-STATUS    ATTRIBUTES
   ---- ------- --------- ----   ----------    SH--------
   0    0       RAID      LOOP-A Bypassed      -H
   1    0       RAID      LOOP-A Bypassed      -H
   L    0       RAID      LOOP-A Bypassed      -H
   R    0       RAID      LOOP-A Bypassed      -H
   4    0       RAID      LOOP-A Bypassed      -H
   5    0       RAID      LOOP-A Bypassed      -H

   If the connection remains Bypassed after the connection has been verified and the cable and SFP have been swapped or replaced, run an Explorer from the host and escalate. More advanced diagnostic commands can be run to test the bypass status.

   Note: The sccli show fru output, the event log, and /var/adm/messages will not report a cable or SFP failure. Additional troubleshooting may be required to identify a marginally operating or failed component or connection. See: Troubleshooting Fibre Channel Devices from the OS.

7. If no problems are found during the course of this document, refer back to Document: Troubleshooting Sun Storage[TM] 33x0/351x Hardware.
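The output checks in steps 5 and 6 can also be run against saved sccli output. The following is a minimal sketch, not an sccli feature: the function names are hypothetical, and the parsing assumes the output is laid out exactly as in the show channels and show bypass sfp examples in this document (whitespace-separated columns, "NA" for an unset PID). Other firmware or CLI versions may format these listings differently.

```python
# Meanings of the ATTRIBUTES letters, as described in step 6.
REASONS = {
    "S": "bypassed by a CLI command",
    "F": "drive fault",
    "H": "hardware problem (no signal present)",
}


def parse_bypass(text):
    """Parse `show bypass sfp` rows; prompt, header and separator lines are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have six fields with a numeric enclosure ID in the second.
        if len(fields) != 6 or not fields[1].isdigit():
            continue
        port, encl_id, encl_type, loop, status, attrs = fields
        rows.append({"port": port, "encl_id": int(encl_id), "encl_type": encl_type,
                     "loop": loop, "status": status, "attrs": attrs})
    return rows


def bypassed_ports(rows):
    """Return (enclosure ID, port, reasons) for every row with status Bypassed."""
    result = []
    for row in rows:
        if row["status"] == "Bypassed":
            reasons = [REASONS[c] for c in row["attrs"] if c in REASONS]
            result.append((row["encl_id"], row["port"], reasons))
    return result


def parse_channels(text):
    """Parse `show channels` rows of the form: Ch Type Media Speed Width PID / SID."""
    channels = []
    for line in text.splitlines():
        f = line.split()
        if len(f) == 8 and f[0].isdigit() and f[6] == "/":
            channels.append({"ch": int(f[0]), "type": f[1], "media": f[2],
                             "speed": f[3], "pid": f[5], "sid": f[7]})
    return channels


def suspect_channels(channels):
    """Channels with a PID whose speed is Async, which may mean no link or link down."""
    return [c["ch"] for c in channels if c["speed"] == "Async" and c["pid"] != "NA"]
```

A port reported by bypassed_ports() with the hardware reason is a candidate for the cable and SFP checks in steps 3 and 4. An installed but unused SFP is reported the same way, so the result still has to be compared against the intended cabling.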
Attachments
This solution has no attachment