Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-71-1432935.1
Update Date: 2012-03-12
Keywords:

Solution Type: Technical Instruction

Solution  1432935.1 :   Sun Storage 7000 Unified Storage System: How to verify host connectivity to Unified Storage  


Related Items
  • Sun Storage 7310 Unified Storage System
  • Sun Storage 7410 Unified Storage System
  • Sun ZFS Storage 7120
  • Sun Storage 7110 Unified Storage System
  • Sun ZFS Storage 7320
  • Sun ZFS Storage 7420
  • Sun Storage 7210 Unified Storage System
Related Categories
  • PLA-Support>Sun Systems>DISK>NAS>SN-DK: 7xxx NAS
  • .Old GCS Categories>Sun Microsystems>Storage - Disk>Unified Storage




In this Document
  Goal
  Solution
  References


Applies to:

Sun Storage 7110 Unified Storage System - Version: Not Applicable to Not Applicable   [Release: N/A to N/A]
Sun Storage 7310 Unified Storage System - Version: Not Applicable to Not Applicable   [Release: N/A to N/A]
Sun Storage 7410 Unified Storage System - Version: Not Applicable to Not Applicable   [Release: N/A to N/A]
Sun ZFS Storage 7320 - Version: Not Applicable to Not Applicable   [Release: N/A to N/A]
Sun ZFS Storage 7420 - Version: Not Applicable to Not Applicable   [Release: N/A to N/A]
7000 Appliance OS (Fishworks)

Goal

How to verify host connectivity to the appliance through Oracle/Sun-branded Host Bus Adapters (HBAs)

Solution

1. Verify that the HBAs are seen by and available to the host, even before the operating system loads.

At the ok prompt:
ok> reset-all
ok> probe-scsi-all

The above output should list the device paths associated with the targets configured.
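
Note: if the host is already running, bring it down to the ok prompt first. Disabling auto-boot beforehand prevents reset-all from booting the system straight back up (a common precaution; restore the setting afterwards):

ok> setenv auto-boot? false
ok> reset-all
ok> probe-scsi-all
ok> show-devs

show-devs is optional; it lists all device nodes, which helps confirm that the HBA device paths themselves are present.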

2. Confirm that the HBA is visible to the OS and the devices can be seen from the HBA.

The following example outputs assume 4 LUNs configured over 2 device paths on a 7xxx cluster, with 2 LUNs owned by each node (or head).

a) Use the "luxadm -e port" command to verify that the HBA has established communication with a node.

Example:
# luxadm -e port
/devices/pci@0,0/pci10de,5d@e/pci1077,143@0/fp@0,0:devctl          CONNECTED
/devices/pci@0,0/pci10de,5d@e/pci1077,143@0,1/fp@0,0:devctl        CONNECTED


If 'luxadm -e port' shows NOT CONNECTED for the device path identified, then check the following:

 - Fabric connectivity including cabling, zoning, switch component health. Refer to <Document 1434045.1> Sun Storage 7000 Unified Storage System: Verifying the Fabric health for Fibre-Channel connectivity


 - Configuration on the appliance: Refer to <Document 1434004.1> Sun Storage 7000 Unified Storage System: Verifying that the Unified Storage has been configured correctly for Fibre-Channel connectivity


b) If 'luxadm -e port' shows "CONNECTED", then use the device path from its output and verify that the devices can be seen with "luxadm -e dump_map".
Example:

#luxadm -e dump_map /devices/pci@0,0/pci10de,5d@e/pci1077,143@0/fp@0,0
Pos  Port_ID Hard_Addr Port WWN         Node WWN         Type
0    b1300   0        2101001b323baa4c 2001001b323baa4c 0x0  (Disk device)
1    b1200   0        21000024ff2e4ec7 20000024ff2e4ec7 0x0  (Disk device)
2    b0300   0        21000024ff2a967c 20000024ff2a967c 0x1f (Unknown Type,Host Bus Adapter)

# luxadm -e dump_map /devices/pci@0,0/pci10de,5d@e/pci1077,143@0,1/fp@0,0:devctl
Pos  Port_ID Hard_Addr Port WWN         Node WWN         Type
0    10a00   0        21000024ff2e4ec6 20000024ff2e4ec6 0x0  (Disk device)
1    10b00   0        2100001b321baa4c 2000001b321baa4c 0x0  (Disk device)
2    10d00   0        21000024ff2a967d 20000024ff2a967d 0x1f (Unknown Type,Host Bus Adapter)
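
In the dump_map output, the entries of type 0x0 (Disk device) are the appliance target ports presenting LUNs, while the 0x1f (Host Bus Adapter) entry is the initiator port itself. A quick way to count the visible disk targets on a port (the path is taken from the example above; substitute your own):

# luxadm -e dump_map /devices/pci@0,0/pci10de,5d@e/pci1077,143@0/fp@0,0 | grep -c "Disk device"
2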


c) Check the luxadm display output to verify the active/standby paths to the target.
Example:
luxadm display output for a LUN owned by node A. The same is seen for the other LUN owned by node A.

# luxadm display /dev/rdsk/c4t600144F0F8D6944400004E6090280004d0s2
DEVICE PROPERTIES for disk: /dev/rdsk/c4t600144F0F8D6944400004E6090280004d0s2
Vendor:               SUN
Product ID:           ZFS Storage 7320
Revision:             1.0
Serial Num:
Unformatted capacity: 3072.000 MBytes
Read Cache:           Enabled
  Minimum prefetch:   0x0
  Maximum prefetch:   0x0
Device Type:          Disk device
Path(s):

/dev/rdsk/c4t600144F0F8D6944400004E6090280004d0s2
/devices/scsi_vhci/disk@g600144f0f8d6944400004e6090280004:c,raw
 Controller           /dev/cfg/c2
  Device Address              21000024ff2e4ec7,1
  Host controller port WWN    21000024ff2a967c
  Class                       secondary
  State                       STANDBY
 Controller           /dev/cfg/c2
  Device Address              2101001b323baa4c,1
  Host controller port WWN    21000024ff2a967c
  Class                       primary
  State                       ONLINE
 Controller           /dev/cfg/c3
  Device Address              21000024ff2e4ec6,1
  Host controller port WWN    21000024ff2a967d
  Class                       secondary
  State                       STANDBY
 Controller           /dev/cfg/c3
  Device Address              2100001b321baa4c,1
  Host controller port WWN    21000024ff2a967d
  Class                       primary
  State                       ONLINE

As seen above, 4 paths are visible: 2 active (ONLINE) and 2 standby.
The 2 active paths lead to the 7xxx node currently serving the LUNs; the 2 standby paths lead to the other 7xxx node.
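
A quick way to confirm the expected path count and states for a LUN is to filter the same output (device path from the example above; mpathadm list lu, where available, gives a similar multipath summary):

# luxadm display /dev/rdsk/c4t600144F0F8D6944400004E6090280004d0s2 | grep State
  State                       STANDBY
  State                       ONLINE
  State                       STANDBY
  State                       ONLINE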


d) If the array is switch-attached, verify that the device shows as configured in the "cfgadm -al" output.
If not configured, use "cfgadm -c configure <Ap_Id>" (see the example after the listing below).
Example:

# cfgadm -al

c5                      fc-fabric connected    configured   unknown
c5::21000024ff2e4ec6    disk      connected    configured   unknown
c5::2100001b321baa4c    disk      connected    configured   unknown
c6                      fc-fabric connected    configured   unknown
c6::2101001b323baa4c    disk      connected    configured   unknown
c6::21000024ff2e4ec7    disk      connected    configured   unknown
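
For example, to configure a fabric attachment point that shows as unconfigured (the Ap_Ids below are taken from the listing above and are illustrative; use the ones reported on your host):

# cfgadm -c configure c5::21000024ff2e4ec6
# cfgadm -c configure c6::2101001b323baa4c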

If the LUNs presented show up as "unusable" or failing, check whether the LUN has been unconfigured on the back-end storage. If so, first get rid of the unusable LUNs by running the corresponding command:

cfgadm -c unconfigure -o unusable_FCP_dev cX::WWN

for every cX::WWN with unusable LUNs (a scripted sketch follows the reference below).
See also: <Document 1018716.1> How to Unconfigure a Single LUN from a Target which has multiple LUNs
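
A scripted sketch for removing unusable LUNs from several attachment points in one pass (the Ap_Ids are illustrative; substitute the ones reported on your host):

# for ap in c5::21000024ff2e4ec7 c6::2101001b323baa4c
> do
>   cfgadm -c unconfigure -o unusable_FCP_dev $ap
> done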

Then configure the LUN correctly on the back-end storage.
Verify by referring to <Document 1434004.1> Sun Storage 7000 Unified Storage System: Verifying that the Unified Storage has been configured correctly for Fibre-Channel connectivity.

If the cfgadm -al output shows the attachment point as "failing", then attempt the following commands in order:
#devfsadm -Cv -c disk

If that does not help,
#luxadm -e forcelip /devices/pci@0,0/pci10de,5d@e/pci1077,143@0,1/fp@0,0:devctl

Then if that does not help, check for any errors in the Solaris host's /var/adm/messages file. If errors are present, then verify the Fabric health by referring to <Document 1434045.1> Sun Storage 7000 Unified Storage System: Verifying the Fabric health for Fibre-Channel connectivity
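
A quick scan for recent FC transport messages (an illustrative filter; exact message formats vary by driver and release):

# egrep -i 'fctl|fp\(|offline|transport' /var/adm/messages | tail -20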

If that does not help, attempt a reboot of the host (last resort).
If the reboot does not help, then further troubleshooting is required;
contact Oracle Support citing the exact problem.


e) Use the format command as a quick check that the configured LUNs are seen by the host.

In this example the host sees 4 FC LUNs:
# format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
0. c16t500000E010CC3B30d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>    local system disk (not an appliance LUN)
        /scsi_vhci/ssd@g500000e010cc3b30

1. c4t600144F0F8D6944400004E60900A0003d0 <DEFAULT cyl 2557 alt 2 hd 128 sec 32>    owned by node A of the 7xxx box
        /scsi_vhci/disk@g600144f0f8d6944400004e60900a0003

2. c4t600144F0F8D6944400004E6090280004d0 <DEFAULT cyl 1533 alt 2 hd 128 sec 32>     owned by node A
        /scsi_vhci/disk@g600144f0f8d6944400004e6090280004

3. c4t600144F0991DB38600004E6091E60001d0 <DEFAULT cyl 2045 alt 2 hd 128 sec 32>     owned by node B
        /scsi_vhci/disk@g600144f0991db38600004e6091e60001

4. c4t600144F0991DB38600004E6091FF0002d0 <DEFAULT cyl 2045 alt 2 hd 128 sec 32>     owned by node B
      /scsi_vhci/disk@g600144f0991db38600004e6091ff0002
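
As a quick cross-check, appliance LUN GUIDs begin with 600144F0 (the Sun/Oracle OUI), so the appliance-backed disks in the format listing can be counted directly (illustrative; the count here matches the 4 LUNs above):

# echo | format | grep -c 600144F0
4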



3. Correct errors reported by the host

i). If the host experiences errors like:

fctl: [ID 517869 kern.warning] WARNING: fp(2)::62200 NS failure pkt state=d, reason=9, expln=1, NSCMD=0100, NSRSP=8001

ensure that the host is patched with Patch 145957-06 (SPARC) or Patch 145958-06 (x86), delivered in Solaris 10 Update 10 (S10u10).

This fixes the following bugs:
<Bug:6995579> - 6956269 is an incomplete fix and FP_NHEAD2 should be FP_NHEAD1 to fix it
<Bug:6977521> - fctl driver produces "NS Failure pkt state" messages after installing fp patch 141874-07 or higher
<Bug:6977521> is a duplicate of <Bug:6956269> - WARNING: fp(0)::fp_plogi_intr on NL nodes
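
To check whether the patch is already installed on Solaris 10 (the revision reported may be -06 or higher):

# showrev -p | grep 145957

(grep for 145958 on x86 hosts)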

ii). Occasional transport errors are possible if there is a marginal component in the Fibre Channel path to the device.
In such cases, check whether a consistent pattern can be observed with respect to the device path itself.
If so, troubleshooting the device path through a process of elimination of the components involved might be required.

iii). If connectivity itself is verified but the host experiences occasional timeouts, then the following information might be applicable.

Check queue depth:
a) For each 7000 target port

The maximum queue depth each host should use per LUN is:

2048 / (N * L), or 2048 / (N * L * 2) for a 7000 cluster

where L is the number of LUNs on a 7000 target port and N is the number of hosts sharing the LUNs.
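
For example, a minimal worked case: assume a 7000 cluster where N = 2 hosts share L = 8 LUNs on each target port. Then

2048 / (N * L * 2) = 2048 / (2 * 8 * 2) = 64

so each host should queue at most 64 commands per LUN; 64 would then be the s(s)d_max_throttle value set in step b) below.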


b) sd_max_throttle and ssd_max_throttle (depending on the target driver used)
This variable controls the queue depth used per target LUN. Which one applies to a specific HBA depends on whether the HBA driver binds to the sd or the ssd driver.
The default setting for s(s)d_max_throttle is 256. Since 8 LUNs at the default already saturate a 2048-deep target port queue (8 * 256 = 2048), when using more than 8 LUNs per 7000 HBA port the value of s(s)d_max_throttle has to be lowered.

Global method for setting s(s)d_max_throttle in the Solaris kernel:
To set s(s)d_max_throttle, add one of the following lines to the kernel configuration file /etc/system:

set ssd:ssd_max_throttle=x
or
set sd:sd_max_throttle=x

where x is the maximum queue depth per LUN, calculated following the rule described above.
A system reboot is required for the kernel to use the newly configured queue depth.
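
After the reboot, the value in use can be read back from the live kernel with mdb (shown for ssd; use sd_max_throttle if the HBA binds to the sd driver; the value 64 follows the worked example above):

# echo "ssd_max_throttle/D" | mdb -k
ssd_max_throttle:
ssd_max_throttle:               64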
See also the help file on the 7000 appliance:

https://<your.ip.address>:215/wiki/index.php/Configuration:SAN:FC#Queue_Overruns

If errors continue despite the above checks, contact Oracle Support to raise a Service Request.

References

<NOTE:1434045.1> - Sun Storage 7000 Unified Storage System: Verifying the Fabric health to troubleshoot Fibre-Channel connectivity
<NOTE:1434004.1> - Sun Storage 7000 Unified Storage System: Verifying that the Unified Storage has been configured correctly for Fibre-Channel connectivity
<BUG:6995579> - 6956269 IS INCOMPLETE FIX AND FP_NHEAD2 SHOULD BE FP_NHEAD1 TO FIX IT.
<BUG:6977521> - FCTL DRIVER PRODUCES "NS FAILURE PKT STATE" MESSAGES AFTER INSTALLING FP PATCH 141874-07 OR HIGHER
<BUG:6956269> - WARNING:FP(0)::FP_PLOGI_INTR ON NL NODES

Attachments
This solution has no attachment
  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.