Asset ID: 1434007.1
Update Date: 2012-03-12
Solution Type: Technical Instruction
Solution 1434007.1: Sun Storage 7000 Unified Storage System: Troubleshooting performance problems with Fibre-Channel LUNs
Related Items:
- Sun Storage 7310 Unified Storage System
- Sun Storage 7410 Unified Storage System
- Sun ZFS Storage 7120
- Sun ZFS Storage 7320
- Sun ZFS Storage 7420
- Sun Storage 7110 Unified Storage System
- Sun Storage 7210 Unified Storage System
Related Categories:
- PLA-Support>Sun Systems>DISK>NAS>SN-DK: 7xxx NAS
- .Old GCS Categories>Sun Microsystems>Storage - Disk>Unified Storage
In this Document
Goal
Solution
Applies to:
Sun Storage 7310 Unified Storage System - Version: Not Applicable to Not Applicable [Release: N/A to N/A]
Sun Storage 7410 Unified Storage System - Version: Not Applicable to Not Applicable [Release: N/A to N/A]
Sun ZFS Storage 7320 - Version: Not Applicable to Not Applicable [Release: N/A to N/A]
Sun ZFS Storage 7120 - Version: Not Applicable to Not Applicable [Release: N/A to N/A]
Sun ZFS Storage 7420 - Version: Not Applicable to Not Applicable [Release: N/A to N/A]
Information in this document applies to any platform.
Goal
How to troubleshoot performance problems with Fibre-Channel (FC) LUNs on the Sun Storage 7000 Unified Storage System.
Solution
- Fibre-Channel (FC) performance can be observed via Analytics, where one can break down operations or throughput by initiator, target, or LUN.
For operations, one can also break down by offset, latency, size, and SCSI command, allowing one to understand not just the "what" but the "how" and "why" of FC operations.
Enable the datasets associated with Fibre-Channel operations in the Analytics tab of the BUI, or from the appliance CLI as sketched below.
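The following is only a sketch of the CLI approach: the statistic names fc.ops[initiator], fc.ops[lun], and fc.ops[latency] are assumptions based on the naming used for other protocols, so confirm the exact names with tab completion or the help command in your release.
    hostname:> analytics datasets
    hostname:analytics datasets> create fc.ops[initiator]
    hostname:analytics datasets> create fc.ops[lun]
    hostname:analytics datasets> create fc.ops[latency]
    hostname:analytics datasets> show
    (the fc.ops[...] statistic names above are assumptions; verify them on your system)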
- Verify the host connectivity as explained in <Document:Sun Storage 7000 Unified Storage System: How to verify host connectivity (Doc ID 1432935.1)>, verify that the recommended Solaris patches are installed on the Solaris client, and verify that the multipathing software is optimal using <Document:Sun Storage 7000 Unified Storage System: How to verify that the multipathing software (MPxIO / VxDMP) is optimal (Doc ID 1433014.1)>. A quick client-side check is sketched below.
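On a Solaris client, the standard tools give a first impression of HBA link state and path redundancy (a sketch; the device path in the last command is hypothetical):
    # Show FC HBA ports and their link state on the Solaris client
    fcinfo hba-port
    # List multipathed logical units; each LUN should show the expected number of paths
    mpathadm list lu
    # Inspect one logical unit in detail (hypothetical device path)
    mpathadm show lu /dev/rdsk/c0t600144F0XXXXXXXXd0s2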
- Mirrored pools with one or more logzilla devices are known to work best for FC.
- Queue Overruns / Queue depth
It is generally not necessary to restrict queue depths on clients as the FC ports in the appliance can handle a large number of concurrent requests. Even so, there exists the remote possibility that these queues can be overrun, resulting in SCSI transport errors. Such queue overruns are often associated with one or more of the following:
- Overloaded ports on the front end - too many hosts associated with one FC port and/or too many LUNs accessed through one FC port
- Degraded appliance operating modes, such as a cluster takeover in what is designed to be an active-active cluster configuration
While the possibility of queue overruns is remote, it can be eliminated entirely if one is willing to limit queue depth on a per-client basis.
The 7000 FC target driver can handle up to 2048 commands in the queue for an HBA. From this limit, the maximum queue depth per LUN has to be derived for each host HBA port present in the architecture. In all situations, the target HBA port queue should not be overrun, as this will negatively impact performance due to SCSI timeouts.
In a simple configuration with one LUN and one host, the max queue depth on the host cannot be set higher than 2048. If this LUN is going to be shared by N hosts, then for each host the queue depth has to be divided by the number of hosts sharing the LUN.
When configuring multiple LUNs on a 7000, the queue depth per LUN on the host must be set to 2048 divided by the number of LUNs. If those LUNs are going to be shared between multiple hosts, the number has to be further divided by the number of hosts: 2048/(#LUNs * #hosts), rounded down to the nearest integer.
Considering that the 7000 is typically used in an active/active cluster configuration, when one of the nodes fails, the corresponding FC port on the remaining node will serve the LUNs that were configured on its counterpart. To be safe, use all LUNs on the corresponding HBA ports of both cluster nodes for the queue depth calculation.
To sum up, for each 7000 FC target port:
    max queue depth per LUN = 2048/(N * L), or 2048/(N * L * 2) for a 7000 cluster
where L is the number of LUNs on a 7000 target port and N is the number of hosts sharing the LUNs.
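As a worked example with hypothetical numbers, take L = 4 LUNs on a target port shared by N = 2 hosts in a cluster configuration:
    # 2048/(N * L * 2) = 2048/(2 * 4 * 2)
    echo $((2048 / (2 * 4 * 2)))    # prints 128, so set the per-LUN queue depth to 128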
Additionally, for (Open)Solaris there are two global variables, sd_max_throttle and ssd_max_throttle. Which one applies to a specific HBA depends on whether its driver binds to the sd or the ssd driver. In both cases the variable controls the queue depth used per target LUN. The default setting for s(s)d_max_throttle is 256. This means that when using more than 8 LUNs per 7000 HBA port (2048/256 = 8), the value of s(s)d_max_throttle has to be lowered.
To set s(s)d_max_throttle, add one of the following lines to the kernel configuration file /etc/system:
set ssd:ssd_max_throttle=x
or
set sd:sd_max_throttle=x
Where x is the max queue depth per LUN as calculated following the rules described above.
A system reboot is required to make the kernel use the newly configured queue depth.
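To verify the change, one can check the /etc/system entry and, after the reboot, read the live kernel value with mdb (the sd variable is shown; substitute ssd_max_throttle for the ssd driver):
    # Confirm the entry is present in /etc/system
    grep max_throttle /etc/system
    # After the reboot, print the value the kernel is actually using
    echo "sd_max_throttle/D" | mdb -k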
- Partition Alignment
Refer to <https://blogs.oracle.com/dlutz/entry/partition_alignment_guidelines_for_unified> for a detailed discussion on this topic.
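As a quick check on a Solaris client (the device path is hypothetical), prtvtoc shows where each slice starts; for aligned I/O, the First Sector value of the data slice should be a multiple of the LUN block size:
    # Print the partition map and inspect the "First Sector" column (hypothetical device)
    prtvtoc /dev/rdsk/c0t600144F0XXXXXXXXd0s2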
Attachments
This solution has no attachment