Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution

1390444.1 : T3-1 : Sub-optimal I/O write performance on multi-initiator disks
Oracle Confidential (PARTNER). Do not distribute to customers Created from <SR 3-4908878978>
Applies to:
SPARC T3-1 - Version: Not Applicable to Not Applicable - Release: N/A to N/A
Information in this document applies to any platform.

Symptoms
T3-1 systems have dual LSI disk controllers. When configured with the 16-slot SAS backplane, they provide multi-initiator access to all internal drives, allowing multipath access via MPxIO when SAS zoning is not enabled.

By default MPxIO uses round-robin load balancing, which means I/O is spread across both target ports; this can lead to poor write performance when the drive cache is overrun:

# time dd if=/dev/zero of=/seagate/1g bs=1024k count=1024

This is only seen when performing writes to disks under ZFS control; raw and UFS writes report faster completion times:

# time dd if=/dev/zero of=/dev/rdsk/c0t5000C5000A913CEFd0 bs=1024k count=1024

During testing it was found that Seagate drives performed significantly worse than Hitachi models:

# time dd if=/dev/zero of=/hitachi/1g bs=1024k count=1024

Cause
Sequential I/O is not streamed when writes are sent down multiple paths under round-robin load balancing; this causes poor write throughput.

Solution
The fix is to switch to logical block address (LBA) based load balancing: data up to a certain size range is pushed down a single path, and that range is determined by a variable defined in the scsi_vhci configuration.

First, add the following to /kernel/drv/scsi_vhci.conf for the impacted drive; in this example we use Seagate (ST product ID):

device-type-mpxio-options-list =

'region-size' determines how much data goes down a single path ("region" in this instance refers to the multipathed device). The parameter is a power-of-2 exponent giving the number of 512-byte blocks: raise 2 to the region-size for the block count, divide by 2 for kilobytes, then by 1024 for megabytes. For example, with region-size=16: 2 to the power of 16 = 65,536 blocks = 32,768 KB / 1024 = 32 MB, meaning 32 MB of data will be sent to a single path before switching to the second target.
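The scsi_vhci.conf entry above is truncated in this copy of the note. As a sketch only, an LBA load-balancing stanza on Solaris typically takes the following shape; the vendor/product string and the region-size value here are illustrative assumptions, not values taken from this document, and must be matched to the actual drives (vendor ID padded to 8 characters):

```
# /kernel/drv/scsi_vhci.conf -- illustrative fragment only.
# "SEAGATE ST9" and region-size=18 are assumptions; substitute the
# vendor/product inquiry strings reported for your drives.
device-type-mpxio-options-list =
"device-type=SEAGATE ST9", "load-balance-options=logical-block-options";
logical-block-options="load-balance=logical-block", "region-size=18";
```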
Tuning of the region-size will depend on customer requirements and load; the default is 18, which gives a 128 MB region size. This is in contrast to sequential writes under round robin, which ping-pong between paths for each block-sized write.

Reboot the system for the changes to take effect, then review dmesg and mpathadm output to confirm the correct policy is applied:

0. c0t5000C5000A913CEFd0 <SEAGATE-ST930003SSUN300G-0D70-279.40GB>

Multi-initiator access is disabled on systems with SAS zoning enabled, so this only impacts systems prior to 147034-01 being applied, or systems where the zoning firmware has been updated but zoning has been disabled via 'zoningcli'.

References
<BUG:6930636> - USE LBA LOAD BALANCING TO WORK AROUND 6929352 AND SIMILAR
<NOTE:1332352.1> - Customers configuring HW RAID with 16-Slot Disk Backplanes on a SPARC T3-1 system could corrupt data.

Attachments
This solution has no attachment
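The region-size arithmetic described above (a power-of-2 exponent counting 512-byte blocks) can be sanity-checked with a few lines of Python; the helper function name is my own, not part of any Solaris tooling:

```python
def region_bytes(region_size_exponent):
    """Bytes sent down one path per region: 2**exponent 512-byte blocks."""
    return (2 ** region_size_exponent) * 512

# region-size=16 -> 32 MB down a single path before switching targets
assert region_bytes(16) == 32 * 1024 * 1024
# default region-size=18 -> 128 MB region size
assert region_bytes(18) == 128 * 1024 * 1024
```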