Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-72-1392492.1
Update Date:2012-05-29

Solution Type  Problem Resolution Sure

Solution  1392492.1 :   Sun Storage 7000 Unified Storage System: Performance issue when pool is almost Full  


Related Items
  • Sun Storage 7410 Unified Storage System
  • Sun Storage 7310 Unified Storage System
  • Sun ZFS Storage 7120
  • Sun Storage 7110 Unified Storage System
  • Sun ZFS Storage 7320
  • Sun ZFS Storage 7420
  • Sun Storage 7210 Unified Storage System
Related Categories
  • PLA-Support>Sun Systems>DISK>NAS>SN-DK: 7xxx NAS
  • .Old GCS Categories>Sun Microsystems>Storage - Disk>Unified Storage




In this Document
Symptoms
 Observe
Cause
Solution
References


Created from <SR 3-5113127821>

Applies to:

Sun Storage 7110 Unified Storage System - Version Not Applicable to Not Applicable [Release N/A]
Sun ZFS Storage 7120 - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 7410 Unified Storage System - Version Not Applicable to Not Applicable [Release N/A]
Sun ZFS Storage 7320 - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 7210 Unified Storage System - Version Not Applicable to Not Applicable [Release N/A]
7000 Appliance OS (Fishworks)
NAS head revision : [not dependent]
BIOS revision : [not dependent]
ILOM revision : [not dependent]
JBODs Model : [not dependent]
CLUSTER related : [not dependent]


Symptoms

Storage pools at more than 80% capacity may experience degraded I/O performance, especially for write operations. The degradation can become severe once the pool exceeds 96% full, and manageability suffers as the free space in the pool approaches zero.

When usage crosses the 80% threshold, ZFS starts using a better-fit algorithm for writes, which is a little slower. Beyond 96% full, it uses the best-fit algorithm and writes are much slower. All storage systems slow down as they fill up.

Typical symptoms include:

  • Slow BUI/CLI operation and management hangs
  • Very lengthy boot times
  • Timeout events on connected NFS, CIFS, or iSCSI clients due to increased latency
  • Slow resilvering of storage pools following drive failures or replacements
  • CLI or BUI timeout events when deleting data files
  • Primary database backups via NDMP failing because a NAS filesystem mount point shows 100% used
  • Inability to cancel an in-progress scrub on a pool
  • Very lengthy or indefinite delays while restarting services such as NFS and SMB

Observe

  • Pool usage
cli> status show
Storage:
   pool-0:
      Used     8.58T bytes
      Avail     496B bytes
      State          online
      Compression:   1x
      Dedup:         1.03x
  • Share usage
Here, default is the project and fredfs01 is a filesystem inside that project.
cli> shares select default select fredfs01 show
     ...
               space_available = 495G
                   space_total = 49K
                    root_group = other

From the OS shell, check:

# zpool list
  NAME   SIZE   ALLOC  FREE  CAP  DEDUP  HEALTH
  pool-0 9.06T  8.58T  495G  94%  1.03x  ONLINE

# zfs list -o space
  NAME                                              AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
  mypool01                                          8.92T   612K         0     31K              0       582K
  mypool01/local                                    8.92T   111K         0     31K              0        80K
  mypool01/local/default                            8.92T    80K         0     31K              0        49K
  mypool01/local/default/fredfs01                   8.92T    49K       18K     31K              0          0
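
The capacity check can also be scripted. The following is a minimal sketch, not an appliance tool: it assumes the `zpool list` text has already been captured (for example by a support engineer) and classifies each pool against the 80% and 96% thresholds described in the Symptoms section. The function name `classify_pools` and the level labels are hypothetical.

```python
def classify_pools(zpool_list_text, warn_pct=80, severe_pct=96):
    """Parse `zpool list` output and return {pool: (cap_percent, level)}.

    The thresholds default to the 80% / 96% figures quoted in this document.
    Columns are located by header name, so extra columns do not matter.
    """
    rows = [ln.split() for ln in zpool_list_text.splitlines() if ln.strip()]
    header, data = rows[0], rows[1:]
    name_col = header.index("NAME")
    cap_col = header.index("CAP")

    result = {}
    for row in data:
        cap = int(row[cap_col].rstrip("%"))
        if cap > severe_pct:
            level = "severe"    # beyond 96%: writes are much slower
        elif cap > warn_pct:
            level = "warning"   # beyond 80%: writes start to slow down
        else:
            level = "ok"
        result[row[name_col]] = (cap, level)
    return result


# Sample input copied from the `zpool list` output shown above.
SAMPLE = """\
NAME   SIZE   ALLOC  FREE  CAP  DEDUP  HEALTH
pool-0 9.06T  8.58T  495G  94%  1.03x  ONLINE
"""

if __name__ == "__main__":
    for pool, (cap, level) in classify_pools(SAMPLE).items():
        print(f"{pool}: {cap}% used ({level})")
```

With the sample above, pool-0 at 94% falls in the "warning" band: past the 80% threshold but not yet past 96%.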

 


Cause

Storage pools whose capacity is approaching full.

When a pool is close to full, ZFS changes the way it searches for free space, because it has to make use of whatever small free blocks remain.

When the pool is also fragmented, the search for suitable blocks for write operations consumes more I/O and CPU.
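
To illustrate why a stricter fit costs more on a fragmented free list, the toy sketch below compares a first-fit search with a naive best-fit search and counts how many free segments each one examines. This is not the ZFS metaslab code: the function names, the list-of-lengths representation, and the linear scan are all assumptions made for illustration, and real allocators keep free space in sorted structures.

```python
def first_fit(free_segments, size):
    """Return (index, segments_examined) for the first segment that fits."""
    for i, length in enumerate(free_segments):
        if length >= size:
            return i, i + 1          # stop at the first match
    return None, len(free_segments)  # nothing fits


def best_fit(free_segments, size):
    """Return (index, segments_examined) for the smallest segment that fits.

    This naive version has to look at every segment before it can decide.
    """
    best = None
    for i, length in enumerate(free_segments):
        if length >= size and (best is None or length < free_segments[best]):
            best = i
    return best, len(free_segments)


# Hypothetical fragmented free list: segment lengths in arbitrary units.
FREE = [4, 16, 2, 8, 1, 32]

if __name__ == "__main__":
    print("first fit:", first_fit(FREE, 8))
    print("best fit: ", best_fit(FREE, 8))
```

For a request of size 8, first fit stops at the second segment (length 16), while best fit examines all six segments before choosing the segment of length 8. The tighter placement wastes less space but costs more work per allocation, which matches the trade-off described above.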


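Using mdb, look for routines such as "metaslab_alloc", "metaslab_ff_alloc", "metaslab_activate", and "metaslab_passivate" among the highest consumers of CPU cycles. The following stack summary shows threads waiting for a space map to load during block allocation: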

> ::stacks -c space_map_load_wait
THREAD STATE SOBJ COUNT
ffffff007cd61c60 SLEEP CV 8
swtch+0x147
cv_wait+0x61
space_map_load_wait+0x2e
metaslab_activate+0x60
metaslab_group_alloc+0x246
metaslab_alloc_dva+0x2a6
metaslab_alloc+0x9c
zio_dva_allocate+0x57
zio_execute+0x89
taskq_thread+0x1

Solution

Administrators who encounter this type of performance problem on appliances whose storage pools are almost full are advised to reduce pool utilization immediately, either by removing older unwanted data (including snapshots) or by increasing the size of the pools by adding further disk trays.
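
When deciding which snapshots to remove, the USEDSNAP column of `zfs list -o space` shows how much space each dataset's snapshots hold. The sketch below is a hypothetical helper, not part of the appliance software: it parses captured `zfs list -o space` text and ranks datasets by snapshot usage. Sizes are converted from the rounded human-readable values, so the byte counts are approximate.

```python
UNITS = {"B": 1, "K": 1024, "M": 1024**2, "G": 1024**3,
         "T": 1024**4, "P": 1024**5}


def to_bytes(text):
    """Convert a size such as '18K' or '8.92T' to an approximate byte count."""
    text = text.strip()
    suffix = text[-1].upper()
    if suffix in UNITS:
        return int(float(text[:-1]) * UNITS[suffix])
    return int(float(text))  # plain number, e.g. "0"


def snapshot_usage(zfs_space_text):
    """Return [(dataset, snapshot_bytes)] sorted largest first.

    Datasets whose snapshots hold no space are omitted.
    """
    rows = [ln.split() for ln in zfs_space_text.splitlines() if ln.strip()]
    header, data = rows[0], rows[1:]
    name_col = header.index("NAME")
    snap_col = header.index("USEDSNAP")
    usage = [(r[name_col], to_bytes(r[snap_col])) for r in data]
    return sorted((u for u in usage if u[1] > 0),
                  key=lambda u: u[1], reverse=True)


# Sample input copied from the `zfs list -o space` output in the Observe section.
SAMPLE = """\
NAME                             AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
mypool01                         8.92T   612K         0     31K              0       582K
mypool01/local                   8.92T   111K         0     31K              0        80K
mypool01/local/default           8.92T    80K         0     31K              0        49K
mypool01/local/default/fredfs01  8.92T    49K       18K     31K              0          0
"""

if __name__ == "__main__":
    for dataset, used in snapshot_usage(SAMPLE):
        print(f"{dataset}: {used} bytes held by snapshots")
```

In the sample, only mypool01/local/default/fredfs01 has snapshot usage (18K), so it is the only candidate listed.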

Later appliance software versions improve the way ZFS looks for space when the pool is highly used. Check the appliance software version currently running and upgrade to the latest release.

For latest appliance software revisions see https://wikis.oracle.com/display/FishWorks/Software+Updates

In particular, later appliance versions improve the internal space map loading and zio_wait algorithms, which helps performance as pool space usage approaches 100%.

Back to <Document 1331769.1> Sun Storage 7000 Unified Storage System: How to Troubleshoot Performance Issues.

 

References

<NOTE:1331769.1> - Sun Storage 7000 Unified Storage System: How to Troubleshoot Performance Issues
Appliance software wiki page: https://wikis.oracle.com/display/FishWorks/Software+Updates
<BUG:6525233> - VDEV FULLNESS CAN DEGRADE PERFORMANCE, SHOULD CAUSE ZPOOL TO BECOME DEGRADED
<BUG:6975500> - CAN'T STOP ZFS SCRUB WHEN POOL IS FULL

  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.