Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution

1377069.1 : Sun Storage 7000 Unified Storage System: Shadow Migration Copy Performance Is Slow
Created from <SR 3-4904244611>
Applies to:

Sun ZFS Storage 7320 - Version: Not Applicable to Not Applicable [Release: N/A to N/A]
Sun ZFS Storage 7120 - Version: Not Applicable to Not Applicable [Release: N/A to N/A]
Sun Storage 7410 Unified Storage System - Version: Not Applicable to Not Applicable [Release: N/A to N/A]
Sun Storage 7310 Unified Storage System - Version: Not Applicable to Not Applicable [Release: N/A to N/A]
Sun Storage 7210 Unified Storage System - Version: Not Applicable to Not Applicable [Release: N/A to N/A]
7000 Appliance OS (Fishworks)

Symptoms

Sun Storage 7000 Unified Storage System array Shadow Migration copy jobs have been observed via the BUI to be running for many days, even weeks in some extreme cases. The copy is still going on in the background for the share in question, and the operation is taking longer than expected.

Shadow Migration supports NFS filesystems only at this time; use NFSv4 for best results.

Cause

As long as Shadow Migration is making progress, even if it is slow, there is not a lot that can be done to speed it up.

If a share to be migrated contains lots (thousands or millions) of small files and/or has lots of subdirectories, you probably don't want to use Shadow Migration, as it will take a long time to complete. Consider other options such as rsync. Shadow Migration just wasn't built for speed or performance; it was built for completeness and to complete seamlessly in the background.

Monitoring progress of a Shadow Migration is difficult given the context in which the operation runs. A single filesystem can shadow all or part of a filesystem, or multiple filesystems with nested mountpoints. As such, there is no way to request statistics about the source and have any confidence in them being 100% accurate. In addition, even with migration of a single filesystem, the methods used to calculate the available size are not consistent across systems. For example, the remote filesystem may use compression, or it may or may not include the metadata overhead. For these reasons, it is impossible to display an accurate progress bar for any particular migration.

The appliance provides the following information that is guaranteed to be accurate:

* Local size of the local filesystem so far
* Logical size of the data copied so far
* Time spent migrating data so far

These values are made available in the BUI and CLI through both the standard filesystem properties and the properties of the Shadow Migration node (or UI panel); a CLI sketch for reading them follows below. If you know the size of the remote filesystem, you can use this to estimate progress.
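As a minimal sketch, these can be read per share from the appliance CLI. The project name 'default' and share name 'migrated1' are hypothetical, and exact property names can vary between firmware releases; the time spent and the progress estimate are also shown in the BUI shadow migration panel:

    appliance:> shares select default select migrated1
    appliance:shares default/migrated1> get shadow        (the migration source, e.g. nfs://server/export)
    appliance:shares default/migrated1> get space_data    (local size of the data stored so far)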
The size of the data copied consists only of plain file contents that needed to be migrated from the source. Directories, metadata, and extended attributes are not included in this calculation. While the size of the data migrated so far includes only remotely migrated data, resuming background migration may traverse parts of the filesystem that have already been migrated. This can cause it to run fairly quickly while processing these initial directories, and slow down once it reaches portions of the filesystem that have not yet been migrated.

While there is no accurate measurement of progress, the appliance does attempt to estimate the remaining data based on the assumption of a relatively uniform directory tree. This estimate can range from fairly accurate to completely worthless depending on the data set, and is for informational purposes only.

For example, one could have a relatively shallow filesystem tree but large amounts of data in a single directory that is visited last. In this scenario, the migration will appear almost complete, and then rapidly drop to a very small percentage as this new tree is discovered. Conversely, if that large directory is processed first, then the estimate may assume that all other directories have a similarly large amount of data, and when it finds them mostly empty the estimate quickly rises from a small percentage to nearly complete. The best way to measure progress is to set up a test migration, let it run to completion, and use that value to estimate progress for filesystems of similar layout and size.

Solution

As long as the shadow migration job is making progress, even if it is slow, there is not a lot that can be done. Shadow Migration just wasn't built for speed; it was built for completeness and to be seamless.

Increasing Shadow Migration performance:
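EXAMPLE procedure (a minimal sketch; the 'threads' values shown are illustrative, and the exact property list and output format may vary by firmware release):

    appliance:> configuration services shadow
    appliance:configuration services shadow> show
    Properties:
        <status> = online
        threads = 8
    appliance:configuration services shadow> set threads=16
            threads = 16 (uncommitted)
    appliance:configuration services shadow> commit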
The advice here is to increase this threads value in stages, gauging the impact on other services and array functionality before increasing it again. Increasing the number of threads gives greater resources to shadow migration, but it also takes away resources that may be needed for more critical work, and, per the hang bug listed below (Bug 6967206), it can potentially lead to deadlock issues and hangs if the appliance is not running firmware 2011.1.0.

Checking supplied Support Bundle data from customers who have reported this type of situation has confirmed there are no problems, errors, alerts, failures, or FM events reported that would account for slow Shadow Migration progress. The arrays are functioning correctly, just very slowly in terms of Shadow Migration progress.

If a Shadow Migration job has been started and is taking a long time, you need to be patient and just let it complete. Depending on multiple factors, such as incoming load, other requests, and the amount and/or kind of data to copy, it could take up to several weeks. Shadow Migration is a background function and will always be given lower priority in the kernel than serving new I/O for client requests.

The following section contains internal information; do not share with customers.

Useful shell commands (see the monitoring sketch after the bug list below):

    # df -h | grep shadow          (list the shadow mounts backing in-progress migrations)
    # df -h | grep shadow | wc -l  (count them)
    # iostat -xcnz                 (confirm disk I/O activity while the migration runs)

Possible influence of open bugs:

Bug 6985747 - Improving shadow migration pending list processing
Bug 6963751 - Shadow migration from NetApp -> 7310 drops off to trickle
Bug 6967206 - Migrating fs having large number of smaller files cause appliance to hang
Bug 6988343 - Need a summary for all shadow migration volume
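Building on the commands above, a rough way to watch a migration from the shell (internal use only; the 5-minute interval is arbitrary, and this assumes shadow mounts are removed as each migration completes) is to sample the shadow mount count over time:

    # Print a timestamp and the current number of shadow mounts every 5 minutes;
    # the count reaching zero indicates the background migrations have finished.
    while true; do
        date
        df -h | grep shadow | wc -l
        sleep 300
    done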
References

<NOTE:1213714.1> - Sun ZFS Storage Appliance: Performance clues and considerations
<NOTE:1213705.1> - Sun Storage 7000 Unified Storage System: Performance issues - Framing the problem
<BUG:6985747> - Improving shadow migration pending list processing
<BUG:6963751> - Shadow migration from NetApp -> 7310 drops off to trickle
<BUG:6967206> - Migrating fs having large number of smaller files cause appliance to hang
<BUG:6988343> - Need a summary for all shadow migration volume

Attachments

This solution has no attachment