Asset ID: 1-72-1385644.1
Update Date: 2012-10-10
Solution Type: Problem Resolution Sure
Solution 1385644.1:
Sun Storage 7000 Unified Storage System: OpenOwner Entries count increases on ZFS Storage Appliance.
Related Items
- Sun Storage 7410 Unified Storage System
- Sun Storage 7310 Unified Storage System
- Sun ZFS Storage 7120
- Sun ZFS Storage 7320
- Sun Storage 7110 Unified Storage System
- Sun ZFS Storage 7420
- Sun Storage 7210 Unified Storage System
Related Categories
- PLA-Support>Sun Systems>DISK>NAS>SN-DK: 7xxx NAS
- .Old GCS Categories>Sun Microsystems>Storage - Disk>Unified Storage
- PLA-Support>Database Technology>Engineered Systems>Oracle Exalogic>MW: Exalogic Core
Created from <SR 3-4913495111>
Applies to:
Sun ZFS Storage 7320 - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 7110 Unified Storage System - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 7410 Unified Storage System - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 7210 Unified Storage System - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 7310 Unified Storage System - Version Not Applicable to Not Applicable [Release N/A]
7000 Appliance OS (Fishworks)
Symptoms
In an environment where clients mount filesystems from the ZFS Storage Appliance over NFSv4, file access requests hang, and commands such as ls and df also become unresponsive.
The same filesystems mounted via NFSv3 or SMB/CIFS remain responsive.
Cause
The OpenOwner table has reached its limit of entries.
This defect is tracked as Sun <Bug 6976554> on the ZFS Storage Appliance side.
Solution
To resolve this issue, upgrade to appliance software update 2011.1.1.0 or later (or 2011.1.2.1 or later for 7x10 systems with SAS-1 disk trays attached).
See <Bug 6976554>. If an upgrade is not possible, the customer must contact Oracle Support via an SR. TSC will associate the SR with the CR (assuming a match using the confirmation method below) and request relief.
Confirmation:
Request a shared shell session to the appliance. From the OS shell, use the following command to dump a summary of the NFSv4 state tables, including the OpenOwner table.
# echo '::rfs4_db' |mdb -k
The output should look similar to the following:
> ::rfs4_db
rfs4_database=ffffff315939a998
debug_flags=00000000 shutdown: count=0 tables=ffffff30fb27eb08
------------------ Table -------------------   Bkt  ------- Indices -------
Address          Name         Flags    Cnt     Cnt  Pointer          Cnt  Max
ffffff30fb27eb08 DelegStateID 00000000 0000    2047 ffffff31504f2740 0002 0002
ffffff3158771608 File         00000000 16686   2047 ffffff313e6baa40 0001 0001
ffffff315872b850 Lockowner    00000000 341250  2047 ffffff31593701c0 0002 0002
ffffff30fb5751a0 LockStateID  00000000 341250  2047 ffffff30f5fa2140 0002 0002
ffffff31587a09b8 OpenStateID  00000000 20327   2047 ffffff313f5443c0 0003 0003
ffffff31586cc7c0 OpenOwner    00000000 1048576 2047 ffffff315987c340 0001 0001
ffffff3158788410 ClntIP       00000000 0000    2047 ffffff3158239800 0001 0001
ffffff31581dfcd8 Client       00000000 0027    2047 ffffff315858d880 0002 0002
The OpenOwner row above shows a Cnt of 1048576, which means the OpenOwner table has reached its maximum number of available entries.
The table is exhausted because of a defect in which stale OpenOwner entries for active clients are not reaped aggressively enough.
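The check described above can be sketched as a small shell filter. This is only an illustration, not a supported tool: the function name check_rfs4_tables is hypothetical, and the limit of 1048576 is assumed from the sample output in this document rather than read from the appliance.

```shell
#!/bin/sh
# Sketch: flag NFSv4 state tables whose entry count has reached a limit.
# Assumption: LIMIT is the OpenOwner ceiling seen in the sample output above.
LIMIT=1048576

# check_rfs4_tables (hypothetical name) reads saved `::rfs4_db` output on
# stdin and prints "<table name> <count>" for every table row whose Cnt
# column (field 4) is at or above LIMIT.
check_rfs4_tables() {
    awk -v limit="$LIMIT" '
        # Table rows have 8 fields and begin with a hexadecimal address;
        # header and summary lines do not match and are skipped.
        NF == 8 && $1 ~ /^[0-9a-f]+$/ && $4 ~ /^[0-9]+$/ {
            if ($4 + 0 >= limit) print $2, $4 + 0
        }'
}
```

For example, the output of echo '::rfs4_db' | mdb -k could be piped into check_rfs4_tables; with the sample output shown above, only the OpenOwner row would be reported.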
If the customer is unable to upgrade right away, we can identify which clients are consuming the majority of the OpenOwner table entries.
Rebooting those clients will provide temporary relief and restore NFSv4 access.
Here is an example where one client has consumed 999670 of the 1048576 OpenOwner table entries.
From mdb, we will dump the OpenOwner table and use awk, sort, and uniq to summarize the number of entries for each unique clientid.
# mdb -k
> ::rfs4_oo !awk '/clientid=/{print $5}' |sort |uniq -c
7277 clientid=0x10000a94f4c8a1e,
341 clientid=0x10000ec4f4c8a1e,
589 clientid=0x10000ed4f4c8a1e,
256 clientid=0x10000f04f4c8a1e,
29312 clientid=0x10001554f4c8a1e,
524 clientid=0x10001794f4c8a1e,
264 clientid=0x10001a44f4c8a1e,
12 clientid=0x10001b14f4c8a1e,
6620 clientid=0x10001b64f4c8a1e,
3342 clientid=0x10001b74f4c8a1e,
139 clientid=0x10001bc4f4c8a1e,
999670 clientid=0x10001bf4f4c8a1e,
230 clientid=0x10001c04f4c8a1e,
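When many clients are listed, it can help to rank the per-client summary above by entry count. The following is a minimal sketch, assuming the summary has been saved to a file; the function name rank_clients is hypothetical, and the percentage is each client's share of the entries counted in the summary, not of the table limit.

```shell
#!/bin/sh
# Sketch: rank clients by the number of OpenOwner entries they hold.
# rank_clients (hypothetical name) reads `uniq -c` style lines of the form
# "<count> clientid=0x...," on stdin and prints "<count> <clientid> <share>",
# largest consumer first.
rank_clients() {
    awk '{
            id = $2
            sub(/^clientid=/, "", id)   # drop the "clientid=" label
            sub(/,$/, "", id)           # drop the trailing comma
            count[id] = $1
            total += $1
         }
         END {
            # With empty input the loop body never runs, so no division by zero.
            for (id in count)
                printf "%d %s %.1f%%\n", count[id], id, 100 * count[id] / total
         }' | sort -rn
}
```

Usage, assuming the summary was saved to a file named oo_summary.txt (a hypothetical name): rank_clients < oo_summary.txt. The first line of the result is the client holding the most entries.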
In the summary above, clientid 0x10001bf4f4c8a1e holds 999670 of the 1048576 OpenOwner entries. Pass this clientid to the ::rfs4_client dcmd to obtain the memory address of the client's rfs4_client_t structure.
> ::rfs4_client -c 0x10001bf4f4c8a1e
Address dbe clientid confirm_verf NCnfm unlnk cp_confirmed Last Access
ffffff53689841b8 ffffff5368984150 10001bf4f4c8a1e bf01000100000000 False False 0 2012 Apr 6 17:32:07
Address Dbe Client OpenSeq Owner
ffffff594000d058 ffffff594000d008 ffffff53689841b8 2 clientid=0x10001bf4f4c8a1e, owner: 6f70656e2069643a42200bd300bbe150
Address Dbe Client OpenSeq Owner
ffffff594000d278 ffffff594000d228 ffffff53689841b8 2 clientid=0x10001bf4f4c8a1e, owner: 6f70656e2069643aaaa1a9eca66c5fdc
Address Dbe Client OpenSeq Owner
ffffff594000d498 ffffff594000d448 ffffff53689841b8 2 clientid=0x10001bf4f4c8a1e, owner: 6f70656e2069643a5539a05ab95732a5
Address Dbe Client OpenSeq Owner
...
Now print the callback address member of the rfs4_client_t structure at the memory address shown in the Address column of the first row of the previous output (ffffff53689841b8).
> ffffff53689841b8::print rfs4_client_t rc_cbinfo.cb_callback.cb_location.r_addr
rc_cbinfo.cb_callback.cb_location.r_addr = 0xffffff58c9bca1e0 "192.168.10.3.216.239"
The first four octets of the r_addr string above (192.168.10.3) are the IP address of the client that can be rebooted to free entries in the OpenOwner table and restore NFSv4 client access. The last two octets encode the client's callback port.
> $q
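The r_addr value printed above is in the NFSv4 universal address form h1.h2.h3.h4.p1.p2, where the first four fields are the IPv4 address and the port is p1 * 256 + p2. A minimal sketch of splitting such a string follows; the function name uaddr_to_hostport is hypothetical, and only the IPv4 form is handled.

```shell
#!/bin/sh
# Sketch: split an IPv4 universal address "h1.h2.h3.h4.p1.p2" into
# "<ip> <port>". Input with any other number of fields produces no output.
# uaddr_to_hostport is a hypothetical helper name.
uaddr_to_hostport() {
    printf '%s\n' "$1" | awk -F. '
        NF == 6 {
            # The port is carried as two octets: high byte, then low byte.
            printf "%s.%s.%s.%s %d\n", $1, $2, $3, $4, $5 * 256 + $6
        }'
}
```

For the address in this example, uaddr_to_hostport 192.168.10.3.216.239 prints 192.168.10.3 55535, since 216 * 256 + 239 = 55535.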
Back to <Document 1402579.1> Sun Storage 7000 Unified Storage System: How to Troubleshoot NFS Problems.
References
<BUG:6976554>
<NOTE:1402579.1> - Sun Storage 7000 Unified Storage System: How to Troubleshoot NFS Problems