Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-75-1398376.1
Update Date:2012-06-25

Solution Type  Troubleshooting Sure

Solution  1398376.1 :   Sun Storage 7000 Unified Storage System: How to get a network trace to assist in troubleshooting network problems  


Related Items
  • Sun Storage 7310 Unified Storage System
  • Sun Storage 7410 Unified Storage System
  • Sun ZFS Storage 7120
  • Sun ZFS Storage 7320
  • Sun ZFS Storage 7420
  • Sun Storage 7110 Unified Storage System
  • Sun Storage 7210 Unified Storage System

Related Categories
  • PLA-Support>Sun Systems>DISK>NAS>SN-DK: 7xxx NAS
  • .Old GCS Categories>Sun Microsystems>Storage - Disk>Unified Storage




In this Document
Purpose
Troubleshooting Steps
 Client side trace
 Oracle Solaris
 Wireshark
 Appliance side network trace
References


Applies to:

Sun ZFS Storage 7320 - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 7310 Unified Storage System - Version Not Applicable to Not Applicable [Release N/A]
Sun ZFS Storage 7420 - Version Not Applicable to Not Applicable [Release N/A]
Sun ZFS Storage 7120 - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 7110 Unified Storage System - Version Not Applicable to Not Applicable [Release N/A]
7000 Appliance OS (Fishworks)
NAS head revision : [not dependent]
BIOS revision : [not dependent]
ILOM revision : [not dependent]
JBODs Model : [not dependent]
CLUSTER related : [not dependent]


Purpose

This document describes procedures and tools for capturing network traces, which may be needed when troubleshooting complex network problems.

To discuss this information further with Oracle experts and industry peers, we encourage you to review, join or start a discussion in the My Oracle Support Community - 7000 Series ZFS Appliances

Troubleshooting Steps

It is sometimes necessary to collect network traces in order to troubleshoot a network problem successfully. Traces can be collected from both ends: one on the client and one on the appliance.
The options chosen when collecting the trace depend on the protocols and networks that are affected.
Make sure that the traces are captured while the network problem is actually happening. Traces captured after the problem has disappeared are of no use.

Client side trace

The tools and commands used to collect network traces on the client side depend on the client operating system. Once the data has been collected, upload it to Oracle - please see <Document 1020199.1> "How to Upload Data to Oracle Such as Explorer and Core Files".

Oracle Solaris

If the client is running Oracle Solaris then a network trace can be obtained using the snoop command. The options passed to snoop should ensure that only packets exchanged with the appliance, via the network interface that connects to the appliance's data network, are captured. If the problem is specific to certain protocols or ports, the capture can be restricted to these as well. This makes it easier to identify where the problem is occurring, rather than having the relevant packets lost in a large amount of output.
The snoop data should be written to a binary output file using the -o <output file> option. The -s option truncates each captured packet to the given length (200 bytes in the example below). This can be useful on a very active network, where capturing less of each packet reduces the chance of snoop dropping packets - see snoop(1M).
For example, the command:

# snoop -P -d hme0 -o snoop-output [-s 200] <appliance hostname or IP address on data network>


will collect all packets on interface hme0 that are sent to or from the appliance on the network that the data shares are shared upon. If the -P option and the appliance host information were not specified then all packets to all hosts sent or received on interface hme0 would be captured. The output is written to the file "snoop-output", which can then be uploaded to Oracle Support.
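
As a rough sketch of the protocol and port restriction described above, the filter expression can be built from the appliance address plus an optional port. The helper name snoop_filter, the address 192.0.2.10 and the output file name snoop-nfs.cap are illustrative assumptions, not part of the original procedure. The host, port and and keywords are standard snoop filter primitives.

```shell
# snoop_filter APPLIANCE [PORT]
# Hypothetical helper: always restrict to the appliance, optionally to one port.
snoop_filter() {
    appliance=$1
    port=${2:-}
    if [ -n "$port" ]; then
        echo "host $appliance and port $port"
    else
        echo "host $appliance"
    fi
}

APPLIANCE=192.0.2.10   # hypothetical appliance address on the data network
IFACE=hme0             # client interface used in the example above

# Print the command for review rather than running it; snoop needs root.
echo "snoop -P -d $IFACE -o snoop-nfs.cap $(snoop_filter "$APPLIANCE" 2049)"
```

Port 2049 (NFS) is used here only as an example. Omitting the second argument gives a capture of all traffic exchanged with the appliance.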

Wireshark

Wireshark is a freely available program that can be used to capture network traces. Versions are available for Windows, Oracle Solaris and several popular distributions of Linux. Please note that Wireshark is a third-party program and is not supported in any way by Oracle. However, traces captured using Wireshark can be analyzed by Oracle Support.
Again, the Wireshark capture options should be used to restrict the packets collected to those exchanged with the appliance's IP address via the network interface connected to the same data network as the appliance.
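
One possible way to apply this restriction from the command line is tshark, the terminal capture tool shipped with Wireshark. The sketch below only builds and prints the command. The interface name eth0, the address 192.0.2.10 and the file name appliance-trace.pcap are assumptions chosen for illustration.

```shell
APPLIANCE=192.0.2.10   # hypothetical appliance address on the data network
IFACE=eth0             # hypothetical client interface on that network

# -i selects the interface, -f sets a capture filter, -w writes a capture file.
TSHARK_CMD="tshark -i $IFACE -f 'host $APPLIANCE' -w appliance-trace.pcap"

# Print the command for review; running it normally needs capture privileges.
echo "$TSHARK_CMD"
```

The same capture filter string, host followed by the appliance address, can be entered in the capture filter field of the Wireshark graphical interface.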

Appliance side network trace

Capturing a trace from the appliance involves running commands from the OS shell, so it is necessary to raise a Service Request with Oracle first.


Once in the OS shell, do the following.

# cd /var/ak/dropbox

Example 1: Capture network packets to a particular host.
# snoop -Pq -d <data network interface> -o <output file> <client address>

Where:
<data network interface> is the network interface on the network that the shares are shared out upon. Make sure that if there are several of these that the correct interface is chosen for the group of clients that are experiencing problems.
<client address> is the IP address or host name of a particular client that is experiencing the problems.
The -P option captures packets in non-promiscuous mode: only broadcast, multicast, or packets addressed to the machine running snoop (here, the appliance) are seen. The <client address> filter then restricts the capture to traffic to or from that client.
The -q option uses "quiet mode", so that the running packet count is not displayed on screen while packets are written to the specified output file.
Use Ctrl-C to stop the snoop.

Example 2: Capture network packets except those on particular ports whilst running the snoop in the background.
# snoop -q -d nge0 -o /var/ak/dropbox/snoop-nge0.cap ! port 22 and ! port 215 and ! port 2049 &
# snoop -q -d nge1 -o /var/ak/dropbox/snoop-nge1.cap not port 22 and not port 215 and not port 2049 &
# jobs
Running snoop -d nge0 -o /var/ak/dropbox/snoop-nge0.cap ! port 22 and ! port 215 and ! port 2049
Running snoop -d nge1 -o /var/ak/dropbox/snoop-nge1.cap not port 22 and not port 215 and not port 2049
# kill %1 %2   # alternatively: pkill snoop
1- Done snoop -d nge0 -o /var/ak/dropbox/snoop-nge0.cap ! port 22 and ! port 215 and ! port 2049
2+ Done snoop -d nge1 -o /var/ak/dropbox/snoop-nge1.cap not port 22 and not port 215 and not port 2049

Where:
-q = quiet
! = not
22 = ssh
215 = BUI interface
2049 = nfs (remove this exclusion from the filter if nfs is the problem)

This will capture packets to and from all hosts on the networks that the nge0 and nge1 devices are attached to, but will not capture ssh, BUI or nfs packets.
Using "&" runs the jobs in the background. This is useful if the snoop command will run for a long period of time, as it allows other commands to be run in a shared shell session while the snoop continues in the background.
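
Where a capture only needs to run for a fixed period, the background-and-kill pattern above could be wrapped as follows. This is a minimal sketch: the function name capture_for is hypothetical, and it assumes that stopping snoop with the default kill signal is acceptable, as in Example 2.

```shell
# capture_for SECONDS COMMAND [ARGS...]
# Run COMMAND in the background, let it run for SECONDS, then stop it.
capture_for() {
    duration=$1
    shift
    "$@" &
    capture_pid=$!
    sleep "$duration"
    # Stop only the process started above, not every snoop on the system.
    kill "$capture_pid" 2>/dev/null || true
    # Reap the background process so it does not linger as a zombie.
    wait "$capture_pid" 2>/dev/null || true
}
```

For instance, capture_for 300 snoop -q -d nge0 -o /var/ak/dropbox/snoop-nge0.cap not port 22 and not port 215 would capture for five minutes and then stop. Unlike pkill snoop, this leaves any other snoop sessions running.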

If the snoop output is captured in a file in /var/ak/dropbox then it will be collected by a support bundle the next time that a bundle is run.
If the problem happens intermittently then it may be useful to create a workflow containing the snoop command that can be given to the customer to run via the BUI when they experience the problem.

Once you have the snoop output files collected in a bundle they can be viewed with the command:

# snoop -ri <output file> | less
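
Large captures are easier to read when narrowed down at display time. The sketch below builds two such commands, assuming a capture file named snoop-nge0.cap (a hypothetical name). The -t a option prints absolute timestamps, -V prints one summary line per protocol layer, and -p selects a range of packet numbers. A filter expression such as port 2049 can also be applied when reading a file.

```shell
CAPTURE=snoop-nge0.cap   # hypothetical file name extracted from the bundle

# Summary lines for NFS traffic only, with absolute (wall-clock) timestamps.
NFS_VIEW="snoop -ri $CAPTURE -t a port 2049"

# One summary line per protocol layer, for packets 1 through 20 only.
FIRST_20="snoop -ri $CAPTURE -V -p 1,20"

# Print the commands for review; each would normally be piped to less.
echo "$NFS_VIEW | less"
echo "$FIRST_20 | less"
```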


For further examples and more detail see the Amber Road Rotation Snoop page.

Back to <Document 1392086.1> Sun Storage 7000 Unified Storage System: How to Troubleshoot Network Problems.

References

<NOTE:1020199.1> - How to Upload Data to Oracle Such as Explorer and Core Files
<NOTE:1392086.1> - Sun Storage 7000 Unified Storage System: How to Troubleshoot Network Problems

Attachments
This solution has no attachment
  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.