Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-75-1394484.1
Update Date: 2012-06-25
Keywords:

Solution Type  Troubleshooting Sure

Solution  1394484.1 :   Sun Storage 7000 Unified Storage System: Gathering diagnostic data for network problems.  


Related Items
  • Sun Storage 7410 Unified Storage System
  • Sun Storage 7310 Unified Storage System
  • Sun ZFS Storage 7120
  • Sun Storage 7110 Unified Storage System
  • Sun ZFS Storage 7320
  • Sun ZFS Storage 7420
  • Sun Storage 7210 Unified Storage System
Related Categories
  • PLA-Support>Sun Systems>DISK>NAS>SN-DK: 7xxx NAS
  • .Old GCS Categories>Sun Microsystems>Storage - Disk>Unified Storage




In this Document
Purpose
Troubleshooting Steps
 1. Initial Checks of Network Status
 2. Checking Hardware and Logs
 2.1 Appliance Logs
 2.2 Service Logs
 2.3 Client Logs
 3. Network Traces
 4. Further assistance required
References


Applies to:

Sun Storage 7110 Unified Storage System - Version Not Applicable to Not Applicable [Release N/A]
Sun ZFS Storage 7320 - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 7310 Unified Storage System - Version Not Applicable to Not Applicable [Release N/A]
Sun ZFS Storage 7420 - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 7410 Unified Storage System - Version Not Applicable to Not Applicable [Release N/A]
7000 Appliance OS (Fishworks)
NAS head revision : [not dependent]
BIOS revision : [not dependent]
ILOM revision : [not dependent]
JBODs Model : [not dependent]
CLUSTER related : [not dependent]


Purpose

This document explains where to look when gathering data on network problems affecting a 7000 series appliance, or the clients that access an appliance's shares. It covers tools and techniques for collecting that data, either to work towards the cause of the problem yourself or to supply to Oracle Support representatives to assist their diagnosis.

To discuss this information further with Oracle experts and industry peers, we encourage you to review, join or start a discussion in the My Oracle Support Community - 7000 Series ZFS Appliances.

Troubleshooting Steps

1. Initial Checks of Network Status

First, check the status of your network through the BUI or CLI.

BUI :

Configuration > NETWORK > Addresses

This shows the status of your datalinks and interfaces along with IP addresses, subnets and hostnames. A green LED indicates that the interface is active. A yellow LED indicates a failure, or that the interface has been configured with a duplicate IP address.

CLI:

appliance:> configuration net devices show

Shows the status of each physical network interface: whether it is up, the speed at which it is connected, and its MAC address.

appliance:> configuration net datalinks show

Shows whether the link from the network interface out to the network infrastructure is active, the type of datalink (device, LACP, VLAN) and what physical device(s) the link is connected to.

appliance:> configuration net interfaces show

Shows the IP address and subnet of the interface, what datalink(s) it uses, whether it is part of an IPMP group and whether it is active.

If you do see a failure of an interface or datalink, the associated physical device is unlikely to be the cause, because the Network Interface Cards themselves rarely fail. Checking the hardware status and the logs (as shown in section 2) should help to confirm that the appliance hardware is operational. A much more likely cause is a switch or other network component failure external to the appliance, a configuration issue with the external network or the appliance, or a protocol or driver problem.

The next check is therefore whether the appliance routing configuration allows clients to reach the networks on which the shares, or the Management Interface, are available.

BUI :

Configuration > NETWORK > Routing

Check that a green LED appears next to each route, indicating that the route is available.

CLI :

appliance:> configuration net routing show
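
As a rough aid when reading the routing table, the sketch below (Python, intended to be run on a workstation, not on the appliance) tests whether a client address falls inside the subnet of an appliance interface. If it does not, traffic to that client depends on a suitable route or gateway being configured. The function name and the addresses are hypothetical and chosen purely for illustration.

```python
import ipaddress


def next_hop_needed(interface_cidr: str, client_ip: str) -> bool:
    """Return True when client_ip lies outside the interface's subnet.

    interface_cidr is the interface address with its prefix length,
    for example "192.168.10.5/24" (a hypothetical value).
    """
    network = ipaddress.ip_interface(interface_cidr).network
    return ipaddress.ip_address(client_ip) not in network


# Hypothetical addresses, for illustration only.
print(next_hop_needed("192.168.10.5/24", "192.168.10.77"))  # prints False
print(next_hop_needed("192.168.10.5/24", "10.1.2.3"))       # prints True
```

A result of True does not indicate a fault by itself. It only means that a route covering the client's network should be present in the routing configuration.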

From the CLI you can use ping and traceroute to ensure that gateways and clients are reachable from the appliance and to map the routes that are taken.

appliance:> ping <target IP address>
appliance:> traceroute <target IP address>
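
Ping and traceroute confirm reachability at the IP level only. As a complementary check run from a client, the following Python sketch attempts a TCP connection to a service port on the appliance. This is a different technique from ping and does not replace it. The function name is hypothetical, and the port to test is an assumption that depends on the services actually in use.

```python
import socket


def tcp_port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Any socket-level failure (refused, timed out, unreachable, name
    resolution error) is reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, tcp_port_reachable("<appliance IP address>", 2049) tests the usual default port for NFS, and port 445 is the usual default for SMB. A False result with a successful ping suggests looking at the service state or at filtering between client and appliance instead of at basic routing.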

2. Checking Hardware and Logs

The hardware status of the Network Interface Cards can be seen in the "maintenance" section of the BUI or CLI. The network cards are all in PCI slots in the head unit, which appears as chassis-000 in the CLI. Note that these commands do not show the status of the "onboard" network interfaces, only that of the additional network cards installed in the PCI slots:

BUI :

Maintenance > HARDWARE > Slot

CLI :

appliance:> maintenance hardware select chassis-000 select slot show

Any currently active problems that the system has detected are listed under

BUI :

Maintenance > Problems

CLI :

appliance:> maintenance problems show

2.1 Appliance Logs

Logs of appliance-related events are available through the BUI and CLI. The most useful for troubleshooting network problems are the ALERTS, FAULTS and SYSTEM logs:

ALERTS - This is a log of key events during appliance operation
FAULTS - This log records hardware and software faults. Look here to find records of services failing to start and errors from PCI slots that cards may be installed in.
SYSTEM - This is the Operating System log. Problems with drivers and data service protocol messages will be logged here.

In the BUI you can see the contents of these logs by selecting the appropriate log from

Maintenance > LOGS

To check the logs from the CLI use the commands:

appliance:> maintenance logs select alert show
appliance:> maintenance logs select fltlog show
appliance:> maintenance logs select system show

2.2 Service Logs

The BUI also shows logs for the SMF services that are responsible for starting the Data and Directory Services and for applying the System Settings. Check these logs if a service needed in your configuration is shown as offline or faulted. The reason the service did not start should be shown in the appropriate log. For instance, if the FAULTS log above showed that the NFS service svc:/network/nfs/server:default was offline, you would check

Configuration > SERVICES > NFS > Logs

and from the drop-down box listing the various services that comprise the NFS service you would select "Log of network-nfs-server:default".


If the Management Interface is not available, check the logs from the OS shell.
To check for system faults, look at
/var/ak/logs/debug.sys
/var/ak/logs/system.sys

To check the alerts log, /var/ak/logs/alert.ak, use the aklog utility. Archived alert logs can be read by specifying the path to the log, e.g.:

# aklog alert
# aklog /var/ak/logs/alert.ak.0

Currently active faults can be seen using the fmadm command. A historical listing of faults that have been found on the system, but may no longer be present, can be seen with the -a switch, e.g.:

# fmadm faulty
# fmadm faulty -a

The status of the services can be checked with

# svcs -xv

to show any faulted services. Check the log file specified in the output from the above command for the reasons why the service was faulted.
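
When several services are faulted, it can help to pull the log file paths out of saved svcs -xv output. The Python sketch below assumes the output has been captured to text and that log files appear on "See:" lines as absolute paths. The sample text is illustrative only, not taken from a real appliance, and the exact layout on a given release may differ. The function name is hypothetical.

```python
from typing import Dict, List

# Illustrative sample only; real output may be laid out differently.
SAMPLE = """\
svc:/network/nfs/server:default (NFS server)
 State: maintenance since Mon Jun 25 10:00:00 2012
Reason: Start method failed repeatedly, last exited with status 1.
   See: http://sun.com/msg/SMF-8000-KS
   See: /var/svc/log/network-nfs-server:default.log
Impact: This service is not running.
"""


def log_files_by_service(svcs_output: str) -> Dict[str, List[str]]:
    """Map each service FMRI to the log file paths listed beneath it."""
    result: Dict[str, List[str]] = {}
    current = None
    for line in svcs_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("svc:/"):
            # A new service entry begins; the FMRI is the first token.
            current = stripped.split()[0]
            result[current] = []
        elif stripped.startswith("See:") and current is not None:
            target = stripped[len("See:"):].strip()
            # Keep only absolute file paths, skipping URLs and man pages.
            if target.startswith("/"):
                result[current].append(target)
    return result


for fmri, logs in log_files_by_service(SAMPLE).items():
    print(fmri, logs)
```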

It may be necessary to enable debugging for some network services to assist diagnosis of more complex problems. Please see the Beehive Amber Road Support wiki for further details on this. If you are not able to view the wiki, please contact a member of the NAS support team to assist you.

2.3 Client Logs

Often the most useful error messages in troubleshooting network problems are those displayed by the clients. They help to isolate the source of a problem when the root cause lies in the network infrastructure and not in the appliance.
The location of client logs depends on the operating system of the client. The usual locations of the system logs include:

Linux - /var/log/messages
Solaris - /var/adm/messages
Windows - The "system log" viewable through the "Event Viewer".

Please collect all examples of client log messages relating to the network problem when engaging Oracle Support for further troubleshooting assistance.
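
To narrow a large client log down to the messages worth attaching to a Service Request, a simple keyword filter is often enough. The Python sketch below is one possible approach. The keyword list, the function name and the sample lines are assumptions made for illustration and should be adapted to the protocol and client in question.

```python
import re

# Assumed keyword list; extend it for the protocol being investigated.
KEYWORDS = re.compile(
    r"nfs|smb|cifs|link (up|down)|not responding|timed out|unreachable",
    re.IGNORECASE,
)


def network_related(lines):
    """Return the lines that mention any of the network keywords."""
    return [line.rstrip("\n") for line in lines if KEYWORDS.search(line)]


# Hypothetical log lines, for illustration only.
SAMPLE_LINES = [
    "Jun 25 10:01:02 client1 kernel: nfs: server appliance not responding, still trying",
    "Jun 25 10:01:05 client1 sshd[123]: Accepted publickey for admin",
    "Jun 25 10:02:10 client1 kernel: eth0: link down",
]

for line in network_related(SAMPLE_LINES):
    print(line)
```

On a Linux or Solaris client the same function can be applied to an open file object for /var/log/messages or /var/adm/messages respectively, since iterating over a text file yields its lines.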

3. Network Traces

Where the reason a client cannot access a share is less obvious, a network trace is often necessary. Usually a trace is needed from both the client involved and the appliance. The method for acquiring the trace from the client depends on the client's operating system and the protocol in use. Obtaining a network trace from the appliance requires assistance from an Oracle Support representative, so a Service Request will need to be opened.
Please see <Document 1398376.1> "Sun Storage 7000 Unified Storage System: How to get a network trace to assist in troubleshooting network problems" for more details.

4. Further assistance required

Finally, if the cause of the network problem still cannot be isolated, please engage Oracle Support by opening a Service Request. In the SR notes, include an accurate problem description and all relevant details, including examples of any errors seen on the appliance and clients.
If the problem is complex then a support bundle should also be obtained and uploaded to Oracle.  Please see <Document 1019887.1> "Sun Storage 7000 Unified Storage System: How to collect supportfile bundle using the BUI or CLI".
To upload a network trace or other diagnostic file other than a support bundle please see <Document 1020199.1> "How to Upload Data to Oracle Such as Explorer and Core Files".

 

Back to <Document 1392086.1> Sun Storage 7000 Unified Storage System: How to Troubleshoot Network Problems.

This document contains normalized content and is managed by the Domain Lead(s) of the respective domains. To notify content owners of a knowledge gap contained in this document, and/or prior to updating this document, please contact the domain engineers that are managing this document via the "Document Feedback" alias(es)

References

<NOTE:1019887.1> - Sun Storage 7000 Unified Storage System: How to collect a supportbundle using the BUI or CLI
<NOTE:1020199.1> - How to Upload Data to Oracle Such as Explorer and Core Files
<NOTE:1392086.1> - Sun Storage 7000 Unified Storage System: How to Troubleshoot Network Problems
<NOTE:1398376.1> - Sun Storage 7000 Unified Storage System: How to get a network trace to assist in troubleshooting network problems

Attachments
This solution has no attachment
  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.