Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-75-1396100.1
Update Date:2012-06-25
Keywords:

Solution Type  Troubleshooting Sure

Solution  1396100.1 :   Sun Storage 7000 Unified Storage System: Causes and Solutions for Well Known General Networking Problems  


Related Items
  • Sun Storage 7310 Unified Storage System
  • Sun Storage 7410 Unified Storage System
  • Sun ZFS Storage 7120
  • Sun ZFS Storage 7420
  • Sun Storage 7110 Unified Storage System
  • Sun ZFS Storage 7320
  • Sun Storage 7210 Unified Storage System
Related Categories
  • PLA-Support>Sun Systems>DISK>NAS>SN-DK: 7xxx NAS
  • .Old GCS Categories>Sun Microsystems>Storage - Disk>Unified Storage




In this Document
Purpose
Troubleshooting Steps
 1. General Checks for Network Outages
 2. Loss of network following configuration change
 3. One or more interfaces in an IPMP group are marked as failed
 4. Other causes of network problems
 Packet loss when using Link Aggregation with devices connected to separate network switches:
 Loss of network connectivity to either or both Management Interface and Client Access to shares when using SNMP to monitor many interfaces:
 nge interfaces may disappear:
 Network Devices only connect at half-duplex:
 On a cluster setting "allow admin" may cause loss of default route on the peer head:
 Loss of network configuration ability in a 7410 cluster:
References


Applies to:

Sun Storage 7310 Unified Storage System - Version Not Applicable to Not Applicable [Release N/A]
Sun ZFS Storage 7420 - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 7210 Unified Storage System - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 7110 Unified Storage System - Version Not Applicable to Not Applicable [Release N/A]
Sun ZFS Storage 7320 - Version Not Applicable to Not Applicable [Release N/A]
7000 Appliance OS (Fishworks)
NAS head revision : [not dependent]
BIOS revision : [not dependent]
ILOM revision : [not dependent]
JBODs Model : [not dependent]
CLUSTER related : [not dependent]


Purpose

This document lists some of the more common causes of loss of general network connectivity, that is, cases where the problem is not specific to one particular data service protocol or to clients of a particular directory service, and where it affects both the Management Interface and client data share access equally.

To discuss this information further with Oracle experts and industry peers, we encourage you to review, join or start a discussion in the My Oracle Support Community - 7000 Series ZFS Appliances

Troubleshooting Steps

1. General Checks for Network Outages

First rule out problems with the network infrastructure.
  • Check switch ports and switches for problems.
  • Check network cables for loose or broken connections.
  • Try plugging a different cable into a different switch port if at all possible.
  • If a NIC port on the appliance is still suspected try using a different NIC port on the appliance and configuring a datalink and interface for that device if at all possible.
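
Alongside the checks above, it can help to confirm from a client whether the outage really is general, meaning that it affects both the Management Interface and data access. The following is a minimal Python sketch of such a check, not an appliance tool. It assumes the BUI listens on TCP port 215 and that NFS on TCP port 2049 stands in for the data service; the function names and the host names in the usage comment are hypothetical.

```python
import socket

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False

def classify(mgmt_ok, data_ok):
    """Describe the outage based on which of the two paths is reachable."""
    if mgmt_ok and data_ok:
        return "both reachable"
    if not mgmt_ok and not data_ok:
        return "general network problem"
    return "management only affected" if data_ok else "data access only affected"

def check(mgmt_host, data_host, mgmt_port=215, data_port=2049):
    # Assumed ports: 215 for the BUI, 2049 for NFS over TCP.
    return classify(port_open(mgmt_host, mgmt_port),
                    port_open(data_host, data_port))

# Hypothetical usage:
#   print(check("appliance-mgmt.example.com", "appliance-data.example.com"))
```

Only the "general network problem" result matches the scope of this document. The other results point towards a protocol-specific or interface-specific issue.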

If the network infrastructure can be ruled out then there are some further issues that can affect network connectivity on the appliance.

Sometimes memory exhaustion, a poor choice of pool layout or capacity issues can deplete resources so severely that the data service protocols time out before a reply can be sent to a client's request. These problems are covered in detail in the documents for performance troubleshooting and for troubleshooting ZFS storage pool faults.

See the following documents as starting points for troubleshooting these kinds of problems:

<Document 1331769.1> "Sun Storage 7000 Unified Storage System: How to Troubleshoot Performance Issues"
<Document 1388529.1> "Sun Storage 7000 Unified Storage System: How to Troubleshoot ZFS Storage Pool Faults"

2. Loss of network following configuration change

Because of the way the four-layer network model on the appliance works, a change at the device or datalink level requires the system to remove the interface and routing levels before the lower-level change can be applied. The interface and routing information is then applied again once the change has been made. However, if the change cannot be applied for some reason, the interface and routing layers cannot be reapplied and network connectivity is lost. An example is changing an existing datalink into a Link Aggregation by adding another device: if the switch that the devices are connected to does not use LACP and the correct properties are not selected, the link will not become active and the interface cannot be reapplied. For further details on the possible impact on routing of changing or removing a datalink, see also:
<Document 1165144.1> "Sun Storage 7000 Unified Storage System: Networking: Default or static routes lost".
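
The sequence described in this section can be illustrated with a small conceptual model. This is only a sketch in Python of the behaviour as described here, not the appliance's implementation; the function name and the set-based representation are assumptions made for illustration.

```python
# Layer names taken from the description above, listed bottom to top.
LAYERS = ("device", "datalink", "interface", "routing")

def apply_lower_level_change(active_layers, change_succeeds):
    """Model a change at the device or datalink level.

    active_layers: set of layer names applied before the change.
    change_succeeds: whether the new datalink becomes active.
    Returns the set of layers applied after the change.
    """
    # Step 1: the upper layers are removed before the change is applied.
    layers = set(active_layers) - {"interface", "routing"}
    # Step 2: the lower-level change is attempted.
    if not change_succeeds:
        # The link never becomes active, so nothing above it can return.
        layers.discard("datalink")
        return layers
    # Step 3: the interface and routing information is applied again.
    return layers | {"interface", "routing"}

before = set(LAYERS)
print(sorted(apply_lower_level_change(before, change_succeeds=True)))
print(sorted(apply_lower_level_change(before, change_succeeds=False)))
```

In the failing case only the device layer remains, which corresponds to the loss of connectivity described above (for example, an aggregation whose LACP settings do not match the switch).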

3. One or more interfaces in an IPMP group are marked as failed

If probe-based failure detection is used for interfaces in an IPMP group, some or all of the interfaces may intermittently be marked as failed. This is often because a default router is being used as the target for the ICMP probes sent from the test interfaces in the IPMP group. Such routers are often "intelligent" switches that prioritise network traffic when they become busy. ICMP traffic is given a lower priority, which can push the return times of the ICMP probes above the timeout value used to decide whether the interface has failed.
The recommendation is to use link-based IPMP failure detection instead. See <Document 1395461.1> "Sun Storage 7000 Unified Storage System: Best Practice Recommendations for Network Configuration" for details.
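
To see why a busy probe target can produce false failures, consider the following simplified Python model of probe-based detection. It is a sketch only: the timeout, the rule of several consecutive missed probes, and the sample round-trip times are illustrative assumptions and are not values taken from the appliance.

```python
def probe_failures(rtts_ms, timeout_ms, max_consecutive_misses=3):
    """Return the probe indices at which the interface would be marked failed.

    rtts_ms: round-trip time of each probe in milliseconds (None = no reply).
    timeout_ms: a reply slower than this counts as a missed probe.
    max_consecutive_misses: assumed number of misses in a row that
        triggers a failure.
    """
    misses = 0
    failed_at = []
    for i, rtt in enumerate(rtts_ms):
        if rtt is None or rtt > timeout_ms:
            misses += 1
            if misses == max_consecutive_misses:
                failed_at.append(i)
        else:
            misses = 0  # a timely reply resets the counter
    return failed_at

# Illustrative data: the target deprioritises ICMP for three probes.
rtts = [2, 3, 2, 250, 300, 260, 3, 2]
print(probe_failures(rtts, timeout_ms=200))  # failure declared at index 5
print(probe_failures(rtts, timeout_ms=400))  # no failure declared
```

The link itself is healthy throughout. Only the delayed ICMP replies cause the failure, which is why link-based detection avoids this class of problem.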

4. Other causes of network problems

Packet loss when using Link Aggregation with devices connected to separate network switches:

This is not a valid configuration for most types of switch; see:
<Document 1382065.1> "Aggregation with each interface connected to a separate network switch".

Loss of network connectivity to either or both Management Interface and Client Access to shares when using SNMP to monitor many interfaces:

If SNMP is being used to monitor the appliance and a large number of network interfaces are configured, the SNMP service can run into problems and consume excessive CPU resources. Please see:
<Document 1397764.1> "Sun Storage 7000 Unified Storage System: Appliance may hang, run slowly or lose network connectivity when using SNMP to monitor many network interfaces".

nge interfaces may disappear:

If nge interfaces are intermittently failing on 7x10 Unified Storage appliances and you are running an appliance kit version earlier than 2010.08.17.3.1, see:
<Document 1397685.1> "Sun Storage 7000 Unified Storage System: nge interfaces may disappear from the appliance under certain conditions".
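
The version comparison above can be sketched in Python as follows. This is a minimal example that assumes the appliance kit version is available as a purely numeric dotted string; the function names are hypothetical, and any suffix after the dotted part would need to be stripped first.

```python
def parse_version(version):
    """Turn a dotted numeric string such as '2010.08.17.3.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))

def earlier_than(installed, threshold="2010.08.17.3.1"):
    """Return True if the installed version is earlier than the threshold."""
    return parse_version(installed) < parse_version(threshold)

print(earlier_than("2010.08.17.2.1"))  # True: earlier than the threshold
print(earlier_than("2010.08.17.3.1"))  # False: equal to the threshold
```

Comparing tuples of integers avoids the pitfalls of comparing the raw strings, where for example "10" would sort before "9".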

Network Devices only connect at half-duplex:

If the BUI shows that your network devices are only connecting at half-duplex, please see:
<Document 1372016.1> "Sun Storage 7000 Unified Storage System: The network connection keeps getting set to half duplex".

On a cluster setting "allow admin" may cause loss of default route on the peer head:

Please see:
<Document 1321373.1> "Sun ZFS Storage Appliance: Default route vanishes on one node when network configuration is changed on other node".

Loss of network configuration ability in a 7410 cluster:

This may be because of mismatched network device names between the cluster heads. Please see:
<Document 1022238.1> "Sun Storage 7410 Recovery procedure for mismatched network device names".

 

Back to <Document 1392086.1> Sun Storage 7000 Unified Storage System: How to Troubleshoot Network Problems.


This document contains normalized content and is managed by the Domain Lead(s) of the respective domains. To notify content owners of a knowledge gap contained in this document, and/or prior to updating this document, please contact the domain engineers that are managing this document via the "Document Feedback" alias(es)

References

<NOTE:1331769.1> - Sun Storage 7000 Unified Storage System: How to Troubleshoot Performance Issues
<NOTE:1388529.1> - Sun Storage 7000 Unified Storage System: How to Troubleshoot ZFS Storage Pool Issues
<NOTE:1372016.1> - Sun Storage 7000 Unified Storage System: The network connection keeps getting set to half duplex
<NOTE:1382065.1> - Aggregation with each interface connected to a separate network switch
<NOTE:1165144.1> - Sun Storage 7000 Unified Storage System: Networking - Default or static routes lost
<NOTE:1321373.1> - Sun Storage 7000 Unified Storage System: Default route vanishes on one cluster node when network configuration is changed on other node
<NOTE:1392086.1> - Sun Storage 7000 Unified Storage System: How to Troubleshoot Network Problems
<NOTE:1395461.1> - Sun Storage 7000 Unified Storage System: Best Practice Recommendations for Network Configuration
<NOTE:1397685.1> - Sun Storage 7000 Unified Storage System: nge network interfaces may disappear from the appliance under certain conditions
<NOTE:1397764.1> - Sun Storage 7000 Unified Storage System: Appliance may hang, run slowly or lose network connectivity when using SNMP to monitor many network interfaces
<NOTE:1022238.1> - Sun Storage 7410 Recovery procedure for mismatched network device names

Attachments
This solution has no attachment
  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.