Document Audience: INTERNAL
Document ID: I1099-1
Title: Sun Fire V440 and Netra 440 systems using a specific networking configuration may unexpectedly reset. Sun Alert: Yes
Copyright Notice: Copyright © 2005 Sun Microsystems, Inc. All Rights Reserved
Update Date: 2004-11-23

---------------------------------------------------------------------
            - Sun Proprietary/Confidential: Internal Use Only -
---------------------------------------------------------------------  
                        FIELD INFORMATION NOTICE
               (For Authorized Distribution by Sun Services)
FIN #: I1099-1
Synopsis: Sun Fire V440 and Netra 440 systems using a specific networking configuration may unexpectedly reset. Sun Alert: Yes
Create Date: Aug/18/04 

***** Attention:  

This FIN is now Inactive as it has been OBSOLETED by 
Sun Alert 57618.  Please refer to Sun Alert 57618 for
the latest updates for this issue. (Aug/18/04)  

*****
SunAlert: Yes
Top FIN/FCO Report: Yes
Products Reference: Sun Fire V440, Netra 440
Product Category: Server / Server Component
Product Affected: 
Systems Affected:
-----------------  
Mkt_ID   Platform   Model   Description                    Serial Number
------   --------   -----   -----------                    -------------
  -      A42        ALL     Sun Fire V440                      -
  -      N42        ALL     Netra 440                          -


X-Options Affected:
-------------------
Mkt_ID    Platform   Model     Description                 Serial Number
------    --------   -----     -----------                 -------------
  -           -       -           -                            -
Parts Affected: 
----------------------
Part Number    Description                        Model
-----------    -----------                        -----
540-5919-XX    FRU,ASSY,MOTHERBOARD,NETRA440      -
540-5418-XX    ASSY,MRBD w/CPU cage, CHLPA        -
References: 
BugID: 5039862
ESC:   551088 
Sun Alert: 57618
Issue Description: 
In an extremely limited number of applications, and with a single system
configuration, the Sun Fire V440 or Netra 440 system may reset unexpectedly
and reboot.

The specific configuration which triggers this situation is as follows:

   Some or all of the data being transferred is transported via the first
   onboard Ethernet interface "ce0" (Cassini ASIC).
   
When this issue occurs, the system resets and an error message appears
on the console.  The system then reboots.  No core files are generated,
and the reset output is not logged to the /var/adm/messages file.

At system reset, the error message displayed on the console is:

      Fatal Error Reset   
      SC Alert: Host System has Reset

If it is suspected that the system is experiencing this issue, change the OBP 
variables as follows to provide more verbose output on the next failure.  

Note: The settings below are recommended only to verify whether the system is 
experiencing this issue and should not be used long term.  They make the reset 
output verbose while keeping POST and OpenBoot Diagnostics messages from 
overrunning the ALOM 64 KB circular buffer and displacing the intended 
FATAL RESET output.  Once the failure is verified, set the parameters back to 
the recommended (error-reset, power-on-reset) settings.

     diag-switch?    true
     post-trigger    none
     obdiag-trigger  none
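
The variables above can be set at the OBP "ok" prompt with setenv, or from a
running Solaris instance with the eeprom(1M) command.  The following is a
minimal sketch of the Solaris approach, assuming root access on the affected
host.  The function names are illustrative only and are not part of any Sun
tool.

```shell
#!/bin/sh
# Sketch only: apply the temporary diagnostic settings from this FIN
# using the Solaris eeprom(1M) command. Run as root on the affected host.

apply_verbose_reset_settings() {
    # Quote the names: "?" is a shell wildcard character.
    eeprom 'diag-switch?=true' &&
    eeprom 'post-trigger=none' &&
    eeprom 'obdiag-trigger=none'
}

show_reset_settings() {
    # Given bare variable names, eeprom prints their current values.
    eeprom 'diag-switch?' 'post-trigger' 'obdiag-trigger'
}
```

Neither function runs until it is called.  A typical sequence would be to
record the current values with show_reset_settings, call
apply_verbose_reset_settings, wait for the next failure, and then restore
the recommended values as described in the Note above.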

When the parameters above are set, the error message will include some 
additional information indicating the reset reason as "PBM FATAL", with a 
PCI IO-Bridge register output similar to:

   ha019 console login:

   Fatal Error Reset
   SC Alert: Host System has Reset

   @(#)OBP 4.10.10 2003/08/29 06:25 Sun Fire V440
   Clearing TLBs
   Loading Configuration
   Membase: 0000.0033.0000.0000
   MemSize: 0000.0000.4000.0000
   Init CPU arrays Done
   Init E$ tags Done
   Setup TLB Done
   MMUs ON
   Scrubbing Tomatillo tags... 0 1
   Block Scrubbing Done
   Find dropin, Copying Done, Size 0000.0000.0000.5ca0
   PC = 0000.07ff.f000.4c88
   PC = 0000.0000.0000.4d28
   Find dropin, (copied), Decompressing Done, Size 0000.0000.0006.6700
   ttya initialized
   System Reset: (PBM FATAL)
   JBUS-PCI bridge
   JBUS-PCI bridge
   slave Error Register: 8000000000001000
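
When triaging a suspected case, a saved copy of the console output can be
searched for the signature shown above.  The sketch below assumes the output
has already been captured to a text file (the file name console.log used in
the example is hypothetical) and checks only for the two marker lines that
appear in this FIN.

```shell
#!/bin/sh
# Sketch only: report whether a saved console log contains the reset
# signature described in this FIN. The log file is supplied by the caller.

has_pbm_fatal_reset() {
    # $1: path to a saved console capture (hypothetical file)
    log=$1
    # Return 2 if the file cannot be read.
    [ -r "$log" ] || return 2
    # Both marker lines must be present. In a basic regular expression
    # the parentheses match literally.
    grep 'Fatal Error Reset' "$log" >/dev/null &&
    grep 'System Reset: (PBM FATAL)' "$log" >/dev/null
}
```

For example:

   has_pbm_fatal_reset console.log && echo "matches the FIN I1099-1 signature"

A match only indicates that the log contains the same reset reason.  It does
not by itself confirm that ce0 traffic was the trigger.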

This issue could occur when high activity on the ce0 interface, coupled with 
JBUS activity, generates an error on one of the JBUS control lines.
The error condition that causes this situation is extremely rare and is
configuration specific.

There is currently no permanent resolution.  Customer sites experiencing this
issue should use the workaround procedures provided below.
Implementation: 
         ---
        |   |   MANDATORY (Fully Proactive)
         ---

         ---
        |   |   CONTROLLED PROACTIVE (per Sun Geo Plan)
         ---

         ---
        | X |   REACTIVE (As Required)
         ---
Corrective Action: 
The following recommendation is provided as a guideline for authorized
Sun Services Field Representatives who may encounter the above
mentioned issue.

A long-term corrective action plan is being developed by Sun and will be
delivered via Sun's service team.  In the meantime, use the following
workarounds.  These workarounds require some configuration changes.

The first "ce0" (net0) onboard interface should not be used.  To replace the 
functionality of this interface, use either of the following:

a) Use only the second "ce1" (net1) onboard network interface.  

    OR

b) Install a PCI Ethernet card in any available PCI slot.  Placing the card
in a 33MHz slot may lower performance relative to using the card in a
66MHz slot.

The following Sun card is tested and supported as a workaround for full 
gigabit network replacement functionality:

    X1150A     501-5902-xx Sun GigaSwift Ethernet UTP (Copper)

The following Sun card is tested and supported, but will not provide gigabit
network functionality equivalent to the onboard port:

    X2222A     501-5727-xx Dual Ultra-2 SCSI/Dual FastEthernet PCI Adapter


Additional Procedures:

To ensure that "ce0" (net0) is never accessed inadvertently in a manner that
could trigger this issue (e.g. by SunVTS), it is highly recommended that the
"ce0" interface be completely disabled.  Because of Solaris instance
numbering, it is also recommended that this be done after the initial Solaris
installation, so that net1 is assigned the "ce1" instance instead of "ce0".

To completely disable "ce0" (net0) from the system, use the following 
commands to install an NVRAM script at the OBP "ok" prompt:
1. ok nvedit
2.   0: probe-all install-console banner 
3.   1: " /pci@1c,600000/network@2" $delete-device drop 
4.   2: 
   Type "Ctrl-C" to exit nvedit.
5. ok nvstore
6. ok setenv use-nvramrc? true 
   use-nvramrc? =        true
7. ok reset-all

After the system resets, "ce0" should not be visible to OBP (i.e. you should 
not see a path to "ce0" [/pci@1c,600000/network@2] when you run "show-devs" 
from OBP).  The "ce0" device should also not be seen by Solaris (e.g. in
prtconf or prtpicl output).
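
As a rough cross-check of the procedure above, the sketch below looks for the
ce0 device path in a saved device listing (for example, "show-devs" output
pasted into a file) and confirms from Solaris, via eeprom(1M), that the NVRAM
script is stored and enabled.  The function names and the file name are
illustrative assumptions, and the exact eeprom output format should be
confirmed on the target system.

```shell
#!/bin/sh
# Sketch only: checks related to disabling ce0 with the NVRAM script.

ce0_path_listed() {
    # Reads a device listing (one path per line) on stdin.
    # Succeeds only if the ce0 device path itself is listed.
    grep '/pci@1c,600000/network@2[[:space:]]*$' >/dev/null
}

nvramrc_disables_ce0() {
    # Assumes eeprom prints "name=value" for a variable named on the
    # command line. Fails unless use-nvramrc? is true.
    eeprom 'use-nvramrc?' | grep '^use-nvramrc?=true' >/dev/null || return 1
    # The stored script must contain the $delete-device line for ce0.
    eeprom nvramrc | grep '/pci@1c,600000/network@2" \$delete-device' >/dev/null
}
```

After a successful procedure, the expected result is that

   ce0_path_listed < show-devs.txt

fails (the path is no longer listed) and nvramrc_disables_ce0 succeeds.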
Comments: 
None. 


============================================================================
Implementation Footnote: 
i)   In case of MANDATORY FINs, Sun Services will attempt to    
     contact all affected customers to recommend implementation of 
     the FIN. 
   
ii)  For CONTROLLED PROACTIVE FINs, Sun Services mission critical    
     support teams will recommend implementation of the FIN  (to their  
     respective accounts), at the convenience of the customer. 

iii) For REACTIVE FINs, Sun Services will implement the FIN as the   
     need arises.
----------------------------------------------------------------------------
Status: active