Document Audience: | INTERNAL |
Document ID: | I0768-1 |
Title: | Sun Cluster 2.X software does not support the hardware DR features of Sun Enterprise Servers |
Copyright Notice: | Copyright © 2005 Sun Microsystems, Inc. All Rights Reserved |
Update Date: | 2002-02-06 |
---------------------------------------------------------------------
- Sun Proprietary/Confidential: Internal Use Only -
---------------------------------------------------------------------
FIELD INFORMATION NOTICE
(For Authorized Distribution by SunService)
FIN #: I0768-1
Synopsis: Sun Cluster 2.X software does not support the hardware DR features of Sun Enterprise Servers
Create Date: Feb/06/01
Keywords:
Sun Cluster 2.X software does not support the hardware DR features of Sun Enterprise Servers
SunAlert: No
Top FIN/FCO Report: No
Products Reference: DR features on Sun Enterprise Servers
Product Category: Server / SW Admin
Product Affected:
Systems Affected
----------------
Mkt_ID   Platform   Model   Description             Serial Number
------   --------   -----   ---------------------   -------------
  -      E3000      ALL     Ultra Enterprise 3000         -
  -      E3500      ALL     Ultra Enterprise 3500         -
  -      E4000      ALL     Ultra Enterprise 4000         -
  -      E4500      ALL     Ultra Enterprise 4500         -
  -      E5000      ALL     Ultra Enterprise 5000         -
  -      E5500      ALL     Ultra Enterprise 5500         -
  -      E6000      ALL     Ultra Enterprise 6000         -
  -      E6500      ALL     Ultra Enterprise 6500         -
X-Options Affected
------------------
Mkt_ID   Platform   Model   Description                   Serial Number
------   --------   -----   ---------------------------   -------------
X1073       -         -     SC 2.1 SCI/SBUS BOARD               -
X1252       -         -     PDB CLUSTER FOUND PKG EXX00         -
X1303       -         -     PDB CFP ENTERPRISE EXX00            -
Parts Affected:
Part Number   Description                     Model
-----------   -----------------------------   -----
300-1260-0X   Power/Cooling Module 300W         -
300-1301-03   Power Supply 184W                 -
300-1444-0X   Power/Cooling Module 300W SF+     -
370-2345-0X   SCI Communications Adapter        -
370-2868-0X   SCI Communications Adapter        -
References:
BugId: 4245807 - Replacement of redundant power supply causes SCI link
                 failure.
       4283290 - SCI goes to link down when inserting power supply.
ESC:   520967 - Replacement of redundant power supply causes SCI link failure.
       523082 - SCI goes to link down when inserting power supply.
DOC: 805-6512-05: Sun Enterprise Cluster System Hardware Service Manual.
URL: http://suncluster.eng/products/SC2.2/fcs_docs/hardware_service/805-6512.pdf
Issue Description:
When replacing a power supply or other component in a Sun Enterprise
Exx00 server that is part of a Sun Cluster 2.x configuration, it is
important to follow the proper maintenance procedure. Failure to do so
may cause the SCI Adapter board to reset and the system to become
disconnected from the cluster, which can disrupt availability of
customer applications.
Sun Enterprise servers support Dynamic Reconfiguration (DR), or
hot-swapping, of defective power supplies. This works well unless the
system is part of a Sun Cluster configuration. If a new power supply
is inserted while the system is still attached to the cluster, the SCI
Adapter board might reset, dropping the system from the cluster.
A typical error message appears on the console as:
NOTICE: SCI Adapter 0 : The SCI link is not operational int-csr=0x2000000
cur-csr=0x40419c0 cur-csr-masked=0x4000000
NOTICE: SCI Adapter 0 : Reset, Sync or CRC Error
NOTICE: SCI Adapter 0 : Resetting SCI link
NOTICE: ID[SUNWcluster.sma.smak.4001]: SCI Adapter 0: Card not operational
NOTICE: ID[SUNWcluster.sma.smak.4051]: SCI Adapter 0: Link not operational
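As an illustration only, the console messages above can be searched for
in a saved console or messages log. The following Python sketch is not
part of any Sun tool; the function name find_sci_failures and the sample
text are hypothetical, and the patterns are taken directly from the
messages quoted in this FIN.

```python
import re

# Patterns copied from the console messages quoted in this FIN.
# "\s*:" tolerates both "Adapter 0 :" and "Adapter 0:" spellings.
SCI_FAILURE_PATTERNS = [
    re.compile(r"SCI Adapter (\d+)\s*: The SCI link is not operational"),
    re.compile(r"SCI Adapter (\d+)\s*: Reset, Sync or CRC Error"),
    re.compile(r"SCI Adapter (\d+)\s*: Resetting SCI link"),
    re.compile(r"SCI Adapter (\d+)\s*: Card not operational"),
    re.compile(r"SCI Adapter (\d+)\s*: Link not operational"),
]


def find_sci_failures(log_text):
    """Return a list of (line_number, adapter_id, line) for matching lines."""
    hits = []
    for lineno, line in enumerate(log_text.splitlines(), start=1):
        for pattern in SCI_FAILURE_PATTERNS:
            match = pattern.search(line)
            if match:
                hits.append((lineno, int(match.group(1)), line.strip()))
                break  # one hit per line is enough
    return hits


if __name__ == "__main__":
    # Hypothetical sample input built from the messages in this FIN.
    sample = (
        "NOTICE: SCI Adapter 0 : Resetting SCI link\n"
        "NOTICE: ID[SUNWcluster.sma.smak.4051]: SCI Adapter 0: Link not operational\n"
    )
    for lineno, adapter, text in find_sci_failures(sample):
        print(lineno, adapter, text)
```

A match indicates only that the SCI link reported a fault; whether the
node actually left the cluster still needs to be confirmed on the system.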
This problem may impact any Sun Enterprise Server (E3x00, E4x00, E5x00
and E6x00) with an SCI SBus Communications card AND Sun Cluster 2.x
software.
Use either of the following commands to determine whether a system has
this combination of hardware and software and may therefore experience
the problem.
. Option 1: /usr/platform/sun4u/sbin/prtdiag -v
. Option 2: /opt/SUNWsma/bin/get_ci_status
Sun Cluster 2.x software was not designed to support the hardware DR
features of Sun Enterprise servers. Because the cluster software
already supports Node Level Failover, support for Sun Enterprise
hardware DR was considered unnecessary. Supporting it could also have
caused other realtime software problems.
The proper procedure for replacing defective server components in one
of these affected systems is provided in the Sun Enterprise Cluster System
Hardware Service Manual. Briefly, the process involves first taking
the affected node out of the cluster and powering the unit down before
attempting to service it. After the unit has been serviced,
power it up and re-introduce the node into the cluster.
Implementation:
---
| | MANDATORY (Fully Proactive)
---
---
| | CONTROLLED PROACTIVE (per Sun Geo Plan)
---
---
| X | REACTIVE (As Required)
---
Corrective Action:
The following recommendation is provided as a guideline for authorized
Enterprise Services Field Representatives who may encounter the above
mentioned problem.
Please be aware of and follow the cluster hardware maintenance
procedures as described in Chapters 2, 3 and 9 of the Sun Enterprise
Cluster System Hardware Service Manual, P/N 805-6512-05.
http://suncluster.eng/products/SC2.2/fcs_docs/hardware_service/805-6512.pdf
In order to replace server components, please adhere to the following
procedure:
1. System administrator switches over the data services to another node
in the cluster and removes the affected node from the cluster.
2. Perform the hardware procedure to replace the defective component. This
may involve powering down the system, depending on whether or not a
component (such as a power supply) is hot-swappable. See the manual
for guidelines for specific components.
3. System administrator re-introduces the node to the cluster and performs
any necessary software tasks.
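The ordering of the three steps above is the essential point: hardware
is touched only after the node has left the cluster. The Python sketch
below models that ordering as a simple checklist and is illustrative
only. All names (NodeState, ServiceOrderError, NodeServiceChecklist) are
hypothetical, and it runs no cluster commands; the real procedure is the
one in the service manual.

```python
from enum import Enum


class NodeState(Enum):
    IN_CLUSTER = "in cluster"
    OUT_OF_CLUSTER = "out of cluster"
    SERVICED = "serviced, still out of cluster"


class ServiceOrderError(Exception):
    """Raised when a step is attempted out of order."""


class NodeServiceChecklist:
    """Tracks the three-step procedure and rejects out-of-order steps."""

    def __init__(self):
        self.state = NodeState.IN_CLUSTER
        self.completed = []

    def remove_from_cluster(self):
        # Step 1: data services switched over, node removed from cluster.
        if self.state is not NodeState.IN_CLUSTER:
            raise ServiceOrderError("node is already out of the cluster")
        self.state = NodeState.OUT_OF_CLUSTER
        self.completed.append("node removed from cluster")

    def replace_component(self, component):
        # Step 2: hardware work is allowed only once the node is out.
        if self.state is NodeState.IN_CLUSTER:
            raise ServiceOrderError(
                "remove the node from the cluster before replacing " + component
            )
        self.state = NodeState.SERVICED
        self.completed.append("replaced " + component)

    def rejoin_cluster(self):
        # Step 3: node re-introduced after service is complete.
        if self.state is not NodeState.SERVICED:
            raise ServiceOrderError("no completed service to follow up")
        self.state = NodeState.IN_CLUSTER
        self.completed.append("node re-introduced to cluster")


if __name__ == "__main__":
    checklist = NodeServiceChecklist()
    checklist.remove_from_cluster()
    checklist.replace_component("power supply")
    checklist.rejoin_cluster()
    print(checklist.completed)
```

Calling replace_component() on a fresh checklist raises
ServiceOrderError, which mirrors the failure case described in this FIN:
a power supply inserted while the node is still attached to the cluster.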
Comments:
None
============================================================================
Implementation Footnote:
i) In case of MANDATORY FINs, Enterprise Services will attempt to
contact all affected customers to recommend implementation of
the FIN.
ii) For CONTROLLED PROACTIVE FINs, Enterprise Services mission critical
support teams will recommend implementation of the FIN (to their
respective accounts), at the convenience of the customer.
iii) For REACTIVE FINs, Enterprise Services will implement the FIN as
the need arises.
----------------------------------------------------------------------------
All released FINs and FCOs can be accessed using your favorite network
browser as follows:
SunWeb Access:
--------------
* Access the top level URL of http://sdpsweb.ebay/FIN_FCO/
* From there, select the appropriate link to query or browse the FIN and
FCO Homepage collections.
SunSolve Online Access:
-----------------------
* Access the SunSolve Online URL at http://sunsolve.Corp/
* From there, select the appropriate link to browse the FIN or FCO index.
Supporting Documents:
---------------------
* Supporting documents for FIN/FCOs can be found on Edist. Edist can be
accessed internally at the following URL: http://edist.corp/.
* From there, follow the hyperlink path of "Enterprise Services
Documentation" and click on "FIN & FCO attachments", then choose the
appropriate folder, FIN or FCO. This will display supporting
directories/files for FINs or FCOs.
Internet Access:
----------------
* Access the top level URL of https://infoserver.Sun.COM
--------------------------------------------------------------------------
General:
--------
* Send questions or comments to [email protected]
--------------------------------------------------------------------------