Asset ID: 1-75-1013119.1
Update Date: 2011-03-24
Solution Type: Troubleshooting Sure
Solution 1013119.1: Troubleshooting temperature warnings on multiple components within a Sun Fire [TM] Serengeti or LightWeight8 system
Related Items
- Sun Fire E6900 Server
- Sun Fire 3800 Server
- Sun Fire 6800 Server
- Sun Fire E4900 Server
- Sun Netra 1280 Server
- Sun Fire 4800 Server
- Sun Fire E2900 Server
- Sun Fire V1280 Server
- Sun Fire 4810 Server
- Sun Netra 1290 Server
Related Categories
- GCS>Sun Microsystems>Servers>Midrange V and Netra Servers
- GCS>Sun Microsystems>Servers>Midrange Servers
Previously Published As 217971
Applies to:
- Sun Fire 3800 Server
- Sun Fire 4800 Server
- Sun Fire 4810 Server
- Sun Fire 6800 Server
- Sun Fire E2900 Server
All Platforms
Purpose
Description
This document addresses temperature warnings or errors on multiple components (board, repeater, etc.) in Sun Fire [TM] 3800, 4800, 4810, E4900, 6800, E6900 and Sun Fire [TM] V1280, E2900, and Netra [TM] 1280, 1290 systems. It covers situations where the system is powered on, but warnings or errors related to temperature exist in the configuration. If only a single device in the chassis is reporting temperature warnings, see <Document:1010052.1> for troubleshooting information related to that specific issue.
Symptoms:
- System components report unusually high or low temperature warnings or messages.
- Fan Tray(s) may be marked Failed in showenvironment output from the System Controller.
- Domain(s) may be unable to be powered on, "setkeyswitched on", or booted; alternatively, Domain functions may be completely unaffected by the errors, warnings, or Fan Tray status.
- The System Controller (SC) should be accessible.
- Other systems in the same area or data center may also be reporting temperature errors, messages, or issues.
Last Review Date
March 24, 2011
Instructions for the Reader
A Troubleshooting Guide is provided to assist
in debugging a specific issue. When possible, diagnostic tools are included in the document
to assist in troubleshooting.
Troubleshooting Details
Steps to Follow
Please validate that each troubleshooting step below is true for your environment. Each step provides instructions, or a link to a document, for validating the step and taking corrective action as necessary. The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution. Please do not skip a step.
1. Verify this is the only server having high temperature warnings in the environment (data center, rack, etc.).
- Log into neighboring systems in the same rack or general area
and look at temperature readings to determine whether they are also
showing elevated temperatures.
- If multiple systems are
showing elevated temperatures, the issue is specific to the site
(data center) and the site administrator should investigate the
issue.
2. Verify that all Fan Trays are active and ok (not marked FAILED in showenvironment or showlogs).
- Confirm the status as shown in <Document:1011930.1> Sun Fire[TM] (3800-6800) System Controller Application (ScApp) How To's.
- If there is a failed Fan Tray then please follow the
instructions detailed in <Document:1008393.1> Troubleshooting Cooling Fan Failures on Sun Fire [TM] Serengeti or LightWeight8 Systems starting
after Step 3 of that document.
3. Verify that the temperatures of multiple components are high, as shown in showenvironment output, and record which components are implicated.
- Confirm the status as shown in <Document:1011930.1> Sun Fire[TM] (3800-6800) System Controller Application (ScApp) How To's.
- If there is only a single component with an elevated temperature, use <Document:1010052.1> Troubleshooting temperature warnings on an individual component within a Sun Fire [TM] Serengeti or LightWeight8 system and start on Step 4 of that document.
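The decision in Step 3 can be sketched in a few lines of Python. This is only an illustration, assuming the readings have already been transcribed by hand from showenvironment output into (component, status) pairs; the status labels, the component names, and the function names are hypothetical and do not reflect the exact wording printed by the System Controller.

```python
# Hypothetical status labels; adjust to whatever your showenvironment output prints.
ABNORMAL_STATUSES = {"warning", "failed", "over-limit"}


def implicated_components(readings):
    """Return distinct component names that have at least one abnormal reading."""
    seen = []
    for component, status in readings:
        if status.lower() in ABNORMAL_STATUSES and component not in seen:
            seen.append(component)
    return seen


def next_document(readings):
    """Pick the troubleshooting document that matches the number of components."""
    components = implicated_components(readings)
    if not components:
        return None          # nothing abnormal was recorded
    if len(components) == 1:
        return "1010052.1"   # single component: individual-component document
    return "1013119.1"       # multiple components: this document


# Illustrative, made-up readings (not real SC output).
readings = [
    ("board-A", "ok"),
    ("board-B", "warning"),
    ("repeater-C", "failed"),
]
print(implicated_components(readings))
print(next_document(readings))
```

Keeping the list of implicated components, rather than only the count, gives you the record that Step 4 asks for.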
4. Verify that the components in error are not physically located together in the chassis.
- Use each system's pictures in the Sun System Handbook to determine where these components are physically located.
- In particular, determine whether all of the implicated components are physically located on the same side of the chassis (left or right, front or back).
- If there is a clear indication that they are grouped together on a particular side of the chassis, visually inspect the area around that side of the chassis. Try to determine whether there is an obstruction, dirty filters, or some other external blockage associated with the "hot spot".
Be aware of how the system is designed to cool itself when performing a visual inspection of it:
- All of these systems utilize front to back cooling.
- For 3800, 48x0, 6800, E4900, and E6900, the Systems Site Planning Guide, Chapter 3.4.1, provides diagrams showing airflow. Two notes are key:
  - Any systems mounted in a rack must have front to back cooling (no side to side).
  - The front of the cabinet should not be facing, nor be in the path of, the exhaust air from any other systems or cabinets.
- The E2900, V1280, Netra 1280, and Netra 1290 also use front to back cooling, which is documented in their Systems Site Planning Guide. A diagram is not available for these chassis, however.
- Be aware of <Document:1021703.1>
  - The Alert details that for E2900 and V1280 the left air filter should be removed, and that on Netra 1280 and 1290 the air filters should be inspected and/or replaced every 6 months.
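The grouping check in Step 4 can be sketched as follows. This is a minimal illustration, assuming you have built the location map yourself from the pictures in the Sun System Handbook; the component names and positions below are made-up placeholders, not the real layout of any chassis, and `shared_positions` is a hypothetical helper name.

```python
def shared_positions(implicated, locations):
    """Return the position labels (e.g. 'left', 'rear') common to every
    implicated component. An empty list means no common side was found.

    Raises KeyError if a component is missing from the location map, so
    gaps in the map are noticed rather than silently ignored.
    """
    common = None
    for name in implicated:
        labels = set(locations[name])
        common = labels if common is None else common & labels
    return sorted(common or [])


# Placeholder map: fill this in from the Sun System Handbook for your chassis.
locations = {
    "board-A": ("left", "rear"),
    "board-B": ("left", "front"),
    "repeater-C": ("right", "front"),
}

print(shared_positions(["board-A", "board-B"], locations))
print(shared_positions(["board-A", "board-B", "repeater-C"], locations))
```

A non-empty result points to the side of the chassis to inspect first for obstructions or dirty filters. An empty result suggests the warnings are not tied to one physical area.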
5. Confirm the errors persist on the same components when the other SC is main.
- If the errors cease once the other SC is main, then the former main SC is suspect and should be replaced.
- SC failover reference: <Document:1003245.1> Sun Fire[TM] 3800-6900: System Controller failover functionality
6. Collect the following data and collaborate with the next level of support.
- Preferably, run Explorer with the appropriate scextended or 1280extended option, as detailed in <Document:1018748.1> How to Run Sun[TM] Explorer and Forward the Data to a Sun Engineer
- If Explorer data cannot be collected for any reason, see <Document:1003529.1> Procedure to manually collect Sun Fire[TM] Midrange System Controller level failure data
Internal Comments
At this point, if the customer has validated that each troubleshooting step above is true for their
environment, and the issue still exists, escalate to your next level of technical support.
Previously Published As 91429
Attachments
This solution has no attachment