Asset ID: 1-75-1013119.1
Update Date: 2012-07-18
Solution Type: Troubleshooting Sure
Solution 1013119.1:
Sun Fire [TM] SF3800/SF4800/SF4810/SF6800 - E4900/E6900 - V1280/E2900 - Netra 1280/1290 : Troubleshooting temperature warnings on multiple components
Related Items
- Sun Fire E6900 Server
- Sun Fire 3800 Server
- Sun Fire 6800 Server
- Sun Fire E4900 Server
- Sun Netra 1280 Server
- Sun Fire 4800 Server
- Sun Fire V1280 Server
- Sun Fire E2900 Server
- Sun Fire 4810 Server
- Sun Netra 1290 Server
Related Categories
- PLA-Support>Sun Systems>SPARC>Enterprise>SN-SPARC: Exx00
- .Old GCS Categories>Sun Microsystems>Servers>Midrange Servers
- .Old GCS Categories>Sun Microsystems>Servers>Midrange V and Netra Servers
Previously Published As 217971
Applies to:
Sun Fire 3800 Server
Sun Fire 4800 Server
Sun Fire 4810 Server
Sun Fire 6800 Server
Sun Fire E2900 Server
All Platforms
Purpose
Description
This document addresses temperature warnings or errors on multiple components (board, repeater, etc.) in Sun Fire [TM] 3800, 4800, 4810, E4900, 6800, and E6900 systems, Sun Fire [TM] V1280 and E2900 systems, and Netra [TM] 1280 and 1290 systems.
This document covers situations where the system is powered on, but warnings or errors related to temperature exist in the configuration. If you have only a single device in the chassis reporting temperature warnings, see <Document:1010052.1> for troubleshooting information related to that specific issue.
Symptoms:
- System components report unusually high or low temperature warnings or messages.
- Fan Tray(s) may be marked Failed in showenvironment output from the System Controller.
- Domain(s) may fail to power on, fail to respond to "setkeyswitch on", or fail to boot; alternatively, domain functions may be completely unaffected by the errors, warnings, or Fan Tray status.
- The System Controller (SC) should be accessible.
- Other systems in the same area or data center may also be reporting temperature errors, messages, or issues.
Troubleshooting Steps
Steps to Follow
Please validate that each troubleshooting step below is true for your environment. Each step provides instructions, or a link to a document, for validating the step and taking corrective action as necessary. The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution. Please do not skip a step.
1. Verify this is the only server having high temperature warnings in the environment (data center, rack, etc.).
- Log into neighboring systems in the same rack or general area and look at temperature readings to determine whether they are also showing elevated temperatures.
- If multiple systems are showing elevated temperatures, the issue is specific to the site (data center) and the site administrator should investigate the issue.
2. Verify that all Fan Trays are active and OK (not marked FAILED in showenvironment or showlogs output).
- Confirm the status as shown in <Document:1011930.1> Sun Fire[TM] (3800-6800) System Controller Application (ScApp) How To's.
- If there is a failed Fan Tray, follow the instructions in <Document:1008393.1> Troubleshooting Cooling Fan Failures on Sun Fire [TM] Serengeti or LightWeight8 Systems, starting after Step 3 of that document.
3. Verify that the temperatures of multiple components are high, as shown in showenvironment output, and record which components are implicated.
- Confirm the status as shown in <Document:1011930.1> Sun Fire[TM] (3800-6800) System Controller Application (ScApp) How To's.
- If only a single component has an elevated temperature, use <Document:1010052.1> Troubleshooting temperature warnings on an individual component within a Sun Fire [TM] Serengeti or LightWeight8 system, and start at Step 4 of that document.
4. Verify that the components in error are not physically located together in the chassis.
- Use each system's pictures in the Sun System Handbook to determine where these components are physically located.
- In particular, determine whether all of the implicated components are physically located on the same side of the chassis (left or right, front or back).
- If there is a clear indication that they are grouped together on a particular side of the chassis, visually inspect the area around that side of the chassis. Try to determine whether there is an obstruction, dirty filters, or some other external blockage associated with the "hot spot".
Be aware of how the system is designed to cool itself when performing a visual inspection of it:
- All of these systems utilize Front to Back cooling.
- For 3800, 48x0, 6800, E4900, and E6900 the Systems Site Planning Guide, Chapter 3.4.1 provides diagrams showing airflow. Two notes are key:
- Any systems mounted in a rack must have front to back cooling (no side to side).
- The front of the cabinet should not be facing, nor be in the path of the exhaust air from any other systems or cabinets.
- The E2900, V1280, Netra 1280, and Netra 1290 also use front to back cooling, which is documented in their Systems Site Planning Guide. A diagram is not available for these chassis, however.
- Be aware of <Document:1021703.1>
- The Alert states that on the E2900 and V1280 the left air filter should be removed, and that on the Netra 1280 and 1290 the air filters should be inspected and/or replaced every 6 months.
5. Confirm that the errors persist on the same components when the other SC is main.
- If the errors cease once the other SC is main, the former main SC is suspect and should be replaced.
- SC failover reference: <Document:1003245.1> Sun Fire[TM] 3800-6900: System Controller failover functionality
6. Collect the following data and collaborate with the next level of support.
- Preferably, collect Explorer data with the appropriate scextended or 1280extended option, as detailed in <Document:1019066.1> How to collect scextended or 1280extended Explorer.
- If Explorer data cannot be collected for whatever reason, see <Document:1003529.1> Procedure to manually collect Sun Fire[TM] Midrange System Controller level failure data.
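The decision flow of Steps 1 through 6 can be summarized in a short sketch. This is only an illustration of the ordering of the checks, not a Sun tool: the Reading record, its field names, and the triage function are hypothetical, and the readings are assumed to be transcribed by hand from showenvironment output and the Sun System Handbook pictures rather than parsed from any real output format. It also simplifies Step 4 by returning the first applicable action; in practice, continue to Step 5 if the inspection finds nothing.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One temperature reading, transcribed by hand (hypothetical record)."""
    component: str   # illustrative label for the board, repeater, etc.
    side: str        # chassis side taken from the handbook pictures, e.g. "left"
    elevated: bool   # True if the reading carries a warning or error


def triage(readings, neighbors_elevated, fan_tray_failed, persists_after_failover):
    """Return the next action suggested by Steps 1 through 6 (simplified)."""
    hot = [r for r in readings if r.elevated]
    if not hot:
        return "no elevated readings recorded"
    if neighbors_elevated:                       # Step 1: site-wide problem
        return "site issue: refer to the site administrator"
    if fan_tray_failed:                          # Step 2: failed Fan Tray
        return "follow Document 1008393.1 (cooling fan failures)"
    if len(hot) == 1:                            # Step 3: single component
        return "follow Document 1010052.1 (single component)"
    sides = {r.side for r in hot}                # Step 4: look for a hot spot
    if len(sides) == 1:
        side = next(iter(sides))
        return f"inspect the {side} side of the chassis for blockage or dirty filters"
    if not persists_after_failover:              # Step 5: errors follow the SC
        return "former main SC is suspect: replace it"
    return "collect Explorer data and escalate"  # Step 6


if __name__ == "__main__":
    sample = [
        Reading("board A", "left", True),
        Reading("board B", "left", True),
        Reading("repeater A", "right", False),
    ]
    print(triage(sample, neighbors_elevated=False,
                 fan_tray_failed=False, persists_after_failover=True))
```

With the sample data, both elevated readings sit on the left side, so the sketch suggests inspecting that side of the chassis before going any further.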
Internal Comments
At this point, if the customer has validated that each troubleshooting step above is true for their environment, and the issue still exists, escalate to your next level of technical support.
Previously Published As 91429
References
<NOTE:1003245.1> - Sun Fire[TM] 3800-6900: System Controller failover functionality
<NOTE:1008393.1> - Sun Fire [TM] SF3800/SF4800/SF4810/SF6800 - E4900/E6900 - V1280/E2900 - Netra 1280/1290 : Troubleshooting Cooling Fan Failures
<NOTE:1010052.1> - Troubleshooting temperature warnings on an individual component within a Sun Fire [TM] Serengeti or LightWeight8 system
<NOTE:1011930.1> - Sun Fire[TM] 3800, 4800, 4810, 6800, E4900, and E6900 System Controller Application (ScApp) How To's.
<NOTE:1021703.1> - Potential for System Outages Due to Cooling Issues on Sun Fire V1280/E2900, Netra 1280/1290
<NOTE:1019066.1> - Sun Fire[TM] v1280, 3800, 4800, 4810, 6800, E2900, E4900, E6900 and Netra[TM] 1280, 2900 servers: How to collect scextended or 1280extended Explorer
Attachments
This solution has no attachment