Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition
Solution Type Technical Instruction Sure Solution 1011650.1 : Sun Enterprise[TM] 3X00-6X00 Servers: Board Temperature Information
Previously Published As 215972
Applies to:
Sun Enterprise 3000 Server
Sun Enterprise 3500 Server
Sun Enterprise 4000 Server
Sun Enterprise 4500 Server
Sun Enterprise 5000 Server
All Platforms

Goal

This document provides an optimal temperature specification for Sun Enterprise[TM] classic systems. It also describes how to tune the system environment to eliminate most known transient errors.

Solution

Sun Enterprise[TM] 3X00-6X00 servers are tested for operation in ambient temperatures ranging from 0 to 68 degrees centigrade (32 to 154 degrees Fahrenheit). Each CPU/memory module has a thermistor installed below each of the processor boards. The analog output from each thermistor is fed to an analog-to-digital converter, and the resulting value is placed in a system register for reading by software. The same implementation is used on the I/O boards and the clock board, so accurate temperature readings are maintained for all core system boards. The sampled temperature is used to drive the speed of the cooling fans enclosed in the 300-watt Power Cooling Modules (PCMs).

Note: A memory-only CPU/memory board does not provide any temperature data because no thermistors are installed for monitoring the temperatures of DIMMs. However, memory DIMMs do not generate a significant amount of heat, so system reliability is not adversely affected.

Software control is performed using a polling mechanism implemented in the Solaris[TM] Operating System (Solaris OS) that reads the temperature registers every 2 seconds. If the temperature reaches a "Yellow Zone" threshold, the system emits warnings as console messages. If the temperature reaches a "Red Zone" threshold, the system continues to run and repeats the warning. If the temperature for the affected component stays in the red zone for 20 seconds or longer, the system either powers down the component or powers itself down entirely, depending on the implementation level of the product.
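The polling behaviour described above can be sketched in a few lines of Python. This is an illustrative model only, not the Solaris OS implementation: the function names are hypothetical, the thresholds are the 60 and 68 degree Celsius values this document gives for the Yellow Zone and Red Zone, and "20 seconds or longer" is assumed to mean 20 seconds elapsed since the first consecutive red-zone sample at a 2-second polling interval.

```python
# Illustrative model only; names and timing interpretation are assumptions.
YELLOW_ZONE_C = 60    # "Yellow Zone" threshold stated in this document
RED_ZONE_C = 68       # "Red Zone" threshold stated in this document
POLL_INTERVAL_S = 2   # temperature registers are read every 2 seconds
RED_HOLD_S = 20       # time in the red zone before a power-down


def classify_zone(temp_c):
    """Map a board temperature in Celsius to 'ok', 'yellow', or 'red'."""
    if temp_c >= RED_ZONE_C:
        return "red"
    if temp_c >= YELLOW_ZONE_C:
        return "yellow"
    return "ok"


def replay(samples_c):
    """Return one action per polled sample: 'none', 'warn', or 'power_down'.

    Assumption: the red-zone timer starts at 0 on the first red sample,
    advances by POLL_INTERVAL_S for each consecutive red sample, and
    resets as soon as a sample falls below the red zone.
    """
    actions = []
    red_elapsed_s = None  # None means the board is not currently in the red zone
    for temp_c in samples_c:
        zone = classify_zone(temp_c)
        if zone == "red":
            if red_elapsed_s is None:
                red_elapsed_s = 0
            else:
                red_elapsed_s += POLL_INTERVAL_S
            actions.append("power_down" if red_elapsed_s >= RED_HOLD_S else "warn")
        else:
            red_elapsed_s = None
            actions.append("warn" if zone == "yellow" else "none")
    return actions
```

Under these assumptions, eleven consecutive red-zone samples (0 through 20 seconds) are needed before `replay` reports a power-down, and a single cooler sample resets the timer.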
Monitoring software sets the "Yellow Zone" at 60 degrees Celsius for CPU/memory boards, I/O boards, and the clock board. The "Red Zone" is set at 68 degrees Celsius on all boards.

Finding Nominal Temperatures of the System Boards:

A recent study of transient system errors shows that these errors could be greatly reduced by maintaining an environment that is optimal for the hardware.

1) Temperature and humidity auditing can provide you with data to achieve these optimal temperatures. An intake temperature of 70 Deg F (21.11 Deg C) and a relative humidity (RH) of 45% - 50% should bring a Sun Enterprise classic server into compliance with the optimal numbers.

2) The next step is to find the present nominal temperature. Obtain the output of a prtdiag -v command from the suspect system. The output should be current and obtained after at least 168 hours (7 days) of uptime. Add the temperatures of all the CPU/memory boards (because these boards contain the most temperature-critical components), and then divide the total by the number of CPU/memory boards to get an average. For example:
System Temperatures (Celsius)

Board  State  Current  Min  Max  Trend
0      OK     29       28   32   stable  <- temp shouldn't vary more than 5.5 Deg C

In the example, the nominal temperature of this system's system boards is 30.5C:

29+29+30+31+32+32 = 183
183/6 = 30.5

Specifications for Temperature Zones

Solaris OS 2.5.1 with patch 103640-33 or later

When the patches are installed, warning messages appear at 60 degrees C, and a power-down sequence of overheated CPU modules occurs at a new danger limit setting of 68 degrees C. These temperatures are lower than the standard default limits of 73 degrees C (for warning messages) and 83 degrees C (for a danger limit).

Board Type
---------------------------------------------------------------------------
CPU    24C - 60C    68C    28C-32C    30C

Note:
1) Severe temperature or relative humidity swings should be avoided.
2) A CPU temperature above 40 degrees centigrade might be within
*** Operating: 5 C to 35 C (41 F to 95 F) ***

If the room is generally in compliance and the system boards are running above 40 degrees C, it is possible that the machine is installed in a "hotspot." You should investigate and consider managing cooling around the machine to bring it into compliance.

Product
Solaris 2.5.1
Solaris 2.6 Operating System
Solaris 7 Operating System
Solaris 8 Operating System
Sun Enterprise 6500 Server
Sun Enterprise 6000 Server
Sun Enterprise 5500 Server
Sun Enterprise 5000 Server
Sun Enterprise 4500 Server
Sun Enterprise 4000 Server
Sun Enterprise 3500 Server
Sun Enterprise 3000 Server

Internal Comments
For internal Sun use only. Many DTAG ETAG UE and correctable errors are caused by environmental conditions,
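The nominal-temperature calculation and the two rules of thumb in this section (boards running above 40 degrees C, and a temperature that should not vary more than 5.5 Deg C) can be checked with a short Python sketch. The function names are hypothetical, the readings are typed in by hand from prtdiag -v output rather than parsed, and treating the 5.5 Deg C guideline as the spread between a board's Min and Max columns is an assumption. The six readings in the usage lines are illustrative values that total 183, matching the example's total and board count.

```python
# Illustrative helper; names and the Min/Max interpretation are assumptions.
HOTSPOT_C = 40       # boards above this may indicate a "hotspot"
MAX_SPREAD_C = 5.5   # guideline: temperature shouldn't vary more than this


def nominal_temperature(board_temps_c):
    """Average the current temperatures of the CPU/memory boards."""
    if not board_temps_c:
        raise ValueError("at least one board temperature is required")
    return sum(board_temps_c) / len(board_temps_c)


def review_board(current_c, min_c, max_c):
    """Return a list of findings for one row of the temperature table."""
    findings = []
    if current_c > HOTSPOT_C:
        findings.append("above 40 C: possible hotspot")
    # Assumption: "vary" is read as the Max minus Min spread for the board.
    if max_c - min_c > MAX_SPREAD_C:
        findings.append("min/max spread exceeds 5.5 C")
    return findings


# Six CPU/memory board readings totalling 183, as in the worked example.
print(nominal_temperature([29, 29, 30, 31, 32, 32]))  # 30.5
# Board 0 from the sample table: Current 29, Min 28, Max 32.
print(review_board(29, 28, 32))  # []
```

An empty findings list for a board means neither guideline was exceeded; it does not replace the room-level temperature and humidity audit described earlier.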
Attachments
This solution has no attachment