Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-73-1000668.1
Update Date: 2010-08-24
Keywords:

Solution Type  FAB (standard) Sure

Solution  1000668.1 :   The thermcal utility should be run if a Centerplane or Centerplane Support Board (CSB) is replaced on the Sun Fire 12K/15K/E20K/E25K.  


Related Items
  • Sun Fire E25K Server
  • Sun Fire E20K Server
  • Sun Fire 12K Server
  • Sun Fire 15K Server
Related Categories
  • GCS>Sun Microsystems>Sun FAB>Standard>Reactive

PreviouslyPublishedAs
200880


Product
Sun Fire 12K Server
Sun Fire E20K Server
Sun Fire 15K Server
Sun Fire E25K Server

Bug Id
<SUNBUG: 4959202> - SMS should improve ASIC temperature calculation precision.
<SUNBUG: 4959175> - Thermcal does not indicate SMS needs to be restarted.

Part
  • Part No: 501-4936-XX
  • Part Description: Sun Fireplane Interconnect
Part
  • Part No: 501-5378-XX
  • Part Description: ASSY ECB MECH CP SUPRT STARCAT

Impact

When a Centerplane or Centerplane Support Board (CSB) is replaced in a Sun Fire 12K/15K/E20K/E25K Server, the thermcal utility must be run after the replacement. If it is not run, ESMD may report over- or under-temperature conditions for the ASICs on the affected logical half of the Centerplane.

An additional impact is that the logical half of the Centerplane (CP) must remain powered off for 30 minutes. Because replacing a CSB already requires the logical CP to be powered off, that half will be down in any case. For a Centerplane replacement, thermcal must be run on both logical halves. After the thermcal operation is complete, SMS on the Main SC must be stopped and restarted.
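
The scope rules above can be summarised in a short Python sketch. It is illustrative only and is not part of SMS. The component names and the CS0/CP0 and CS1/CP1 pairing are assumptions inferred from the sample output in this bulletin, and the function names are hypothetical. The bulletin does not specify the exact ordering of steps within a half, so the sketch returns a summary rather than a sequence; the thermcal instructions remain authoritative.

```python
# Power-off interval stated in this bulletin for the affected logical half.
POWER_OFF_MINUTES = 30

# Assumed pairing of CSB to logical Centerplane half, inferred from the
# sample output in this bulletin (replacing CS1 shifted the CP1 readings).
CSB_TO_HALF = {"CS0": "CP0", "CS1": "CP1"}


def halves_to_calibrate(replaced):
    """Return the logical CP halves that need a thermcal run."""
    if replaced == "centerplane":
        # A Centerplane replacement affects both logical halves.
        return ["CP0", "CP1"]
    if replaced in CSB_TO_HALF:
        # A CSB replacement affects only the half paired with that CSB.
        return [CSB_TO_HALF[replaced]]
    raise ValueError(f"unknown component: {replaced!r}")


def plan(replaced):
    """Summarise the follow-up work after a replacement (hypothetical helper)."""
    return {
        "thermcal_halves": halves_to_calibrate(replaced),
        "power_off_minutes_per_half": POWER_OFF_MINUTES,
        "restart_sms_on_main_sc": True,
    }


if __name__ == "__main__":
    for component in ("CS1", "centerplane"):
        print(component, plan(component))
```

Calling plan("CS1") lists a single half, while plan("centerplane") lists both; in either case the SMS restart on the Main SC is flagged as required.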

Any new combination of CSB and logical CP half may cause this issue, and thermcal must then be run. Below is a sample of ASIC temperatures recorded before and after a specific CSB replacement. Note the differences for the same ASIC locations: the CP1 readings are roughly 16 to 21 C higher after the replacement, while the CP0 readings change by about 2 C or less.

Before replacing CS1:

sms-svc> showenvironment -p temps | grep CP

SCPER at SCPER1 max1617a AMB 0 Temp 24.00 C 13.0 sec OK
SCPER at SCPER1 max1617a AMB 1 Temp 24.00 C 13.0 sec OK
SCPER at SCPER1 max1617a AMB 2 Temp 24.00 C 13.0 sec OK
CP at CP0 dmx0 DMX0 Temp 29.87 C 12.8 sec OK
CP at CP0 dmx1 DMX1 Temp 29.83 C 12.8 sec OK
CP at CP0 dmx3 DMX3 Temp 31.92 C 12.8 sec OK
CP at CP0 dmx5 DMX5 Temp 29.95 C 12.8 sec OK
CP at CP0 amx0 AMX0 Temp 31.74 C 12.8 sec OK
CP at CP0 amx1 AMX1 Temp 31.62 C 12.8 sec OK
CP at CP0 rmx RMX Temp 29.78 C 12.8 sec OK
CP at CP0 darb DARB Temp 30.01 C 12.8 sec OK
CP at CP1 dmx0 DMX0 Temp 30.12 C 12.6 sec OK
CP at CP1 dmx1 DMX1 Temp 32.26 C 12.6 sec OK
CP at CP1 dmx3 DMX3 Temp 30.08 C 12.6 sec OK
CP at CP1 dmx5 DMX5 Temp 30.21 C 12.6 sec OK
CP at CP1 amx0 AMX0 Temp 29.87 C 12.6 sec OK
CP at CP1 amx1 AMX1 Temp 31.81 C 12.6 sec OK
CP at CP1 rmx RMX Temp 31.95 C 12.6 sec OK
CP at CP1 darb DARB Temp 30.05 C 12.6 sec OK

After replacing CS1 without running thermcal:

sms-svc> showenvironment -p temps | grep CP

SCPER at SCPER1 max1617a AMB 0 Temp 24.00 C 22.4 sec OK
SCPER at SCPER1 max1617a AMB 1 Temp 24.00 C 22.4 sec OK
SCPER at SCPER1 max1617a AMB 2 Temp 24.00 C 22.4 sec OK
CP at CP0 dmx0 DMX0 Temp 27.91 C 22.3 sec OK
CP at CP0 dmx1 DMX1 Temp 29.83 C 22.3 sec OK
CP at CP0 dmx3 DMX3 Temp 29.93 C 22.3 sec OK
CP at CP0 dmx5 DMX5 Temp 27.96 C 22.3 sec OK
CP at CP0 amx0 AMX0 Temp 29.79 C 22.3 sec OK
CP at CP0 amx1 AMX1 Temp 29.70 C 22.3 sec OK
CP at CP0 rmx RMX Temp 29.78 C 22.3 sec OK
CP at CP0 darb DARB Temp 28.00 C 22.3 sec OK
CP at CP1 dmx0 DMX0 Temp 50.92 C 22.1 sec OK
CP at CP1 dmx1 DMX1 Temp 49.09 C 22.1 sec OK
CP at CP1 dmx3 DMX3 Temp 46.55 C 22.1 sec OK
CP at CP1 dmx5 DMX5 Temp 49.18 C 22.1 sec OK
CP at CP1 amx0 AMX0 Temp 47.80 C 22.1 sec OK
CP at CP1 amx1 AMX1 Temp 49.74 C 22.1 sec OK
CP at CP1 rmx RMX Temp 52.23 C 22.1 sec OK
CP at CP1 darb DARB Temp 48.55 C 22.1 sec OK
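
The comparison between the two samples above can be sketched in Python as follows. This is a minimal illustration, not an SMS tool: the regular expression is written against the line layout shown in this bulletin, the 10 C cutoff is an arbitrary assumption and not an ESMD limit, and only a few of the sample lines are embedded to keep the example short.

```python
import re

# Matches lines such as: "CP at CP1 dmx0 DMX0 Temp 50.92 C 22.1 sec OK"
LINE_RE = re.compile(r"^CP\s+at\s+(CP\d)\s+\S+\s+(\S+)\s+Temp\s+(\d+\.\d+)\s+C")


def parse_temps(text):
    """Map (logical half, ASIC name) to the reported temperature in C."""
    temps = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            half, asic, value = match.groups()
            temps[(half, asic)] = float(value)
    return temps


def shifted_readings(before, after, threshold_c=10.0):
    """Return the change for readings that moved by more than threshold_c.

    threshold_c is an assumed cutoff chosen for this example only.
    """
    return {
        key: round(after[key] - before[key], 2)
        for key in before
        if key in after and abs(after[key] - before[key]) > threshold_c
    }


# Excerpts from the "before" and "after" samples in this bulletin.
BEFORE = """\
CP at CP0 dmx0 DMX0 Temp 29.87 C 12.8 sec OK
CP at CP1 dmx0 DMX0 Temp 30.12 C 12.6 sec OK
CP at CP1 rmx RMX Temp 31.95 C 12.6 sec OK
"""
AFTER = """\
CP at CP0 dmx0 DMX0 Temp 27.91 C 22.3 sec OK
CP at CP1 dmx0 DMX0 Temp 50.92 C 22.1 sec OK
CP at CP1 rmx RMX Temp 52.23 C 22.1 sec OK
"""

if __name__ == "__main__":
    deltas = shifted_readings(parse_temps(BEFORE), parse_temps(AFTER))
    for (half, asic), delta in sorted(deltas.items()):
        print(f"{half} {asic}: {delta:+.2f} C")
```

With these excerpts, only the CP1 entries exceed the assumed cutoff, which matches the half paired with the replaced CSB.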

For the Environment Sensor Monitoring Daemon (ESMD) to monitor the CP ASIC temperatures correctly, thermcal must be run to profile each set of ASICs on each logical half of the CP. This generates seed values that ESMD uses to calculate the ASIC temperatures. Although the ASICs reside on the CP, the circuit used to measure this profile resides on the CSB. In effect, the two form a unique pair, and the generated values are valid only for that pair of CSB and logical CP. If either the CP or the CSB is replaced, thermcal must be rerun for the new pair.
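
The pairing rule described above can be modelled with a small Python sketch. It is purely illustrative: the bulletin does not describe how SMS stores the seed values, so the CalibrationRecord type, its fields, and the serial numbers below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CalibrationRecord:
    """Hypothetical record of the hardware pair a thermcal run profiled."""
    csb_serial: str
    cp_serial: str
    logical_half: str


def calibration_is_valid(record, installed_csb, installed_cp):
    """Seed values apply only to the exact CSB and CP pair they were generated on."""
    return record.csb_serial == installed_csb and record.cp_serial == installed_cp


if __name__ == "__main__":
    # Made-up serial numbers, for illustration only.
    record = CalibrationRecord(csb_serial="CSB-A", cp_serial="CP-X", logical_half="CP1")
    print(calibration_is_valid(record, "CSB-A", "CP-X"))  # same pair
    print(calibration_is_valid(record, "CSB-B", "CP-X"))  # CSB replaced
    print(calibration_is_valid(record, "CSB-A", "CP-Y"))  # Centerplane replaced
```

The check fails as soon as either member of the pair changes, which is the condition under which thermcal needs to be rerun.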


Symptoms


Resolution

Run the "thermcal" utility following any Centerplane or CSB replacement on a Sun Fire 12K/15K/E20K/E25K Server. Special instructions will be issued when an authorization code is requested for a Centerplane or CSB replacement.

As of SMS 1.5, thermcal and its man page, thermcal(1M), are bundled with SMS. The executable is /opt/SUNWSMS/bin/thermcal. For earlier versions of SMS, a thermcal binary and usage instructions are available at the URL below:

http://pts-platform/twiki/bin/view/Products/SCthermcal
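
As a quick pre-check, the presence of the bundled executable can be confirmed with a short Python sketch like the one below. It assumes a Python interpreter is available on the system where the check is run, which this bulletin does not state, and it only tests that the file exists and is executable. It does not invoke thermcal; the options for running it are documented in thermcal(1M).

```python
import os

# Location given in this bulletin for the bundled executable (SMS 1.5 and later).
THERMCAL_PATH = "/opt/SUNWSMS/bin/thermcal"


def thermcal_available(path=THERMCAL_PATH):
    """Return True if an executable regular file exists at path."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


if __name__ == "__main__":
    if thermcal_available():
        print(f"thermcal found at {THERMCAL_PATH}")
    else:
        print("thermcal not found at the bundled location; "
              "earlier SMS versions need the separately supplied binary")
```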


Modification History
Date: 15-NOV-2005
  • Added requirement for running thermcal when Centerplane is changed. 
  • Thermcal now bundled in SMS 1.5

Date: 18-JUN-2004
  • Changed the website link from: http://pts-americas/esq/hsq/starcat/tools/thermcal.html to http://pts-americas.west/esq/hsq/starcat/tools/thermcal.html in both the reference and corrective action sections.

 



Previously Published As
100597
Internal Comments


None.


Related Information
  • URL: http://pts-platform/twiki/bin/view/Products/SCthermcal

Internal Contributor/submitter
[email protected]

Internal Eng Business Unit Group
KE Authors

Internal Eng Responsible Engineer
[email protected]

Internal Services Knowledge Engineer
[email protected]

Internal Kasp FAB Legacy ID
100597, I1071-1 (FIN)

Internal Sun Alert & FAB Admin Info
Critical Category:
Significant Change Date:
Avoidance: Patch
Responsible Manager: null
Original Admin Info: [WF 11-15-2005 shassall: added requirement for running thermcal when Centerplane is changed. Thermcal now bundled in SMS 1.5 New website: http://pts-platform/twiki/bin/view/Products/SCthermcal]

Internal SA-FAB Eng Submission
The thermcal utility should be run if a Centerplane or Centerplane Support Board (CSB) is replaced on the Sun Fire 12K/15K/E20K/E25K.

Product_uuid
077fd4c5-df8f-4320-ad69-7d01603a674d|Sun Fire 12K Server
1404a2d3-059a-11d8-84cb-080020a9ed93|Sun Fire E20K Server
29e4659c-0a18-11d6-9fa1-e67bbc033df8|Sun Fire 15K Server
d842dd03-059b-11d8-84cb-080020a9ed93|Sun Fire E25K Server

Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.