Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition
Solution Type: Predictive Self-Healing Sure
Solution 1392829.1: Netra CT900: How to read ShMM's debug.log file
When debugging Netra CT900 problems, especially chassis-related ones, the /tmp/debug.log file from the ShMM is a very useful resource for identifying the possible root cause (RC). This article shows how to read the ShMM's /tmp/debug.log file.
Applies to:
Sun Netra CT900 Server - Version: Not Applicable and later [Release: N/A and later]
Information in this document applies to any platform.

Purpose
For Netra CT900 problems, especially chassis-related ones, the /tmp/debug.log file from the ShMM is often required to investigate the possible RC. This article shows how to read through the /tmp/debug.log file and which points are of interest for certain types of problems.

Netra CT900: How to read ShMM's debug.log file

Data Collecting
If the /tmp/debug.log file does not exist on the "Active" ShMM, execute the /etc/summary script to generate it:

# ./summary
PING 10.133.104.1 (10.133.104.1): 56 data bytes
84 bytes from 10.133.104.1: icmp_seq=0 ttl=255 time=8.4 ms
--- 10.133.104.1 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 8.4/8.4/8.4 ms
send debug file /tmp/debug.log to PPS
#

Content of debug.log file
The /tmp/debug.log file is the ShMM's equivalent of Explorer files. Each section is separated by a header. Here is a list of the headers in a /tmp/debug.log file:
Type of Issues
There are several types of chassis problems for which the debug.log file can be used to locate the RC.

FAN TRAY/TEMPERATURE ISSUES
For fan tray/temperature related issues,
NOTE: The list of sensors can also be obtained with the following command from the ShMM:

# clia sensor <addr> | grep Sensor

where <addr> can be <IPMB address of blade>, "board <N>", or "shm <N>".

PEM (Power Entry Module) ISSUES
PEM related issues do not need the debug.log file, as most of the data needed is collected from the "[clia] sensordata" and "[clia] getfruledstate" commands.
POWER ISSUES
BLADE ISSUES
FIRMWARE UPGRADE ISSUES
NETWORK ISSUES
Network issues are out of the scope of this article.

Details of debug.log file

Network Interfaces
This is the "ifconfig -a" output of the ShMM. It shows several interfaces:
Shelfman Version
Pigeon Point Shelf Manager ver. 2.6.4-R3U3-RR
Pigeon Point and the stylized lighthouse logo are trademarks of Pigeon Point Systems.
Copyright (c) 2002-2008 Pigeon Point Systems
All rights reserved
Build date/time: Jan 11 2010 05:48:20
Carrier: ACB
Carrier subtype: 3; subversion: 0

Shelfman Status
"Active" or "Backup"

ShMC IPMB Address
Local IPMB Address = 0x10
0x10 is shm1 (top), and 0x12 is shm2 (bottom).

Board Information
Shows the type of blade and its Hot Swap State:

Physical Slot # 3
92: Entity: (0xa0, 0x60) Maximum FRU device ID: 0x02
    PICMG Version 2.2
    Hot Swap State: M4 (Active), Previous: M3 (Activation In progress), Last State Change Cause: Normal State Change (0x0)
92: FRU # 0
    Entity: (0xa0, 0x60)
    Hot Swap State: M4(Active), Previous: M3 (Activation In progress), Last State Change Cause: Normal State Change (0x0)
    Device ID String: "NetraCP-3060"
92: FRU # 2 (AMC # 1)
    Entity: (0xa0, 0x61)
    Hot Swap State: M4(Active), Previous: M3 (Activation In progress), Last State Change Cause: Normal State Change (0x0)
    Device ID String: "375-3470-01"

Here are the explanations of the Hot Swap States:
Fan List
Fan Tray state of the chassis:

20: FRU # 3
    Current Level: 5
    Minimum Speed Level: 0, Maximum Speed Level: 15
20: FRU # 4
    Current Level: 5
    Minimum Speed Level: 0, Maximum Speed Level: 15
20: FRU # 5
    Current Level: 5
    Minimum Speed Level: 0, Maximum Speed Level: 15

The fan speed level should be either 5 (min speed) or 15 (max speed).

Cooling State
Cooling state and list of all temperature sensors:

Cooling state: "Normal"
Sensor(s) at this state:
(0x12,2,0) (0x90,41,0) (0x90,40,0) (0x90,6,0) (0x94,42,0) (0x94,41,0) (0x94,40,0) (0x94,6,0) (0x82,53,0) (0x82,52,0) (0x82,45,0) (0x82,44,0) (0x82,37,0) (0x82,36,0) (0x82,20,0) (0x82,10,0) (0x92,31,0) (0x92,30,0) (0x92,7,0) (0x92,6,0) (0x92,5,0) (0x86,25,0) (0x86,24,0) (0x86,23,0) (0x86,22,0) (0x86,21,0) (0x86,20,0) (0x86,19,0) (0x96,31,0) (0x96,30,0) (0x96,29,0) (0x96,6,0) (0x96,5,0) (0x96,4,0) (0x90,42,0) (0x88,6,0) (0x86,26,0) (0x9c,4,0) (0x9c,3,0) (0x98,4,0) (0x98,3,0) (0x9c,5,0) (0x98,5,0) (0x20,120,0) (0x20,121,0) (0x20,122,0) (0x20,123,0) (0x20,124,0) (0x20,125,0) (0x20,126,0) (0x20,200,0) (0x20,201,0)

Cooling could also be at "Minor Alert", "Major Alert" or "Critical Alert":
The list has the following format: (IPMB Addr, Sensor #, LUN).

System Event Log
This is the output of "[clia] sel". Each event has the following format:

<ID>: Event at <D&T>; from:(IPMB, FRU, LUN); sensor:(<type>, <#>); event:<event type>: <Details>

And an example:

0x0010: Event: at Jan 10 17:01:11 2011; from:(0x84,0,0); sensor:(0x07,4); event:0x3(asserted): 0x00 0xFF 0xFF

This event, "from:(0x84,0,0); sensor:(0x07,4); event:0x3(asserted)", shows that it comes from slot 8 (0x84 is the IPMB address of slot 8), FRU 0 (the board itself), and that sensor #4 ("Hot Swap AMC #3") is asserted. More details are needed to fully decode the SEL; other output such as "[clia] sensor", "[clia] sensordata", "[clia] fruinfo", etc. might be necessary.

APPENDIX
A quick reference table to convert between IPMB address, slot number, and switch port number:

Physical | shm1 shm2 Shelf 01   02   03   04   05   06
Logical  | ---- ---- ----  13   11   09   07   05   03
BASE     | ---- ---- ----  0/13 0/11 0/09 0/07 0/05 0/03
EXTENDED | ---- ---- ----  0/12 0/10 0/08 0/06 0/04 0/02
IPMB Add | 10   12   20    9a   96   92   8e   8a   86
HW Add   | 08   09   10    4d   4b   49   47   45   43

Physical | 07   08   09   10   11   12   13   14
Logical  | 01   02   04   06   08   10   12   14
BASE     | ---- ---- 0/04 0/06 0/08 0/10 0/12 0/14
EXTENDED | ---- ---- 0/03 0/05 0/07 0/09 0/11 0/13
IPMB Add | 82   84   88   8c   90   94   98   9c
HW Add   | 41   42   44   46   48   4a   4c   4e

References
<NOTE:1346085.1> - Netra CT900 ShMM debug.log analysis

Attachments
This solution has no attachment
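
The SEL event format, the cooling-state sensor list, and the appendix table can be combined in a short script. The following is a minimal Python sketch and is not part of the original solution. The names parse_sel_line, describe_source, parse_sensor_list, and SelEvent are hypothetical. The regular expressions assume the exact line layout shown in the examples in this article, and sensor numbers are assumed to be decimal. The IPMB-to-slot mapping is copied from the appendix table.

```python
import re
from typing import List, NamedTuple, Optional, Tuple

# IPMB address -> physical slot number, copied from the appendix table.
IPMB_TO_PHYSICAL_SLOT = {
    0x9A: 1, 0x96: 2, 0x92: 3, 0x8E: 4, 0x8A: 5, 0x86: 6, 0x82: 7,
    0x84: 8, 0x88: 9, 0x8C: 10, 0x90: 11, 0x94: 12, 0x98: 13, 0x9C: 14,
}
# Non-blade IPMB addresses from the same table.
IPMB_TO_NAME = {0x10: "shm1", 0x12: "shm2", 0x20: "Shelf"}


class SelEvent(NamedTuple):
    """One decoded SEL line (hypothetical container, not a clia type)."""
    event_id: int
    timestamp: str
    ipmb: int
    fru: int
    lun: int
    sensor_type: int
    sensor_number: int
    event: str
    details: str


# Assumed layout, taken from the single example line in this article.
SEL_RE = re.compile(
    r"^(?P<id>0x[0-9a-fA-F]+):\s*Event:?\s*at\s+(?P<ts>[^;]+);\s*"
    r"from:\((?P<ipmb>0x[0-9a-fA-F]+),(?P<fru>\d+),(?P<lun>\d+)\);\s*"
    r"sensor:\((?P<stype>0x[0-9a-fA-F]+),(?P<snum>\d+)\);\s*"
    r"event:(?P<event>[^:]+):\s*(?P<details>.*)$"
)

# Matches one "(IPMB Addr, Sensor #, LUN)" tuple from the cooling state list.
SENSOR_TUPLE_RE = re.compile(r"\((0x[0-9a-fA-F]+),(\d+),(\d+)\)")


def parse_sel_line(line: str) -> Optional[SelEvent]:
    """Return the decoded event, or None if the line does not match."""
    m = SEL_RE.match(line.strip())
    if m is None:
        return None
    return SelEvent(
        event_id=int(m["id"], 16),
        timestamp=m["ts"].strip(),
        ipmb=int(m["ipmb"], 16),
        fru=int(m["fru"]),
        lun=int(m["lun"]),
        sensor_type=int(m["stype"], 16),
        sensor_number=int(m["snum"]),
        event=m["event"].strip(),
        details=m["details"].strip(),
    )


def describe_source(ipmb: int) -> str:
    """Translate an IPMB address to a slot or shelf component name."""
    if ipmb in IPMB_TO_PHYSICAL_SLOT:
        return f"slot {IPMB_TO_PHYSICAL_SLOT[ipmb]}"
    return IPMB_TO_NAME.get(ipmb, f"unknown (0x{ipmb:02x})")


def parse_sensor_list(text: str) -> List[Tuple[int, int, int]]:
    """Extract (IPMB address, sensor number, LUN) tuples from a sensor list."""
    return [(int(a, 16), int(n), int(lun))
            for a, n, lun in SENSOR_TUPLE_RE.findall(text)]


if __name__ == "__main__":
    example = ("0x0010: Event: at Jan 10 17:01:11 2011; from:(0x84,0,0); "
               "sensor:(0x07,4); event:0x3(asserted): 0x00 0xFF 0xFF")
    ev = parse_sel_line(example)
    if ev is not None:
        print(describe_source(ev.ipmb), "FRU", ev.fru,
              "sensor", ev.sensor_number, ev.event)
```

For the example event above, describe_source maps 0x84 to slot 8, which matches the manual decoding in the System Event Log section. Mapping the sensor number to a name such as "Hot Swap AMC #3" still requires the "[clia] sensor" output, which this sketch does not attempt to parse.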