Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-71-1375921.1
Update Date:2012-01-30
Keywords:

Solution Type  Technical Instruction Sure

Solution  1375921.1 :   How To Check The Processor C-States From The O/S For Memory Correctable ECC Faults  


Related Items
  • Sun Fire X4270 Server
  • Sun Fire X4170 Server
  • Sun Fire X4170 M2 Server
  • Exadata Database Machine X2-2 Qtr Rack
  • Sun Fire X4270 M2 Server
Related Categories
  • PLA-Support>Sun Systems>x64>Server>SN-x64: MISC-SERVER
  • .Old GCS Categories>Sun Microsystems>Specialized Systems>Database Systems




Created from <SR 3-4768793216>

Applies to:

Exadata Database Machine X2-2 Qtr Rack - Version: Not Applicable to Not Applicable - Release: N/A to N/A
Sun Fire X4170 M2 Server - Version: Not Applicable to Not Applicable   [Release: N/A to N/A]
Sun Fire X4270 M2 Server - Version: Not Applicable and later    [Release: N/A and later]
Sun Fire X4170 Server - Version: Not Applicable to Not Applicable   [Release: N/A to N/A]
Sun Fire X4270 Server - Version: Not Applicable to Not Applicable   [Release: N/A to N/A]
Information in this document applies to any platform.

Goal

=== ODM Question ===
 
Question  from customer


Solution

=== ODM Answer ===

Q. Is there any way to check whether C-states are disabled without entering the BIOS?
A. Yes, you can check the C-states from the operating system.
     You should be able to see the C-states using dmidecode, or you can check them in the following file, where CPU# is the processor directory (for example CPU9):

     # cat /proc/acpi/processor/CPU#/power
        The "active state" field shows the C-state currently in use on that processor.

Example:
[root@sunfcel04 processor]# ls -ltr
total 0
dr-xr-xr-x 2 root root 0 Nov 10 20:11 CPUF
dr-xr-xr-x 2 root root 0 Nov 10 20:11 CPUE
dr-xr-xr-x 2 root root 0 Nov 10 20:11 CPUD
dr-xr-xr-x 2 root root 0 Nov 10 20:11 CPUC
dr-xr-xr-x 2 root root 0 Nov 10 20:11 CPUB
dr-xr-xr-x 2 root root 0 Nov 10 20:11 CPUA
dr-xr-xr-x 2 root root 0 Nov 10 20:11 CPU9
dr-xr-xr-x 2 root root 0 Nov 10 20:11 CPU8
dr-xr-xr-x 2 root root 0 Nov 10 20:11 CPU7
dr-xr-xr-x 2 root root 0 Nov 10 20:11 CPU6
dr-xr-xr-x 2 root root 0 Nov 10 20:11 CPU5
dr-xr-xr-x 2 root root 0 Nov 10 20:11 CPU4
dr-xr-xr-x 2 root root 0 Nov 10 20:11 CPU3
dr-xr-xr-x 2 root root 0 Nov 10 20:11 CPU2
dr-xr-xr-x 2 root root 0 Nov 10 20:11 CPU1
dr-xr-xr-x 2 root root 0 Nov 10 20:11 CPU0

[root@sunfcel04 processor]# cd CPU9
/proc/acpi/processor/CPU9

[root@sunfcel04 CPU9]# ls
info limit power throttling

[root@sunfcel04 CPU9]# more power
active state: C1  <<<------------C-state
max_cstate: C1
bus master activity: 00000000
states:
*C1: type[C1] promotion[C2] demotion[--] latency[001] usage[1758805894] duration[00000000000000000000]
C2: type[C2] promotion[C3] demotion[C1] latency[017] usage[00000000] duration[00000000000000000000]
C3: type[C3] promotion[--] demotion[C2] latency[017] usage[00000000] duration[00000000000000000000]
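
Checking every CPU directory by hand is tedious on a 16-thread node, so the same check can be sketched as a small shell function. This is a minimal sketch, not part of the original note: the function name report_cstates and its optional directory argument are hypothetical, and it assumes the power files use the "active state:" and "max_cstate:" field layout shown in the example above.

```shell
#!/bin/sh
# Print the active and maximum C-state for every processor directory.
# The optional first argument (a hypothetical convenience) overrides the
# default /proc/acpi/processor location, e.g. for testing on another host.
report_cstates() {
    acpi_dir="${1:-/proc/acpi/processor}"
    for f in "$acpi_dir"/*/power; do
        # Skip silently if the glob matched nothing or the file is unreadable.
        [ -r "$f" ] || continue
        cpu=$(basename "$(dirname "$f")")
        # "active state: C1" -> third whitespace-separated field is the state.
        active=$(awk '/^active state:/ {print $3}' "$f")
        # "max_cstate: C1" -> second whitespace-separated field is the limit.
        max=$(awk '/^max_cstate:/ {print $2}' "$f")
        echo "$cpu active=$active max=$max"
    done
}
```

Running report_cstates with no argument on a node should print one line per CPU. On kernels that do not provide /proc/acpi/processor the function prints nothing.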

Q. Please provide the command or procedure to clear the faulty component.
A. How to clear the alerts:

From the ILOM Command Line Interface (CLI):
Log in to the ILOM shell.
-> cd /SYS/MB/P1/D7

-> show
/SYS/MB/P1/D7
Targets:
PRSNT
SERVICE

Properties:
type = DIMM
ipmi_name = MB/P1/D7
fru_name = 4GB DDR3 SDRAM 666
fru_manufacturer = SAMSUNG
fru_version = 00
fru_part_number = M393B5170EH1-CH9
fru_serial_number = 85199011
fault_state = Faulted                 <<< CHECK THE STATUS >>>
clear_fault_action = (none)

-> set clear_fault_action=true        <<< THE ALARM SHOULD BE CLEARED >>>

From the Web GUI:
   Go to the Component sub-tab
   Select the faulted component
   From the pull-down menu, choose the option to clear it

Please refer to: <Document: 1297311.1>


Question 2.

Please point me to the Oracle document that says the kernel configuration should be changed. Also, you asked me to edit the grub configuration file on all servers. Do you mean compute nodes and cell nodes?

Please refer to: <Document: 1297311.1>
Also disable C-states in the BIOS:
Enter BIOS Setup, go to the "Advanced" -> "CPU" menu, and then set the C-state feature to "Disabled".

Note:
To disable CPU C-states on a down-rev system (SW 1.3.x and below), enable Expert Mode first:
  Enter the BIOS, press F4 to get into expert mode, and then set the Intel (R) C-STATE tech feature to "Disabled".

Do you mean compute nodes and cell nodes?
Yes, the change applies to both compute nodes and cell nodes.
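
After the grub change has been made and a node rebooted, it can be useful to confirm what the running kernel was actually booted with. The sketch below is an illustration only: this note does not state which parameters Document 1297311.1 prescribes, so it simply lists any parameter containing "max_cstate=" (for example the standard Linux parameters processor.max_cstate and intel_idle.max_cstate). The function name cstate_params and its optional file argument are hypothetical.

```shell
#!/bin/sh
# List any *max_cstate= parameters on the running kernel's command line.
# The optional first argument (a hypothetical convenience) names a file to
# read instead of /proc/cmdline.
cstate_params() {
    cmdline_file="${1:-/proc/cmdline}"
    # Put each kernel parameter on its own line, then keep the C-state limits.
    tr ' ' '\n' < "$cmdline_file" | grep 'max_cstate='
}
```

Calling cstate_params on each compute node and cell node prints one parameter per line. No output means no C-state limit was passed to the kernel at boot.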

Statement 1.
This is a production environment, so a reboot is not possible without scheduling.
Response: These are false alerts, so clear them first. When you have downtime, make the BIOS changes.

References

<NOTE:1297311.1> - Uncorrectable memory (DIMM) errors are reported while the system is idle.

Attachments
This solution has no attachment
  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.