Asset ID: 1-71-1019467.1
Update Date: 2011-04-11
Keywords:
Solution Type: Technical Instruction Sure Solution
1019467.1: How To Diagnose Missing CPU's and/or Memory on sun4v Platforms
Related Items
- Sun SPARC Enterprise T5440 Server
- Sun SPARC Enterprise T1000 Server
- Sun SPARC Enterprise T5220 Server
- Sun SPARC Enterprise T5240 Server
- Sun SPARC Enterprise T2000 Server
- Sun SPARC Enterprise T5140 Server
- Sun SPARC Enterprise T5120 Server
Related Categories
- GCS>Sun Microsystems>Servers>CMT Servers
Previously Published As: 239925
Applies to:
Sun SPARC Enterprise T2000 Server
Sun SPARC Enterprise T1000 Server
Sun SPARC Enterprise T5120 Server
Sun SPARC Enterprise T5140 Server
Sun SPARC Enterprise T5220 Server
All Platforms
Goal
Description
This document describes one possible reason why CPUs (cores/threads) and/or memory may not be displayed after booting the Solaris Operating System on sun4v platforms.
It also provides steps to recover these CPUs or memory, if needed.
Symptoms
The following Solaris commands do not report all of the physical CPUs (cores/threads) or memory DIMMs installed in the system:
prtdiag(1M),
mpstat(1M),
prtconf(1M)
This gives the impression that the system has fewer CPUs or less memory than it should.
The following examples were taken from a Sun Fire T2000. Note that only 4 CPUs and 1024 MB of memory are shown:
From OBP:
Sun Fire T200, No Keyboard
Copyright 2008 Sun Microsystems, Inc. All rights reserved.
OpenBoot 4.28.1, 1024 MB memory available, Serial #69080228.
Ethernet address 0:14:4f:1e:14:a4, Host ID: 841e14a4.
{0} ok ls
f0283234 pci@7c0
f027ab00 pci@780
f027a9cc cpu@3
f027a89c cpu@2
f027a76c cpu@1
f027a63c cpu@0
f0277024 virtual-devices@100
f023fa78 virtual-memory
f023f46c memory@m0,8000000
f022d23c aliases
f022d1c4 options
f022d07c openprom
f022d008 chosen
f022cf90 packages
From the output of prtdiag -v:
System Configuration: Sun Microsystems sun4v Sun Fire T200
Memory size: 1024 Megabytes
================================ Virtual CPUs ================================
CPU ID Frequency Implementation Status
------ --------- ---------------------- -------
0 1000 MHz SUNW,UltraSPARC-T1 on-line
1 1000 MHz SUNW,UltraSPARC-T1 on-line
2 1000 MHz SUNW,UltraSPARC-T1 on-line
3 1000 MHz SUNW,UltraSPARC-T1 on-line
However, on the Service Processor (SP), the following commands run from the ALOM account show all the CPUs and memory physically installed in the system:
sc> showcomponent
sc> showfru
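To compare what Solaris reports against the component list on the SP, the CPU count and memory size can be pulled out of the prtdiag output. The following is only a sketch, assuming a POSIX shell and prtdiag -v output laid out as in the example above; the function names count_online_cpus and memory_megabytes are hypothetical helpers, not Solaris commands.

```shell
# Count the virtual CPUs that prtdiag reports as on-line.
# Reads `prtdiag -v` output on standard input.
# Assumption: each CPU row starts with a numeric ID and ends with "on-line".
count_online_cpus() {
    awk '$1 ~ /^[0-9]+$/ && $NF == "on-line" { n++ } END { print n + 0 }'
}

# Print the number from the "Memory size:" line (Megabytes in the example).
memory_megabytes() {
    awk '/^Memory size:/ { print $3; exit }'
}

# Example on a live system (not run here):
#   prtdiag -v | count_online_cpus
#   prtdiag -v | memory_megabytes
```

If these numbers are lower than what showcomponent lists on the SP, the system is likely in the state this document describes.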
Systems such as the Sun Fire T2000 and T5240 use the sun4v architecture, which supports virtualization, and can therefore be affected by this issue.
A common cause of this condition is a previously installed LDoms (Logical Domains) configuration that has not been completely cleared or removed.
Because the configuration is stored on the Service Processor, the previous LDoms setup can continue to restrict which components are available to the system.
Solution
Related Document(s)
For details on LDoms setup, refer to the Logical Domains (LDoms) ‘Ldom Version’ Administration Guide, available in the Oracle VM for SPARC Documentation.
For details on the hardware configurations of the relevant sun4v platforms, refer to the system-specific documentation.
Steps to Follow
The following steps verify that the system does NOT have an active LDoms configuration currently in use.
This check must be completed before the system is set back to "factory-default".
(NB: Reset to factory-default only if the LDoms configuration is no longer in use or is going to be reconfigured.)
Steps for Resetting LDoms to factory-default
[1.] Confirm whether any existing LDoms configurations are actually in use.
[1-A.] Verify whether the LDoms Manager is installed on the system.
Run the following command to check whether the LDoms Manager package SUNWldm is installed on the system.
# pkginfo | grep SUNWldm
application SUNWldm Logical Domains Manager
The binaries for the package are located in /opt/SUNWldm if the package is installed.
If the SUNWldm package does not exist on the system, the LDoms Manager is not installed.
(The configuration nevertheless remains in effect because it is stored on the Service Processor, which is why CPUs and/or memory are missing in the first place.)
If LDoms Manager is not installed, skip Step [1-B.] and proceed to Step [2.].
However, if you want to find out what is configured under LDoms, install the LDoms Manager package and review the available configurations with the ldm(1M) command.
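The decision in Step [1-A.] can be expressed as a small check. This is a sketch only, assuming a POSIX shell and pkginfo output in the three-column form shown above (category, package name, description); ldm_manager_installed is a hypothetical helper name. It matches the package name exactly instead of using a substring match.

```shell
# Succeed (exit status 0) if the package listing on standard input
# contains SUNWldm in the package-name column; fail otherwise.
ldm_manager_installed() {
    awk '$2 == "SUNWldm" { found = 1 } END { if (found) exit 0; exit 1 }'
}

# Example on a live system (not run here):
#   if pkginfo | ldm_manager_installed; then
#       echo "LDoms Manager installed: continue with Step [1-B.]"
#   else
#       echo "LDoms Manager not installed: skip to Step [2.]"
#   fi
```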
[1-B.] Verify that there are no other configured or active Guest Domains on the system.
If LDoms Manager is installed, some Guest Domains may still be in use.
To confirm whether Guest Domains exist, log in as root and run the following command:
# /opt/SUNWldm/bin/ldm list
Below is an example showing 2 Guest Domains that are configured and bound:
# /opt/SUNWldm/bin/ldm list
NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME
primary active -n-cv SP 4 1G 0.3% 22h 59m
ldom2 bound ----- 5000 4 1G
ldom3 bound ----- 5001 4 1G
Below is an example showing 2 active Guest Domains, with the Operating System running:
# /opt/SUNWldm/bin/ldm list
NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME
primary active -n-cv SP 4 1G 3.6% 22h 36m
ldom2 active -n--- 5000 4 1G 24% 11s
ldom3 active -n--- 5001 4 1G 32% 11s
Below is an example with no Guest Domains configured; only the Primary Domain entry is shown:
# /opt/SUNWldm/bin/ldm list
NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME
primary active -n-cv SP 4 1G 0.3% 22h 59m
***CAUTION***
When Guest Domains are displayed, be aware that they could still be in use by other parties.
Take EXTRA care to find out from the System Administrator or organization who configured these Guest Domains, and whether these services are still in use or needed.
DO NOT PROCEED FURTHER if you are unsure whether the configured Domains are still in use.
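The check in Step [1-B.] amounts to looking for any row in the ldm list output other than the primary domain. A minimal sketch, assuming a POSIX shell and the column layout shown in the examples above; list_guest_domains and no_guest_domains are hypothetical helper names. It only reports what is configured; it cannot tell whether a domain is still needed, so the caution above still applies.

```shell
# Print "name state" for every domain other than primary.
# Reads `ldm list` output on standard input and skips the header row.
list_guest_domains() {
    awk '$1 != "NAME" && $1 != "primary" && NF >= 2 { print $1, $2 }'
}

# Succeed only when the listing on standard input shows no guest domains.
no_guest_domains() {
    [ -z "$(list_guest_domains)" ]
}

# Example on a live system (not run here):
#   /opt/SUNWldm/bin/ldm list | list_guest_domains
```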
[2.] Resetting the LDoms configuration back to default.
Once Step [1.] has confirmed that the LDoms configurations on the system are no longer needed, the system configuration can be reset to the factory default settings.
The factory default LDoms configuration is named factory-default.
There are 2 possible ways to reset the system configuration.
If the SUNWldm package is installed and verified, the system can be reset through LDoms Manager using the ldm(1M) command.
However, if the system has been freshly (re)installed with Solaris without LDoms Manager, the configuration can be reset from the Service Processor.
This can be done from either the ALOM or ILOM prompt (depending on the specific platform/configuration).
Steps [2-A.] and [2-B.] below illustrate the 2 methods respectively.
[2-A.] Resetting through the LDoms Manager installed on the system.
List the current LDoms configuration, as well as any other configurations saved on the system.
The following example shows that the current configuration is named prod:
# /opt/SUNWldm/bin/ldm list-spconfig
factory-default
prod [current]
To set to factory-default, use the following command:
# /opt/SUNWldm/bin/ldm set-spconfig factory-default
Once set, confirm that the configuration used at the next power-on will be factory-default by running the following command:
# /opt/SUNWldm/bin/ldm list-spconfig
factory-default [next]
prod [current]
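The [current] and [next] markers in this listing can also be read by a script, for example to confirm the setting before power cycling. A sketch only, assuming a POSIX shell and list-spconfig output in the two-column form shown above; spconfig_with_marker is a hypothetical helper name.

```shell
# Print the name of the configuration that carries the given marker
# ("current" or "next") in `ldm list-spconfig` output read from stdin.
# Returns non-zero if no configuration carries that marker.
spconfig_with_marker() {
    marker="[$1]"
    while read -r name tag; do
        if [ "$tag" = "$marker" ]; then
            echo "$name"
            return 0
        fi
    done
    return 1
}

# Example on a live system (not run here):
#   /opt/SUNWldm/bin/ldm list-spconfig | spconfig_with_marker next
```

With the listing above, the "next" marker resolves to factory-default and the "current" marker to prod.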
NB - MANDATORY STEP:
For the change above to take effect, the system must be power cycled (powered off and then powered on).
On boot, the system will use the factory-default configuration, and all hardware components in the system should now be visible.
The OBP banner will show all the available CPUs and memory, and mpstat(1M), prtdiag(1M), etc. will show the fully configured cores/threads and memory.
ldm(1M) will show factory-default as the current effective configuration:
# /opt/SUNWldm/bin/ldm list-spconfig
factory-default [current]
prod
After the configuration has been reset to factory-default, remove the other, inactive LDoms configurations from the system using the remove-spconfig subcommand of ldm(1M).
Refer to the LDoms Administration Guide for further details on LDoms configuration tasks.
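As an illustration of the cleanup, the sketch below turns a list-spconfig listing into the remove-spconfig commands that would be needed. It is a dry run that only prints commands for review, assuming a POSIX shell, the listing format shown above, and the usual form of the subcommand (remove-spconfig followed by the configuration name); print_remove_commands is a hypothetical helper name.

```shell
# Print (do not run) one remove-spconfig command for each saved
# configuration, skipping factory-default and the one marked [current].
# Reads `ldm list-spconfig` output on standard input.
print_remove_commands() {
    while read -r name tag; do
        [ -z "$name" ] && continue
        [ "$name" = "factory-default" ] && continue
        [ "$tag" = "[current]" ] && continue
        echo "/opt/SUNWldm/bin/ldm remove-spconfig $name"
    done
}

# Example on a live system (not run here):
#   /opt/SUNWldm/bin/ldm list-spconfig | print_remove_commands
```

Printing the commands instead of running them leaves a chance to confirm that none of the saved configurations is still wanted.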
[2-B.] Resetting through the Service Processor.
The LDoms configuration can be reset through the SP before booting the OS on the system.
Ensure you are connected to the Serial Management Port of the Service Processor.
For systems with ALOM configured on the Service Processor, use the following syntax:
sc> bootmode config="factory-default"
sc> poweroff
Wait for the system to power off.
Then power on the system:
sc> poweron -c
OR
sc> poweron
sc> console -f
For systems with ILOM, use the following syntax:
-> set /HOST/bootmode config="factory-default"
-> stop /SYS
Are you sure you want to stop /SYS (y/n)? y
Wait for the machine to power off.
Then power on the system:
-> start /SYS
Are you sure you want to start /SYS (y/n)? y
Starting /SYS
-> start /SP/console
Once the machine is powered on, all configurations should be reset to the defaults, with all hardware components visible.
Refer to the following documentation for further details on ALOM and ILOM commands:
- Advanced Lights Out Management (ALOM) CMT vX.X Guide
- Sun Integrated Lights Out Manager X.X Supplement
- Platform-specific Service Manuals
Data Collection for further troubleshooting
If some CPUs and/or memory are still not visible after resetting the system to factory-default using the steps above, determine the number of CPU cores the system should be seeing, then contact Oracle Support with the following information:
- The latest explorer output, collected using the latest version of the explorer script.
- The console session log of the attempt to reset the system configuration to factory-default.
Refer to 1002383.1 Sun[TM] Explorer Data Collector for the latest Explorer script.
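Working out how many virtual CPUs Solaris should see is simple arithmetic: sockets x cores per socket x threads per core. As a worked example, an UltraSPARC T1 runs 4 threads per core, so a Sun Fire T2000 fitted with an 8-core processor should present 1 x 8 x 4 = 32 virtual CPUs, against the 4 shown in the symptom example. The sketch below assumes a POSIX shell; expected_vcpus and report_missing are hypothetical helper names, and the core count of a given system is an input to confirm from the SP (for example with showcomponent), not something the script discovers.

```shell
# expected_vcpus SOCKETS CORES_PER_SOCKET THREADS_PER_CORE
# Print the number of virtual CPUs a fully visible system should show.
expected_vcpus() {
    echo $(( $1 * $2 * $3 ))
}

# report_missing EXPECTED OBSERVED
# Summarize the gap between expected and observed virtual CPU counts.
report_missing() {
    if [ "$2" -lt "$1" ]; then
        echo "missing $(( $1 - $2 )) of $1 virtual CPUs"
    else
        echo "all $1 virtual CPUs visible"
    fi
}

# Example: 8-core UltraSPARC T1 with only 4 virtual CPUs visible.
#   report_missing "$(expected_vcpus 1 8 4)" 4
```

Including both numbers with the explorer output gives Oracle Support the expected and observed counts up front.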
Attachments
This solution has no attachment