Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-71-1012546.1
Update Date:2011-03-17
Keywords:

Solution Type  Technical Instruction Sure

Solution  1012546.1 :   System controller replacement for Sun Fire[TM] 3800, 4800, 4810, 6800, E4900, and E6900 systems


Related Items
  • Sun Fire E6900 Server
  • Sun Fire 3800 Server
  • Sun Fire 6800 Server
  • Sun Fire E4900 Server
  • Sun Fire 4800 Server
  • Sun Fire 4810 Server
Related Categories
  • GCS>Sun Microsystems>Servers>Midrange Servers

PreviouslyPublishedAs
217276


Applies to:

Sun Fire 3800 Server
Sun Fire 4800 Server
Sun Fire 4810 Server
Sun Fire 6800 Server
Sun Fire E4900 Server
All Platforms

Goal

This document describes how to prepare your configuration for a System Controller replacement on Sun Fire[TM] 3800, 4800, 4810, E4900, 6800, and E6900 systems.

The System Controller (SC) is not customer serviceable. 
An Oracle badged engineer is required to perform the physical replacement of this component.  The preparation steps, however, can and should be performed by a system administrator so that the system is in a state where the field engineer can complete the physical replacement activity as quickly as possible.

DISCLAIMER:
This document is to be used as a reference only.  It does not replace or supersede the appropriate sections of the Platform Administration Manual or System Service Manual.

Solution

Pre-Replacement (A customer or Sys Admin should perform this.):

1. Issue a "showplatform" command on the SC that needs to be replaced and keep the network and host information.
sc1:sc> showplatform
The system controller is configured to be on a network.
Network settings: static
Hostname: crnlc02
IP Address: 10.65.xx.xxx
Netmask: 255.255.255.0
Gateway: 10.65.16.1
DNS Domain: sun.com
Primary DNS Server: 10.65.xx.xxx
Secondary DNS Server: 10.65.xx.xxx
SNTP server: 10.65.xx.xx
SC POST diag Level: min
SC Failover: disabled
Telnet servers: Enabled
Idle connection timeout : No timeout

2. Issue a showsc command on the SC that needs to be replaced and make note of the status of the SC and the firmware version installed.

For the following procedure, System Controller 1, "sc1", is the unit to be replaced and in this case it is running ScApp 5.15.3.
sc1:sc> showsc
SC: SSC1  
Spare System Controller
SC Failover: enabled and active.
Clock failover enabled.
SC date: Fri May 13 09:14:55 EDT 2005
SC uptime: 3 minutes 41 seconds
ScApp version: 5.15.3
RTOS version: 32

3.  Make sure the SC to be replaced is the spare (the showsc output above reports "Spare System Controller").
  • If the SC is not the spare, issue the command setfailover force from the MAIN SC to failover SC control to the spare.
NOTE:  Do not execute the failover command if domain(s) are currently being keyswitched on.  Let the domain come up first and then execute the command to failover SC control after the domain(s) are fully up.
4. When able to shut down the SC, perform the following:
  • On the MAIN SC issue the command setfailover off to disable SC Failover.
  • Issue the command poweroff ssc1.
    • Again, this example is powering off SC 1.
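
The record-keeping in steps 1 to 3 can be sketched in Python. This is only an illustration, not part of the official procedure: it assumes the showplatform and showsc output has been copied from the console into text, and the helper names parse_sc_output and is_spare are hypothetical. The sample text is abridged from the output shown above.

```python
def parse_sc_output(text):
    """Collect 'Key: value' lines from pasted SC console output into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as dates stay intact.
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def is_spare(showsc_text):
    """True if the showsc output contains the 'Spare System Controller' line."""
    return any(line.strip() == "Spare System Controller"
               for line in showsc_text.splitlines())


# Abridged sample text taken from the console output in steps 1 and 2.
SHOWPLATFORM = """\
Hostname: crnlc02
Netmask: 255.255.255.0
Gateway: 10.65.16.1
Idle connection timeout : No timeout
"""

SHOWSC = """\
SC: SSC1
Spare System Controller
SC Failover: enabled and active.
ScApp version: 5.15.3
RTOS version: 32
"""

# Keep both records so they can be compared after the replacement.
record = {
    "platform": parse_sc_output(SHOWPLATFORM),
    "sc": parse_sc_output(SHOWSC),
}
print(record["sc"]["ScApp version"], is_spare(SHOWSC))
```

Saving the record somewhere other than the SC itself keeps the pre-replacement values available for the firmware and network checks that follow the physical replacement.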

An Oracle badged engineer is now required to perform the physical replacement of the SC.

The Oracle engineer will follow the replacement procedure documented in the Sun Fire E6900/E4900 Systems Service Manual (pdf).

The new system controller should start to boot when plugged in. It is important to connect to the new SC via the serial port and validate that SSC_POST executes without issues.  If SSC_POST reports errors or fails, contact support services or troubleshoot the event with the Oracle Engineer, who should still be onsite at this stage in the process.

Post-Replacement (A customer or Sys Admin can perform this, but the Oracle Engineer may assist in these steps as well)

1. Issue the command showsc on the new controller and check whether the firmware matches the version recorded in the pre-replacement data.
  • If it is the same, jump to step 4 below.
2. Use the flashupdate command to update the SC's firmware so that it matches that of the other SC (which should match its pre-replacement release).
  • The new SC will need to obtain its firmware update from an ftp/http site via the network port.
  • The command issued from the new SC will look similar to
    flashupdate -f ftp://ftp.site.com/dir scapp rtos
    • Refer to the Install.info file in the firmware patch for more information.
    • See also Document 1006281.1 which describes the firmware upgrade process.
The ScApp firmware matrix can be found in Document 1010756.1 and provides information on the latest version of ScApp.  You are encouraged to keep ScApp updated in order to avoid encountering known defects.

3. After the update, the SC reboots and starts up with the new firmware version.

4. Issue the command setfailover on from the Main SC.
This enables SC Failover and synchronizes the new SC with the platform configuration information contained on the existing Main SC.
5. You should see the following message on the spare SC:
Platform.SC: SC Failover: enabled but not active, System controller needs to be rebooted. 
Reboot the new/spare SC.

6. Once the new SC boots, run showsc again and check the date and time, and confirm the IP addresses are as expected.
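
The decision in steps 1 and 2 can be sketched in Python as a small illustration. It assumes the pre-replacement ScApp version was recorded and that the new SC's version has been read from showsc. The function names are hypothetical, the second version string in the usage lines is made up for the example, and the URL is the placeholder from step 2, not a real firmware location.

```python
def next_step(recorded_version, new_sc_version):
    """Return 4 if the versions match (skip the update), else 2 (run flashupdate)."""
    if recorded_version.strip() == new_sc_version.strip():
        return 4
    return 2


def flashupdate_command(firmware_url):
    """Build the command line in the form shown in step 2.

    firmware_url is whatever ftp/http location holds the firmware patch;
    this sketch does not validate it.
    """
    return f"flashupdate -f {firmware_url} scapp rtos"


recorded = "5.15.3"   # ScApp version noted before the replacement
reported = "5.15.0"   # hypothetical version reported by the new SC

if next_step(recorded, reported) == 2:
    # Placeholder URL copied from the example in step 2.
    print(flashupdate_command("ftp://ftp.site.com/dir"))
```

The comparison is a plain string match because the procedure only asks whether the two versions are the same, not which one is newer.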

Post-replacement Checks:

1. Log in to both the Main and spare SC and run the command showfailover.
Make sure they both reflect:
SC Failover: enabled and active.
2. Run the command showkeyswitch from the Main SC and make sure all domains are on (or as expected if some domains were not active).

3. Check network connections and make sure you can log in to the SCs via the network (if configured).

4. If time allows (or if desired), issue a setfailover force from the Main SC to test SC failover.
This will validate that the new SC can function as the Main SC.
  • You DO NOT want to do this if you haven't first confirmed that the two SCs have already reflected a status of "enabled and active" (step 1 above).
  • You should also wait until any domain that is being keyswitched off or on has finished before failing over.
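
The precondition for the optional failover test can be sketched in Python. This is a minimal illustration that assumes the showfailover output from each SC has been captured as text. The function name failover_ready is hypothetical, and the status wording matched by the pattern is taken from the messages quoted in this document.

```python
import re

# Matches the healthy status line. It deliberately does not match
# "enabled but not active", which means the spare SC still needs a reboot.
ACTIVE = re.compile(r"SC Failover:\s*enabled\s+(?:and\s+)?active", re.IGNORECASE)


def failover_ready(showfailover_outputs):
    """Return True only if every SC's showfailover text reports an active failover."""
    if not showfailover_outputs:
        return False
    return all(ACTIVE.search(text) is not None
               for text in showfailover_outputs.values())


# Example text based on the status messages quoted in this document.
outputs = {
    "main": "SC Failover: enabled and active.",
    "spare": "Platform.SC: SC Failover: enabled but not active, "
             "System controller needs to be rebooted.",
}

if failover_ready(outputs):
    print("Both SCs are ready; setfailover force can be tested.")
else:
    print("Do not force a failover yet; check the spare SC.")
```

A check like this covers only the failover status. Whether any domain is still being keyswitched on or off has to be confirmed separately with showkeyswitch.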
Previously Published As 81547

Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.