Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-72-1017589.1
Update Date:2012-05-02
Keywords:

Solution Type  Problem Resolution Sure

Solution  1017589.1 :   Resolving the Solaris[TM] driver error "WARNING: SC has stopped responding"  


Related Items
  • Sun Netra 240 (DC) Server
  • Sun Fire V240 Server
  • Sun Fire V250 Server
  • Sun Fire V440 Server
  • Sun Netra 440 Server
  • Sun Fire V210 Server
  • Sun Netra 240 (AC) Server
Related Categories
  • PLA-Support>Sun Systems>SPARC>Usx/Blade/Netra>SN-SPARC: USx

PreviouslyPublishedAs
228744


Applies to:

Sun Fire V250 Server - Version Not Applicable and later
Sun Fire V440 Server - Version Not Applicable and later
Sun Netra 240 (AC) Server - Version Not Applicable and later
Sun Netra 240 (DC) Server - Version Not Applicable and later
Sun Netra 440 Server - Version Not Applicable and later
All Platforms

Symptoms

Occasionally, the rmclomv driver (the remote management console, lights out manager driver) reports an error to the console and to /var/adm/messages.

This error occurs if the serial UART driver (su) is unloaded. The error message reported by rmclomv will look similar to the message below:

rmclomv: [ID 647266 kern.warning] WARNING: SC has stopped responding

This error may be reported only once, rather than continuously. It may also show up in the output of the command:

/usr/platform/`uname -i`/sbin/prtdiag

showing some components failing or in an unknown state.

The command:

/usr/platform/`uname -i`/sbin/scadm

may also fail, appearing to time out.

If you experience this condition, follow the guidelines in this document before considering any hardware replacement.
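
As a quick way to confirm the symptom, the system log can be searched for the rmclomv warning. The following is a minimal sketch only: count_sc_warnings is a hypothetical helper name, and the default path assumes the log has not been rotated.

```shell
# count_sc_warnings [FILE]
# Print how many rmclomv "SC has stopped responding" warnings FILE contains.
# count_sc_warnings is a hypothetical helper name; FILE defaults to the
# standard Solaris system log.
count_sc_warnings() {
    logfile="${1:-/var/adm/messages}"
    # grep -c prints the number of matching lines (0 when nothing matches)
    grep -c 'rmclomv.*SC has stopped responding' "$logfile"
}
```

Running count_sc_warnings with no argument checks /var/adm/messages; pass a rotated log such as /var/adm/messages.0 to look further back.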

Cause

In short, CR 5017681 is caused by an interrupt-sharing conflict between Solaris drivers. The patch or upgrade described under Solution delivers a new pcisch driver that is able to share interrupts between the hardware drivers. Please refer to CR 5017681 for more details.

Solution

Step 1

Check whether the system controller (SC), otherwise known as the Advanced Lights-Out Management interface (ALOM), is accessible using an external connection, rather than from within the operating system. This tests whether the SC is still alive, regardless of whether the Solaris[TM] operating system can reach it.

To conduct this verification, connect to the SC via the serial management port and confirm whether the SC is responsive. Note that connecting to the SC via the Ethernet port (network management port) only works if the SC's Ethernet port has previously been configured to accept incoming connections. If the SC's Ethernet port is not configured, the SC cannot be reached over Ethernet.

Once connected to the SC, you might not see any output. This is normal, because by default the SC times out console redirection within 60 seconds of being booted or reset. Enter the escape sequence "#." to gain access to the SC prompt "sc> " or a "login:" request by the SC.

Step 2

If no response is seen from the SC after keying in the escape sequence "#." or even after a power cycle, consider replacing the SC.

However, if you are able to access the system controller externally, then you might be experiencing CR 5017681: su driver unloading causes loss of connection from Solaris to ALOM.

To resolve CR 5017681, do the following:

For Solaris[TM] 8 OS, apply patch 116962-13 or later to update the pcisch driver.*

For Solaris 9 OS, upgrade to Solaris 9 Update 6 (HW 4/04) or later for the correct pcisch driver.

* The patch version listed was current at the time of writing. Please refer to MOS (Patches & Upgrades) for the latest versions.
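
To check whether a Solaris 8 system already has the required patch level, the output of showrev -p can be filtered. The sketch below is illustrative only: patch_rev_ok is a hypothetical helper name, and it assumes showrev -p lines begin with "Patch: <id>-<rev>". On Solaris, nawk may need to be substituted for awk.

```shell
# patch_rev_ok
# Read `showrev -p` output on stdin; exit 0 if patch 116962 is listed at
# revision 13 or later, otherwise exit 1.
# patch_rev_ok is a hypothetical helper name, not a Solaris command.
patch_rev_ok() {
    awk '
        $1 == "Patch:" && $2 ~ /^116962-/ {
            split($2, part, "-")          # part[1] = patch ID, part[2] = revision
            if (part[2] + 0 >= 13) found = 1
        }
        END { if (found) exit 0; exit 1 }
    '
}
```

For example, "showrev -p | patch_rev_ok" followed by a test of the exit status indicates whether the patch still needs to be applied.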

Relief/Workaround

If the system cannot be patched, a workaround is to force-load the Solaris driver for the serial UART chip.

Whilst logged in as root, run the following command:

echo "forceload: drv/su" >>/etc/system

Then reboot the platform for the change to take effect.

This action will prevent the "su" serial driver from unloading.
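
Because the command above appends to /etc/system every time it is run, it can be worth guarding against duplicate entries and keeping a backup copy first. The following is a minimal sketch under those assumptions: add_forceload_su is a hypothetical helper name, and the ".bak" backup suffix is an arbitrary choice.

```shell
# add_forceload_su [FILE]
# Append "forceload: drv/su" to FILE unless the entry is already present.
# add_forceload_su is a hypothetical helper name; FILE defaults to /etc/system.
add_forceload_su() {
    sysfile="${1:-/etc/system}"
    if grep '^forceload: drv/su *$' "$sysfile" >/dev/null 2>&1; then
        echo "forceload entry already present in $sysfile"
    else
        # Keep a copy of the original file (the .bak suffix is an assumption),
        # then append the entry only if the copy succeeded
        cp "$sysfile" "$sysfile.bak" && echo "forceload: drv/su" >>"$sysfile"
    fi
}
```

Run add_forceload_su as root with no argument to modify /etc/system, then reboot as described above.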



Additional Information

This issue currently manifests itself on ALOM-based products only.

The original lights out management (LOM) based products are unaffected.




Internal Comments

In short, CR 5017681 is root caused by an interrupt-sharing conflict between Solaris drivers.
The patch or upgrade delivers a new pcisch driver that is able to share interrupts between the hardware drivers. Please refer to CR 5017681 for more details.
LOM, ALOM, RMC, sc, SC, responding, rmclomv, rmclomv:, rmc_comm, rmc_comm:, su, SU

Previously Published As 77568


Attachments
This solution has no attachment
  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.