Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-77-1019357.1
Update Date: 2012-09-04

Solution Type  Sun Alert Sure

Solution  1019357.1 :   Sun Fire Server with Solaris 10 may Panic or Reset with lpost message, asynchronous event, fail to stop CPU or send_mondo timeout  


Related Items
  • Sun Netra 1280 Server
  • Solaris SPARC Operating System
  • Sun Fire E2900 Server
  • Sun Netra 1290 Server
Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: Sun Alert
  • .Old GCS Categories>Sun Microsystems>Sun Alert>Release Phase>Resolved

Previously Published As
238746


Bug Id
<SUNBUG: 6684726>, <SUNBUG: 6699498>

Date of Preliminary Release
12-Jun-2008

Date of Resolved Release
01-Aug-2008

***Checked for relevance on 04-Sep-2012***

1. Impact

Loss of application availability may occur due to a system panic or reset. This type of fault is typically misdiagnosed as a hardware failure, which may lead to unnecessary hardware replacement.

2. Contributing Factors

This issue may occur on the following releases and platforms:

SPARC Platform
  • Solaris 10 on Sun Fire E6900/E4900/E2900/6800/4800/4810/3800/V1280, Netra 1280, and Netra 1290 systems without patches 114527-11 (FW) and 137111-04 (Kernel)

Notes:

This issue is specific to the midrange servers listed above and has only been seen with UltraSPARC IV+ (USIV+) 1.5 GHz and 1.8 GHz CPUs.
It is also considered possible with USIV+ 1.95 GHz CPUs.

System Controller application (ScApp) firmware version 5.20.9 (as delivered in patch 114527-10) and earlier versions are affected.
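
The version rule above (ScApp 5.20.9 and earlier are affected) needs a numeric comparison, because a plain string comparison would rank "5.20.10" before "5.20.9". The following is a minimal sketch, not a Sun-supplied tool. The function names are hypothetical, and it assumes the ScApp version has already been read from the System Controller and is available as a dotted string.

```python
import re

# Last affected ScApp release according to this alert
# (5.20.9, as delivered in patch 114527-10).
LAST_AFFECTED_SCAPP = (5, 20, 9)


def parse_version(text):
    """Turn a dotted version such as '5.20.8' into a tuple of ints."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)*)\s*", text)
    if match is None:
        raise ValueError(f"not a dotted version: {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def scapp_is_affected(version):
    """Return True if the ScApp version is 5.20.9 or earlier."""
    # Tuples compare element by element, so 5.20.10 sorts after 5.20.9.
    return parse_version(version) <= LAST_AFFECTED_SCAPP
```

For example, scapp_is_affected("5.20.8") returns True, while scapp_is_affected("5.20.10") returns False.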

The issue has been observed when running programs that access the OpenBoot PROM (OBP) from the OS, such as prtdiag, prtconf, cfgadm, picl, and third-party system management software.

This issue is highly timing-dependent and is expected to be rare.

3. Symptoms

Console logs and core files are useful in identifying whether the system is experiencing this issue.

-------------------------------------------
Console logs showing send_mondo panic:

domainA console login: {/N0/SB2/P0/C1} @(#) lpost   5.20.8  2007/11/20 10:33
{/N0/SB2/P0/C1} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB2/P0/C1} Use is subject to license terms.
send mondo timeout [8307178 NACK 0 BUSY] IDSR 0x4000000000000000  cpuids: 0x208
panic: failed to stop cpu520
panic[cpu3]/thread=3005bc3e080: send_mondo_set: timeout
000002a10062e9c0 SUNW,UltraSPARC-IV+:send_mondo_set+454 (2a10062eba0, ... 1, 2a10062eab0, 0)
%l0-3: aaaaaaaaaaaaaaaa 000000000000002f 000000000000002f 0000000000000209
%l4-7: 0000000001221400 00000007274a4ba9 4000000000000000 0000000000000040
000002a10062eaf0 unix:xt_some+194 (2a10062ed78, 2a10062ebf0, fffff7, fffffffffffffff8, 2a10062eba8, 0)


-------------------------------------------
Console logs showing Asynchronous Event and failed to stop:

domainA console login: {/N0/SB5/P2/C1} @(#) lpost 5.20.8 2007/11/20 10:33
{/N0/SB5/P2/C1} Copyright 2007 Sun Microsystems, Inc. All rights reserved.
{/N0/SB5/P2/C1} Use is subject to license terms.
{/N0/SB5/P2/C1} @(#) lpost 5.20.8 2007/11/20 10:33
{/N0/SB5/P2/C1} Copyright 2007 Sun Microsystems, Inc. All rights reserved.
{/N0/SB5/P2/C1} Use is subject to license terms.
{/N0/SB5/P2/C1} WARNING: Asynchronous Event.
{/N0/SB5/P2/C1} Component under test: /N0/SB5/P2 CPU
{/N0/SB5/P2/C1} AFSR1 EXT: 00000000.00000000 AFSR2 EXT:00000000.00000000
{/N0/SB5/P2/C1} tl tt tstate tpc tnpc
{/N0/SB5/P2/C1} 01 63 00000044.80000605 000007ff.f000c370 000007ff.f000c374
Apr 23 17:31:15 e13-sc1 Domain-A.SC: Active - Panicking
panic: failed to stop cpu534

panic[cpu7]/thread=30029331920: bad kernel MMU miss at TL 2
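
To triage saved console logs for the messages shown in the excerpts above, a short script along the following lines may help. This is a sketch under stated assumptions: the signature names and the find_signatures function are hypothetical, and the patterns are derived only from the two excerpts in this alert, so real logs may differ in spacing or wording. A match suggests the log is worth comparing against this alert. It is not a diagnosis.

```python
import re

# Patterns derived from the console excerpts in this alert only.
# The dictionary keys are illustrative labels, not official message names.
SIGNATURES = {
    "asynchronous event": re.compile(r"WARNING: Asynchronous Event"),
    "failed to stop cpu": re.compile(r"panic: failed to stop cpu\d+"),
    "lpost banner": re.compile(r"@\(#\) lpost\s+\d+(?:\.\d+)+"),
    "send_mondo timeout": re.compile(
        r"send mondo timeout|send_mondo_set: timeout"
    ),
}


def find_signatures(log_text):
    """Return the sorted labels of all signatures present in log_text."""
    return sorted(
        name
        for name, pattern in SIGNATURES.items()
        if pattern.search(log_text)
    )
```

Calling find_signatures() on the text of a saved console log returns a list of labels. An empty list means none of the patterns from this alert were found.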

4. Workaround

Workarounds for this issue are determined on a case-by-case basis and require consultation with Sun Services.
An individual action plan will be developed for your environment.

5. Resolution

This issue is addressed on the following releases and platforms:

SPARC Platform
  • Solaris 10 on Sun Fire E6900/E4900/E2900/6800/4800/4810/3800/V1280, Netra 1280, and Netra 1290 systems with patches 114527-11 (FW) and 137111-04 (Kernel)

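
As a rough aid for checking a system against the patch levels named above, the sketch below compares patch IDs found in free-form text with the two minimum revisions. The function names are hypothetical. It assumes the caller supplies the text, for example a saved patch listing for the kernel patch plus a note of the firmware patch applied to the System Controller. It treats any patch ID that appears anywhere in the text as present, and it does not recognise later patches that may supersede these IDs.

```python
import re

# Minimum revisions named in this alert: base patch ID -> revision.
REQUIRED_PATCHES = {"114527": 11, "137111": 4}


def installed_revisions(patch_listing):
    """Map each patch base ID to the highest revision seen in the text."""
    found = {}
    for base, rev in re.findall(r"\b(\d{6})-(\d{2})\b", patch_listing):
        found[base] = max(found.get(base, 0), int(rev))
    return found


def missing_patches(patch_listing):
    """Return required patches that are absent or below the needed revision."""
    found = installed_revisions(patch_listing)
    return sorted(
        f"{base}-{rev:02d}"
        for base, rev in REQUIRED_PATCHES.items()
        if found.get(base, 0) < rev
    )
```

For example, missing_patches("114527-10 137111-08") returns ["114527-11"], because the firmware patch is one revision short while the kernel patch meets the minimum.
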
Modification History
01-Aug-2008: Updated Contributing Factors and Resolution sections. Now Resolved.
04-Sep-2012: Maintenance update, currency check, no change in content

  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.