Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-77-1365975.1
Update Date:2012-10-01
Keywords:

Solution Type  Sun Alert Sure

Solution  1365975.1 :   Solaris 10 Kernel Patch 147440-02 or Later Will Cause sun4v Systems to Panic with "xt_sync:timeout"  


Related Items
  • Solaris SPARC Operating System
  • Sun Fire T2000 Server
  • Sun SPARC Enterprise T1000 Server
  • Oracle Solaris Express
  • Sun SPARC Enterprise T2000 Server
  • Sun Fire T1000 Server
Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: Sun Alert
  • .Old GCS Categories>Sun Microsystems>Sun Alert>Release Phase>Resolved


Adding kernel patch 147440-02 or 147440-03 may cause sun4v systems to panic with xt_sync: timeout

In this Document
Description
Occurrence
Symptoms
Workaround
Patches
History
References


Applies to:

Solaris SPARC Operating System - Version 10 10/09 U8 and later
Sun SPARC Enterprise T1000 Server
Oracle Solaris Express - Version 2010.11 and later
Sun SPARC Enterprise T2000 Server
Sun Fire T2000 Server
Information in this document applies to any platform.
__________________

SUNBUG:7098393

Date of Workaround Release: 04-Nov-2011

Date of Resolved Release: 28-Aug-2012
___________________________________

Description

Following installation of Solaris kernel patch 147440-02 (or later) on sun4v T1000/T2000 systems running firmware earlier than 6.4.6, the system will panic with "xt_sync: timeout" after it is rebooted.

Occurrence


This issue can occur in the following releases:

SPARC Platform

  • Solaris 10 with patch 147440-02 or later and Firmware earlier than 6.4.6
  • Solaris 11 Express based upon builds snv_169 or later

for the following systems:

T1000/T2000, SPARC Enterprise T1000/T2000
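
As an illustration only, the installed revision of kernel patch 147440 can be read from "showrev -p" output. The sketch below assumes the usual Solaris 10 "Patch: NNNNNN-RR ..." line format and a POSIX shell; the function names are hypothetical, and it does not account for later kernel patches that carry a different patch ID.

```shell
# Sketch only: find the highest installed revision of kernel patch 147440
# in `showrev -p` output supplied on standard input.
# Function names are hypothetical and not part of any Solaris tool.
patch_147440_rev() {
    # Print the revision field (e.g. "02") of each 147440 entry,
    # sort numerically, and keep the last (highest) one.
    sed -n 's/^Patch: 147440-\([0-9][0-9]*\).*/\1/p' | sort -n | sed -n '$p'
}

patch_in_affected_range() {
    # Succeeds when revision $1 is 02 or later.
    # An empty value means the patch was not found.
    [ -n "$1" ] && [ "$1" -ge 2 ]
}
```

On a Solaris 10 host this might be used as: showrev -p | patch_147440_rev, with the result passed to patch_in_affected_range.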

Notes:

1. Solaris 8, Solaris 9, sun4u, sun4us, T5xx0, T3, T4, and the x86 platform are not affected by this issue.

2. This issue will only affect sun4v systems, and only where Solaris kernel patch 147440-02 (or later) is installed.

3. To determine the machine class, execute the following command:

$ uname -m
sun4v

4. Solaris 11 Express distributions may include additional bug fixes beyond the build from which they were derived. The base build can be determined as follows:

$ uname -v
snv_151

If the output has the form 151.x.x.x, then the installed build is snv_151.
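
The checks in notes 3 and 4 could be scripted along the following lines. This is a sketch for a POSIX shell (for example ksh or bash); the function names are hypothetical. It does not check the server model or the firmware level, so a match only means that the machine class and build fall in the range listed under Occurrence.

```shell
# Sketch only: classify a host from `uname -m` and `uname -v` output.
# Function names are hypothetical.
base_build() {
    # "snv_151" or "151.x.x.x" -> "151"; any other format -> no output.
    printf '%s\n' "$1" | sed -n -e 's/^snv_//' -e 's/^\([0-9][0-9]*\).*$/\1/p'
}

s11e_build_affected() {
    # $1 = machine class (uname -m), $2 = version string (uname -v).
    # Succeeds when the class is sun4v and the base build is 169 or later.
    build=$(base_build "$2")
    [ "$1" = "sun4v" ] && [ -n "$build" ] && [ "$build" -ge 169 ]
}
```

A possible invocation on the host itself is: s11e_build_affected "$(uname -m)" "$(uname -v)"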

Symptoms


On reboot following installation of Solaris 10 kernel patch 147440-02 or later, the system will panic with a string similar to the following:

panic[cpu14]/thread=2a1027f5ca0: xt_sync: timeout
 

The stack backtrace will vary. Example stack:

unix:panicsys+0x48(0x10bca68, 0x2a102838f90, 0x1913340, 0x1, , ,0x44e2001604, 0x10bca68, 0x2a102838f90)
unix:vpanic_common+0x78(0x10bca68, 0x2a102838f90, 0x2a102839018,0x597a6470f4, 0x1, 0x1913bf8)
genunix:cmn_err+0x98(0x3, 0x10bca68, 0x0, 0x0, 0x40, 0x193b954)
unix:xt_sync+0x370(0x2a102839330)
unix:hat_unload_callback+0x7d4(0x30002c77b40?, 0x3002fa8e000?, , 0x4)
unix:hat_unload(0x30002c77b40, 0x3002fa4e000, 0x40000, 0x4) - frame recycled
genunix:devmap_free_pages+0x30(0x1974430, 0x3002fa4e000, 0x40000)
genunix:devmap_umem_free_np(, 0x40000) - frame recycled
genunix:ddi_umem_free+0x8c(0x60026c5c9c0)
devinfo:di_freemem+0x2c(0x60026c2b4a0)
devinfo:di_ioctl+0x29c()
specfs:spec_ioctl(0x60044f82f80, 0xdf80, 0x28000, 0x100001,0x600214160b0) - frame recycled
genunix:fop_ioctl+0x20(0x60044f82f80, 0xdf80, 0x28000, 0x100001, 0x2a102839adc)
genunix:ioctl+0x184()
unix:syscall_trap32+0xcc()
 

The messages file or the console will show a failure to stop multiple CPUs, as in the following example:

Cross trap sync timeout: at cpu_sync.xword[1]: 0x1010
panic: failed to stop cpu8
panic: failed to stop cpu9
panic: failed to stop cpu10
panic: failed to stop cpu11
panic: failed to stop cpu24
panic: failed to stop cpu25
panic: failed to stop cpu26
panic: failed to stop cpu27
 

Note: If the system is not rebooted after patch installation, a panic will not occur.
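
To check saved console output or a messages file for this signature, something like the following sketch may help. The function names are hypothetical; the patterns are taken from the example output above and may need adjusting if the text differs on a given system.

```shell
# Sketch only: scan log text on standard input for the panic signature.
# Function names are hypothetical.
has_xt_sync_panic() {
    # Succeeds if the panic string appears, with or without the space
    # after the colon. Output is discarded instead of using grep -q,
    # which older grep implementations may lack.
    grep 'xt_sync: *timeout' >/dev/null
}

failed_cpus() {
    # Print the CPU ids from "failed to stop cpuN" lines, one per line.
    sed -n 's/.*failed to stop cpu\([0-9][0-9]*\).*/\1/p'
}
```

For example: has_xt_sync_panic < /var/adm/messages && failed_cpus < /var/adm/messages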

Workaround

This issue is addressed in firmware revision 6.4.6 or later with the following patches:

SPARC Platform

  • T1000/SPARC Enterprise T1000 firmware patch 126400-02 or later
  • T2000/SPARC Enterprise T2000 firmware patch 126399-02 or later

To determine the current firmware level on the system, run the following at the system controller (sc>) prompt:

sc> showhost
Sun-Fire-T2000 System Firmware 6.3.12  2008/04/06 15:49

Host flash versions:
   Hypervisor 1.3.4 2007/03/28 06:03
   OBP 4.25.12 2008/03/23 13:27
   POST 4.25.12 2008/03/23 13:52
   Hypervisor 1.7.3.c 2010/07/09 15:14
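
The reported System Firmware version can be compared against 6.4.6 with a small script, run on a workstation where the showhost output has been saved. This is a sketch only: the function names are hypothetical, and the -v option requires nawk or a POSIX awk rather than the legacy /usr/bin/awk on Solaris 10.

```shell
# Sketch only: extract the System Firmware version from saved `showhost`
# output and compare it with a required minimum.
# Function names are hypothetical.
firmware_version() {
    # Reads showhost text on stdin; prints e.g. "6.3.12".
    sed -n 's/.*System Firmware \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

version_ge() {
    # Succeeds when dotted version $1 >= $2.
    # Compares the first three fields numerically; missing fields count as 0.
    # Assumes an awk that supports -v (nawk or POSIX awk).
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, "."); split(b, y, ".")
        for (i = 1; i <= 3; i++) {
            if (x[i] + 0 > y[i] + 0) exit 0
            if (x[i] + 0 < y[i] + 0) exit 1
        }
        exit 0
    }'
}
```

With the example output above, firmware_version prints 6.3.12, and version_ge 6.3.12 6.4.6 fails, which indicates that the firmware patch is still needed.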

 

Patches

SUNPATCH:126400-02
SUNPATCH:126399-02

History

04-Nov-2011: Document released
28-Nov-2011: Minor format change; no change to content
23-Apr-2012: Maintenance update; no change to content
03-May-2012: Add procedure note to Workaround
14-Jun-2012: Maintenance update, no change in content
28-Aug-2012: Updated for patches, change in products affected, issue is Resolved
01-Oct-2012: Correction to FW listing in Occurrence section (earlier than 6.4.6)


The putback that introduced this bug was RTI: 353941.
At this point it is not conclusive whether it is 6994535 or 7014100,
since both are cited in the comments for the file's change.
It is also safe to say that the Nevada putback that exposed 7058642
went into snv_169 and corresponds to RTI: 350129.


Internal Contributor/Submitter: [email protected],[email protected]
Internal Eng Responsible Engineer: [email protected]
Internal Services Knowledge Analyst: [email protected]
Internal Eng Business Unit Group: RPE Systems
Internal Escalation ID: 3-4666607831, 3-4673614091, 3-4677666271,
3-4680770791, 3-4681616861, 3-4802797661, 3-4826696591, 3-4842979421
Internal Pending Patches:TBD

References




  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.