Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-72-1018832.1
Update Date: 2012-07-30
Keywords:

Solution Type  Problem Resolution Sure

Solution  1018832.1 :   Sun Fire[TM] SF3800/SF4800/SF4810/SF6800 - E4900/E6900 Server (Serengeti/Amazon): POST fails during IOPOST, marking all I/O Boards (IBs) as bad.


Related Items
  • Sun Fire 4810 Server
  • Sun Fire 3800 Server
  • Sun Fire 6800 Server
  • Sun Fire E6900 Server
  • Sun Fire V1280 Server
  • Sun Fire 4800 Server
  • Sun Fire E4900 Server
Related Categories
  • PLA-Support>Sun Systems>SPARC>Enterprise>SN-SPARC: Exx00
  • .Old GCS Categories>Sun Microsystems>Servers>Midrange Servers
  • .Old GCS Categories>Sun Microsystems>Servers>Midrange V and Netra Servers

PreviouslyPublishedAs
230625


Applies to:

Sun Fire V1280 Server - Version Not Applicable and later
Sun Fire 3800 Server - Version Not Applicable and later
Sun Fire 4800 Server - Version Not Applicable and later
Sun Fire 4810 Server - Version Not Applicable and later
Sun Fire 6800 Server - Version Not Applicable and later
All Platforms

Symptoms

All I/O Boards (IBs) are marked as bad during IOPOST. This can be misleading when trying to identify the faulty FRU, because the I/O boards themselves may not be at fault.

 

Cause

In some cases, all I/O Boards (IBs) are marked as bad because the CPU running IOPOST is itself faulty.

The fault in that CPU goes undetected by LPOST (the POST that tests the CPU itself), so the CPU passes its own self-test and then goes on to run IOPOST.

Solution



The console log excerpt below illustrates the problem.

Note the following in the console logs:

  • SB4/P0 is the processor running IOPOST.
  • SB4/P0 marks IB6/P0 and IB6/P1, the two I/O controllers on IB6, as "Failed".
  • SB4/P0 marks IB8/P0 and IB8/P1, the two I/O controllers on IB8, as "Failed".
  • SB4/P0 is actually the bad CPU. Because the CPU itself is faulty, it cannot reliably test the IBs, and it wrongly marks the controllers on the IBs as failed.
  • The fault in SB4/P0 goes undetected during its own self-test (called LPOST).
  • It is highly unlikely that all of the I/O controllers (IB6/P0, IB6/P1, IB8/P0 and IB8/P1) are bad at the same time.

 

Console logs:

{/N0/SB4/P0} ERROR: TEST=PCI IO Controller Functional Tests,SUBTEST=PCI IO
Controller DMA loopback Tests ID=152.2
{/N0/SB4/P0} Component under test: /N0/IB6/P0 PCI IOC
{/N0/SB4/P0}    Data Access Error from address 00000000.08000820. AFSR =
00000002.00000094
{/N0/SB4/P0} Secondary AFAR 00000000.08000820, Secondary AFSR =
00000002.00000094
{/N0/SB4/P0}  tl  tt         tstate                 tpc               tnpc
{/N0/SB4/P0} 01  63  00000099.80000606  00000000.0001ca48
00000000.0001ca4c
{/N0/SB4/P0}    (CE) Correctable system data ECC error
{/N0/SB4/P0}  tl  tt         tstate                 tpc               tnpc
{/N0/SB4/P0} 02  32  00000044.80001504  00000000.0000f1e0
00000000.0000f1e4
{/N0/SB4/P0} 01  63  00000099.80000606  00000000.0001ca48
00000000.0001ca4c
{/N0/SB4/P0}    (CE) Correctable system data ECC error
{/N0/SB4/P0}    (TO) Time-out from system bus
{/N0/SB4/P0}  tl  tt         tstate                 tpc               tnpc
{/N0/SB4/P0} 03  34  00000091.80001507  00000000.00014d80
00000000.00014d84
{/N0/SB4/P0} 02  32  00000044.80001504  00000000.0000f1e0
00000000.0000f1e4
{/N0/SB4/P0} 01  63  00000099.80000606  00000000.0001ca48
00000000.0001ca4c
{/N0/SB4/P0} AFSR = 00000000.00000000
{/N0/SB4/P0} AFAR = 00000000.08000820
{/N0/SB4/P0} IMMU SFSR = 00000000.00000000
{/N0/SB4/P0} DMMU SFSR = 00000000.00700009
{/N0/SB4/P0} DMMU SFAR = 00000000.08000820
{/N0/SB4/P0} PState = 00000000.00000015
{/N0/SB4/P0} Dispatch Control =00000000.0000103f
{/N0/SB4/P0} Data Cache Unit Control =0000ce00.0000000e
{/N0/SB4/P0} Safari Config. = 0aaa0028.20200006
{/N0/SB4/P0} EState = 00000000.00000000
{/N0/SB4/P0} @(#) lpost         5.15.2  2003/08/04 10:27
{/N0/SB4/P0} Copyright 2001-2003 Sun Microsystems, Inc.  All rights
reserved.
{/N0/SB4/P0} Use is subject to license terms.
{/N0/SB4/P0} Running PCI IO Controller Basic Tests
{/N0/SB4/P0}  tl  tt         tstate                 tpc               tnpc
{/N0/SB4/P0} 02  32  00000044.80001503  000007ff.f0007cc0
000007ff.f0007cc4
{/N0/SB4/P0} 01  32  00000000.80000405  000007ff.f0009bec
000007ff.f0009bf0
{/N0/SB4/P0}    (TO) Time-out from system bus
{/N0/SB4/P0}    (PRIV) Privileged code access error(s)
{/N0/SB4/P0}    (ME) Multiple Errors of the same type occurred
{/N0/SB4/P0}  tl  tt         tstate                 tpc               tnpc
{/N0/SB4/P0} 03  32  00000099.80001502  000007ff.f0006a58
000007ff.f0006a5c
{/N0/SB4/P0} 02  32  00000044.80001503  000007ff.f0007cc0
000007ff.f0007cc4
{/N0/SB4/P0} 01  32  00000000.80000405  000007ff.f0009bec
000007ff.f0009bf0
{/N0/SB4/P0}    (TO) Time-out from system bus
{/N0/SB4/P0}    (PRIV) Privileged code access error(s)
{/N0/SB4/P0} @(#) lpost         5.15.2  2003/08/04 10:27
{/N0/SB4/P0} Copyright 2001-2003 Sun Microsystems, Inc.  All rights
reserved.
{/N0/SB4/P0} Use is subject to license terms.
{/N0/SB4/P0} @(#) lpost         5.15.2  2003/08/04 10:27
{/N0/SB4/P0} Copyright 2001-2003 Sun Microsystems, Inc.  All rights
reserved.
{/N0/SB4/P0} Use is subject to license terms.
{/N0/IB6/P0} Failed <--- !!
{/N0/IB6/P1} Failed <--- !!
Sep 10 11:05:24 he101 Domain-A.SC: Excluded unusable, unlicensed, failed or
disabled board: /N0/IB6
Copying IO prom to Cpu dram
...................................
{/N0/SB4/P0} Running PCI IO Controller Basic Tests
{/N0/SB4/P0} Jumping to memory 00000000.00000020 [00000010]
{/N0/SB4/P0} System PCI IO post code running from memory
{/N0/SB4/P0} @(#) lpost         5.15.2  2003/08/04 10:28
{/N0/SB4/P0} Copyright 2001-2003 Sun Microsystems, Inc.  All rights
reserved.
{/N0/SB4/P0} Use is subject to license terms.
{/N0/SB4/P0} Subtest: PCI IO Controller Register Initialization for aid
0x1c
{/N0/SB4/P0} Running PCI IO Controller Functional Tests
{/N0/SB4/P0} Subtest: PCI IO Controller IOMMU  TLB Compare Tests for aid
0x1c
{/N0/SB4/P0} Subtest: PCI IO Controller IOMMU TLB Flush Tests for aid 0x1c
{/N0/SB4/P0} Subtest: PCI IO Controller DMA loopback Tests for aid 0x1c
{/N0/SB4/P0} ERROR: TEST=PCI IO Controller Functional Tests,SUBTEST=PCI IO
Controller DMA loopback Tests ID=152.2
{/N0/SB4/P0} Component under test: /N0/IB8/P0 PCI IOC
{/N0/SB4/P0}    Data Access Error from address 00000000.08000820. AFSR =
00000002.00000094
{/N0/SB4/P0} Secondary AFAR 00000000.08000820, Secondary AFSR =
00000002.00000094
{/N0/SB4/P0}  tl  tt         tstate                 tpc               tnpc
{/N0/SB4/P0} 01  63  00000099.80000605  00000000.0001c8b4
00000000.0001c8b8
{/N0/SB4/P0}    (CE) Correctable system data ECC error
{/N0/SB4/P0}  tl  tt         tstate                 tpc               tnpc
{/N0/SB4/P0} 02  32  00000044.80001503  00000000.0000f1e0
00000000.0000f1e4
{/N0/SB4/P0} 01  63  00000099.80000605  00000000.0001c8b4
00000000.0001c8b8
{/N0/SB4/P0}    (CE) Correctable system data ECC error
{/N0/SB4/P0}    (TO) Time-out from system bus
{/N0/SB4/P0}  tl  tt         tstate                 tpc               tnpc
{/N0/SB4/P0} 03  34  00000091.80001506  00000000.00014d80
00000000.00014d84
{/N0/SB4/P0} 02  32  00000044.80001503  00000000.0000f1e0
00000000.0000f1e4
{/N0/SB4/P0} 01  63  00000099.80000605  00000000.0001c8b4
00000000.0001c8b8
{/N0/SB4/P0} AFSR = 00000000.00000000
{/N0/SB4/P0} AFAR = 00000000.08000820
{/N0/SB4/P0} IMMU SFSR = 00000000.00000000
{/N0/SB4/P0} DMMU SFSR = 00000000.00700009
{/N0/SB4/P0} DMMU SFAR = 00000000.08000820
{/N0/SB4/P0} PState = 00000000.00000015
{/N0/SB4/P0} Dispatch Control =00000000.00000000
{/N0/SB4/P0} Data Cache Unit Control =00000000.0000000c
{/N0/SB4/P0} Safari Config. = 0aaa0028.20200006
{/N0/SB4/P0} EState = 00000000.00000000
{/N0/SB4/P0} @(#) lpost         5.15.2  2003/08/04 10:27
{/N0/SB4/P0} Copyright 2001-2003 Sun Microsystems, Inc.  All rights
reserved.
{/N0/SB4/P0} Use is subject to license terms.
{/N0/SB4/P0} Running PCI IO Controller Basic Tests
{/N0/SB4/P0}  tl  tt         tstate                 tpc               tnpc
{/N0/SB4/P0} 01  32  00000000.80000405  000007ff.f0009bec
000007ff.f0009bf0
{/N0/SB4/P0}    (TO) Time-out from system bus
{/N0/SB4/P0}    (PRIV) Privileged code access error(s)
{/N0/SB4/P0}  tl  tt         tstate                 tpc               tnpc
{/N0/SB4/P0} 02  32  00000099.80001502  000007ff.f0006a58
000007ff.f0006a5c
{/N0/SB4/P0} 01  32  00000000.80000405  000007ff.f0009bec
000007ff.f0009bf0
{/N0/SB4/P0}    (TO) Time-out from system bus
{/N0/SB4/P0}    (PRIV) Privileged code access error(s)
{/N0/SB4/P0}  tl  tt         tstate                 tpc               tnpc
{/N0/SB4/P0} 03  32  00000099.80001502  000007ff.f0006a58
000007ff.f0006a5c
{/N0/SB4/P0} 02  32  00000099.80001502  000007ff.f0006a58
000007ff.f0006a5c
{/N0/SB4/P0} 01  32  00000000.80000405  000007ff.f0009bec
000007ff.f0009bf0
{/N0/SB4/P0}    (TO) Time-out from system bus
{/N0/SB4/P0}    (PRIV) Privileged code access error(s)
{/N0/SB4/P0} @(#) lpost         5.15.2  2003/08/04 10:27
{/N0/SB4/P0} Copyright 2001-2003 Sun Microsystems, Inc.  All rights
reserved.
{/N0/SB4/P0} Use is subject to license terms.
{/N0/SB4/P0} @(#) lpost         5.15.2  2003/08/04 10:27
{/N0/SB4/P0} Copyright 2001-2003 Sun Microsystems, Inc.  All rights
reserved.
{/N0/SB4/P0} Use is subject to license terms.
{/N0/IB8/P0} Failed <--- !!
{/N0/IB8/P1} Failed <--- !!
Sep 10 11:05:47 he101 Domain-A.SC: Excluded unusable, unlicensed, failed or
disabled board: /N0/IB8
Sep 10 11:05:47 he101 Domain-A.SC: No usable Io board in domain.
setkeyswitch operation did not complete



Relief/Workaround

Disable the System Board (SB) containing the CPU that runs (and fails) IOPOST, so that IOPOST moves to a different CPU.

This can be done with the "disablecomponent" command from the system controller interface (SC-App). Alternatively, disabling only the processor itself with the "disablecomponent" command is also a valid workaround.
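
As an illustration of the two options, the sketch below derives both candidate targets from the CPU path shown in the console log (for example /N0/SB4/P0). The helper name `disable_candidates` is hypothetical, and the assumption that "disablecomponent" takes the component path as its only argument should be checked against the System Controller command reference for the installed firmware before use.

```python
def disable_candidates(tester_cpu):
    """Build the two candidate command lines for a CPU path like '/N0/SB4/P0'.

    The command-line form is an assumption: 'disablecomponent <path>'.
    """
    parts = tester_cpu.strip("/").split("/")
    if len(parts) != 3:
        raise ValueError(f"expected a path like /N0/SB4/P0, got {tester_cpu!r}")
    node, board, port = parts
    return {
        # Option 1: disable the whole System Board that holds the CPU.
        "board": f"disablecomponent /{node}/{board}",
        # Option 2: disable only the processor that runs IOPOST.
        "cpu": f"disablecomponent /{node}/{board}/{port}",
    }


if __name__ == "__main__":
    for option, command in disable_candidates("/N0/SB4/P0").items():
        print(f"{option}: {command}")
```

Disabling only the processor keeps the other CPUs on the board available, while disabling the whole board removes every component on it from the domain.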






Product
Sun Fire 6800 Server
Sun Fire 4810 Server
Sun Fire 4800 Server
Sun Fire 3800 Server
Sun Fire E4900
Sun Fire E6900






IOPOST, IB6/P0, Failed, DMA, Functional, Controller



Attachments
This solution has no attachment
  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.