Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-75-1010784.1
Update Date:2011-05-30
Keywords:

Solution Type  Troubleshooting Sure

Solution  1010784.1 :   Sun Fire[TM] Servers (12K/15K/E20K/E25K): Verifying System Integrity After a Hardware Replacement  


Related Items
  • Sun Fire E25K Server
  • Sun Fire E20K Server
  • Sun Fire 12K Server
  • Sun Fire 15K Server
Related Categories
  • GCS>Sun Microsystems>Servers>High-End Servers

PreviouslyPublishedAs
214909


Applies to:

Sun Fire 12K Server
Sun Fire 15K Server
Sun Fire E20K Server
Sun Fire E25K Server
All Platforms

Purpose

After a FRU (field-replaceable unit) has been replaced, run POST (power-on self-test) at a high level to verify system integrity and to confirm both the diagnosis and the status of the replaced FRU.

Last Review Date

May 10, 2011

Instructions for the Reader

A Troubleshooting Guide is provided to assist in debugging a specific issue. When possible, diagnostic tools are included in the document to assist in troubleshooting.

Troubleshooting Details

Some hardware-related failures, such as multibit ECC memory errors, are hard to diagnose.

To verify that the right part has been replaced, run POST at a high level after the part replacement. This procedure applies to Sun Fire[TM] 12K/15K/E20K/E25K servers.

Perform the following steps:

1. Choose the POST level:

  • Level: 32
    • What: Tests all locations but keeps the pattern count low.
    • When: Minimum level that should be run after a hardware replacement.
  • Level: 64
    • What: Tests all locations with all patterns, except for memory and external cache.
    • When: Recommended default level after a hardware replacement.
  • Level: 96
    • What: Tests all locations with all patterns, including the full memory and external cache tests.
    • When: Recommended level after a memory or Uniboard replacement, or if it is unclear whether the problem lies in memory or in the Uniboard. Also use this level for L2 SRAM related problems.
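The level choice above can be sketched as a small shell helper. This is only an illustration: the function name choose_post_level and the category names (memory, uniboard, l2sram, unknown, quick) are hypothetical and are not part of SMS. It assumes a POSIX shell.

```shell
# Hypothetical helper (not an SMS command): map the kind of repair
# to one of the POST levels listed above.
choose_post_level() {
    case "$1" in
        # Memory or Uniboard replaced, unclear memory/Uniboard fault,
        # or an L2 SRAM related problem: run the full tests.
        memory|uniboard|l2sram|unknown) echo 96 ;;
        # Caller explicitly asks for the minimum acceptable level.
        quick) echo 32 ;;
        # Any other hardware replacement: recommended default.
        *) echo 64 ;;
    esac
}
```

For example, `choose_post_level memory` selects level 96, while an unrecognised category falls back to the recommended default of 64.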

2. Set the POST level:

Edit the following file for the domain you want to test (replace [A-R] with the domain letter):
/etc/opt/SUNWSMS/config/[A-R]/.postrc
Add or change the level entry so that it reflects the POST level you want to run, for example "level 64".
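As a rough sketch of this step, the level entry could be set from a script instead of an editor. The function set_post_level is hypothetical, not an SMS tool. It assumes a POSIX shell and a grep that understands [[:space:]], and that the "level N" line format shown above is the only form of the entry in the file.

```shell
# Hypothetical helper: set_post_level FILE LEVEL
# Replaces any existing "level" line in FILE with "level LEVEL",
# leaving all other lines untouched. Creates FILE if it is missing.
set_post_level() {
    postrc=$1
    level=$2
    # Accept only a non-empty string of digits as the level.
    case "$level" in
        ''|*[!0-9]*) echo "level must be a number" >&2; return 1 ;;
    esac
    tmp="${postrc}.tmp.$$"
    : > "$tmp"
    if [ -f "$postrc" ]; then
        rc=0
        # Copy everything except an existing level entry.
        # grep exits 1 when nothing is left to copy; that is not an error.
        grep -v '^[[:space:]]*level[[:space:]]' "$postrc" > "$tmp" || rc=$?
        if [ "$rc" -gt 1 ]; then rm -f "$tmp"; return 1; fi
    fi
    echo "level $level" >> "$tmp"
    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp" > "$postrc"
    rm -f "$tmp"
}
```

A call for domain A might look like `set_post_level /etc/opt/SUNWSMS/config/A/.postrc 64`. Taking a copy of the file first makes it easy to restore the original settings afterwards.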

3. Run the test and start the domain:

setkeyswitch -d [A-R] on     (Pick the domain letter you want to test.)
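Because the domain letter must be substituted into the command above, a small guard against typing errors can help. The sketch below only builds and prints the command and does not run it. The function name keyswitch_on_cmd is hypothetical; the setkeyswitch syntax is the one shown in this step.

```shell
# Hypothetical helper: print the setkeyswitch command for one domain,
# refusing anything that is not a single letter from A to R.
keyswitch_on_cmd() {
    case "$1" in
        A|B|C|D|E|F|G|H|I|J|K|L|M|N|O|P|Q|R)
            echo "setkeyswitch -d $1 on" ;;
        *)
            echo "domain must be a single letter A-R" >&2
            return 1 ;;
    esac
}
```

Running `keyswitch_on_cmd A` prints the command for domain A, which can be reviewed before it is executed on the system controller.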

4. Check the POST results and remove the .postrc setting for this test:

After the test domain has been brought up, check the POST log in:

/var/opt/SUNWSMS/adm/[A-R]/post
Check whether any failures were reported and whether all the hardware has been included in the domain. If problems are reported, follow-up troubleshooting steps might be required.
If no problems are reported, edit the /etc/opt/SUNWSMS/config/[A-R]/.postrc file for the domain you tested.
Remove the level entry you created in that file before the test.
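The cleanup can also be scripted. The sketch below is an assumption-laden illustration: clear_post_level is a hypothetical helper, not an SMS tool, and it removes every "level" line from the file. It is therefore only appropriate when the file had no level entry before the test. It assumes a POSIX shell and a grep that understands [[:space:]].

```shell
# Hypothetical helper: clear_post_level FILE
# Removes the "level" entry from FILE and keeps all other lines.
# Does nothing if FILE does not exist.
clear_post_level() {
    postrc=$1
    [ -f "$postrc" ] || return 0
    tmp="${postrc}.tmp.$$"
    rc=0
    # grep exits 1 when no lines remain; only higher codes are errors.
    grep -v '^[[:space:]]*level[[:space:]]' "$postrc" > "$tmp" || rc=$?
    if [ "$rc" -gt 1 ]; then rm -f "$tmp"; return 1; fi
    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp" > "$postrc"
    rm -f "$tmp"
}
```

For domain A this might be called as `clear_post_level /etc/opt/SUNWSMS/config/A/.postrc`. If the file carried its own level entry before the test, restore that original line by hand instead.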

End of procedure



Updated by the ESG Knowledge Content Team

For additional information, refer to the guidelines on how long POST takes to run in
the following document: http://panacea.uk.oracle.com/twiki/pub/Products/StarcatPOST/post.pdf

We used to have a table of POST times for various configurations on gavroche.france, but that has vanished into the Oracle network void for now.

post, hpost, testing, fru, replacement, failure, test, keyswitch, starcat, 15k, 12k, 20k, 25k

Previously Published As
76912

Change History

Date: 2006-02-15
User Name: 25440
Action: Approved
Comment: Oops! I missed those. Thanks, guys!
Version: 5

Date: 2006-02-15
User Name: 25440
Action: Accept
Comment:
Version: 0

Date: 2006-02-15
User Name: 88097
Action: Approved
Comment: Correct, internal URLs were public.


Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.