Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition
Solution Type: Problem Resolution Sure

Solution 1003436.1: Sun Fire[TM] 12K/15K: Post failure on Domain Advanced Tests
Previously Published As
204819

Symptoms
During a high-level HPOST (>64) on a domain with 72 or more processors, a 12K/15K CPU may fail HPOST with the following failure:

Proc SB0/P0 timed out on test Domain Advanced Tests id=0x6F. Test Failed.
FAIL Proc SB0/P0: test_seq_cwd(): failed out of config on timeout
Primary service FRU is Slot SB0.
Proc SB0/P1: Not Good and poll_busy
Proc SB0/P0: EpiDomAdvR1_sc_tfunc(): Master failed
Proc SB0/P1: EpiDomAdvR1_sc_tfunc(): Slave failed
Proc SB0/P0: clear_lpost_mastership(): Called for non-good CPU
Proc SB0/P0: summarize_test_state(): flags !(SC_CODE & LASTTEST): 0x0000
EpiDomAdvR1_sc_tfunc(): Failures occurred. Stage repeat required
Repeating all LPOST stages (1)...
Dstop/Recordstop/Timeout recovery (1); rerun starting at: cpu_lpost

The tests will rerun with the implicated cpu deconfigured and the rerun

NOTE: Similar HPOST behavior has been found to occur on large Jaguar configurations with SMS 1.4.1. Apply the same workaround outlined in this document.

Resolution
Confirm the number of CPUs in the domain. If there are 72 or more, then this is most likely Bug ID 4818581. The bug states the following:

"When running Hpost level 64 or above, the Domain Advanced Tests (stage
This is not likely to be bad hardware unless this workaround or patch fix
To test the hardware sanity, take the board which failed this test and
This bug was fixed in SMS 1.3 HPOST patch 114608-02 and should be
SMS 1.2 does not have the fix, therefore use the provided workaround

Relief/Workaround
Extend the test timeout period by adding the following to the /var/opt/SUNWSMS/SMS/etc/platform/.postrc file:

poll_timeout_mult 4 # Bugid 4818581

Be aware of the downside of this setting which increases the timeouts by a

Product
Sun Fire 15K Server
Sun Fire 12K Server

Internal Comments
Related to this are several bugs: 4454842, 4775888, 4818581, 4851017, 6307312
Also see Troubleshooting Article <Document: 1004888.1> for another reason why this workaround may be applicable.
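
The triage rule under Resolution and the edit under Relief/Workaround can be sketched as a small Python helper. This is an illustration only, not part of SMS: the function names are hypothetical, the CPU count is assumed to be supplied by the caller, and the code assumes that `#` starts a comment in .postrc, as the workaround line suggests. It checks a saved HPOST log for the timeout signature quoted under Symptoms and appends the workaround line only if no poll_timeout_mult directive is already present.

```python
import re
from pathlib import Path

# Failure signature quoted in the Symptoms section.
TIMEOUT_RE = re.compile(r"timed out on test Domain Advanced Tests id=0x6F")

# Line given in the Relief/Workaround section.
WORKAROUND_LINE = "poll_timeout_mult 4 # Bugid 4818581"


def likely_bug_4818581(post_log: str, cpu_count: int) -> bool:
    """Hypothetical helper: True when the log shows the timeout
    signature and the domain has 72 or more CPUs."""
    return cpu_count >= 72 and TIMEOUT_RE.search(post_log) is not None


def ensure_workaround(postrc: Path) -> bool:
    """Hypothetical helper: append the workaround line unless a
    poll_timeout_mult directive already exists.

    Returns True if the file was modified.
    """
    text = postrc.read_text() if postrc.exists() else ""
    for line in text.splitlines():
        # Assumption: '#' starts a comment, so only the part before it counts.
        directive = line.split("#", 1)[0].split()
        if directive and directive[0] == "poll_timeout_mult":
            return False  # leave any existing setting untouched
    if text and not text.endswith("\n"):
        text += "\n"
    postrc.write_text(text + WORKAROUND_LINE + "\n")
    return True
```

On a system controller this would be called as ensure_workaround(Path("/var/opt/SUNWSMS/SMS/etc/platform/.postrc")). Leaving an existing poll_timeout_mult line unchanged is a deliberate choice, so that a value set earlier by an administrator is not silently overridden.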
POST, HPOST, CPU, proc, 64, 96, 127, level, jaguar

Previously Published As
72263

Change History
Date: 2005-09-22
User Name: 95826
Action: Approved
Comment: - verified metadata - changed review date to 2006-09-22 - checked for TM - none added - checked audience : contract Publishing
Version: 3

Date: 2005-09-22
User Name: 95826
Action: Accept
Comment:
Version: 0

Date: 2005-09-22
User Name: 101037
Action: Approved
Comment: Addition looks fine
Version: 0

Product_uuid
29e4659c-0a18-11d6-9fa1-e67bbc033df8|Sun Fire 15K Server
077fd4c5-df8f-4320-ad69-7d01603a674d|Sun Fire 12K Server

Attachments
This solution has no attachment