Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition
Solution Type: Problem Resolution Sure

Solution 1007933.1: Dynamic Reconfiguration (DR) failure on Sun Fire[TM] 4800 running Oracle
Previously Published As: 210942

Symptoms

Dynamic Reconfiguration (DR) failed while attempting to remove a known-good uniboard that was being replaced under a Field Change Order.

Configuration: Sun Fire[TM] 4800 running Solaris[TM] 9 (KU 117171-05) and Oracle 9i, with firmware 5.18.1. The affected domain has two boards (SB0 and SB2). For completeness: an additional single-board domain was unaffected.

    # cfgadm -c disconnect N0.SB2
    cfgadm: Hardware specific failure: unconfigure N0.SB2: I/O error: /ssm@0,0/memory-controller@b,400000

<or>

    # cfgadm -v -c disconnect N0.SB2
    request delete capacity (4 cpus)
    request delete capacity (1048576 pages)
    request delete capacity N0.SB2 done
    request offline SUNW_cpu/cpu8
    request offline SUNW_cpu/cpu9
    request offline SUNW_cpu/cpu10
    request offline SUNW_cpu/cpu11
    request offline SUNW_cpu/cpu8 done
    request offline SUNW_cpu/cpu9 done
    request offline SUNW_cpu/cpu10 done
    request offline SUNW_cpu/cpu11 done
    unconfigure N0.SB2
    notify remove SUNW_cpu/cpu8
    notify remove SUNW_cpu/cpu9
    notify remove SUNW_cpu/cpu10
    notify remove SUNW_cpu/cpu11
    notify remove SUNW_cpu/cpu8 done
    notify remove SUNW_cpu/cpu9 done
    notify remove SUNW_cpu/cpu10 done
    notify remove SUNW_cpu/cpu11 done
    cfgadm: Hardware specific failure: unconfigure N0.SB2: I/O error: /ssm@0,0/memory-controller@b,400000
    #

No processes were bound to a specific processor (pbind). No real-time processes were running. No Sun[TM] Cluster was involved. The system was functioning normally, running Oracle in a Quality Assurance environment.
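As a sanity check on the verbose output, the page count in the "request delete capacity" line can be converted to KBytes and compared with the board's memory size. The following is a minimal sketch, assuming the 8 KB base page size of UltraSPARC systems (confirm with pagesize(1) on the domain). PAGE_KB and pages_to_kb are illustrative names and are not part of cfgadm.

```shell
#!/bin/sh
# Convert the page count from a "request delete capacity (N pages)" line
# into KBytes.
# ASSUMPTION: 8 KB base page size; verify with pagesize(1) on the domain.
PAGE_KB=8

pages_to_kb() {
    # $1: number of pages reported by cfgadm -v
    echo $(( $1 * PAGE_KB ))
}

# Page count taken from the failing disconnect attempt in this article.
pages_to_kb 1048576
```

The result, 8388608 KBytes, matches the total that cfgadm -alv reports for N0.SB2::memory, so the delete request covers the board's entire memory.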
    # cfgadm -al N0.SB2
    Ap_Id            Type      Receptacle   Occupant     Condition
    N0.SB2           CPU_V2    connected    configured   ok
    N0.SB2::cpu0     cpu       connected    configured   ok
    N0.SB2::cpu1     cpu       connected    configured   ok
    N0.SB2::cpu2     cpu       connected    configured   ok
    N0.SB2::cpu3     cpu       connected    configured   ok
    N0.SB2::memory
    #

There is no permanent memory on this (SB2) uniboard:

    # cfgadm -alv |grep memory
    N0.SB0::memory connected configured ok base address 0x0, 8388608 KBytes total, 3054144 KBytes permanent
    Mar 2 13:13 memory n /devices/ssm@0,0:N0.SB0::memory
    N0.SB2::memory connected configured ok base address 0x2000000000, 8388608 KBytes total
    Mar 2 13:13 memory n /devices/ssm@0,0:N0.SB2::memory
    #

Resolution

There is currently no resolution or workaround; the best that can be done is to minimize database downtime. The DBA had allocated the majority of physical memory to Oracle.

With Oracle shut down, DR works fine (note that only minimal Oracle listener processes are running):

    # ps -ef |grep ora
    root 29930 29922 0 11:03:28 pts/6 0:00 grep ora
    oracle 9407 1 0 Mar 09 ? 0:05 /dbvol01/oracle/product/9.2.0/bin/tnslsnr 1522 -inherit
    oracle 9413 9409 0 Mar 09 ? 0:01 /dbvol01/oracle/product/9.2.0/bin/dbsnmp
    oracle 9409 1 0 Mar 09 ? 0:00 /bin/sh /dbvol01/oracle/product/9.2.0/bin/dbsnmpwd
    oracle 9398 1 0 Mar 09 ? 5:31 /dbvol01/oracle/product/9.2.0/bin/tnslsnr 1521 -inherit
    #
    # cfgadm -v -c disconnect N0.SB2
    request delete capacity (4 cpus)
    request delete capacity (1048576 pages)
    request delete capacity N0.SB2 done
    request offline SUNW_cpu/cpu8
    request offline SUNW_cpu/cpu9
    request offline SUNW_cpu/cpu10
    request offline SUNW_cpu/cpu11
    request offline SUNW_cpu/cpu8 done
    request offline SUNW_cpu/cpu9 done
    request offline SUNW_cpu/cpu10 done
    request offline SUNW_cpu/cpu11 done
    unconfigure N0.SB2
    unconfigure N0.SB2 done
    notify remove SUNW_cpu/cpu8
    notify remove SUNW_cpu/cpu9
    notify remove SUNW_cpu/cpu10
    notify remove SUNW_cpu/cpu11
    notify remove SUNW_cpu/cpu8 done
    notify remove SUNW_cpu/cpu9 done
    notify remove SUNW_cpu/cpu10 done
    notify remove SUNW_cpu/cpu11 done
    disconnect N0.SB2
    disconnect N0.SB2 done
    poweroff N0.SB2
    poweroff N0.SB2 done
    unassign N0.SB2 skipped
    #

The uniboard in question was then replaced and brought back into the environment:

    # cfgadm -v -c connect N0.SB2
    assign N0.SB2
    assign N0.SB2 done
    poweron N0.SB2
    poweron N0.SB2 done
    test N0.SB2
    test N0.SB2 done
    connect N0.SB2
    connect N0.SB2 done
    # mpstat
    CPU minf mjf xcal intr ithr  csw icsw migr smtx srw syscl usr sys  wt idl
      0  165   1  238  296  181  358   16   34   49   0  1483   6   3   4  86
      1   70   0  573  209  192  339   15   31   55   0   993   5   4   3  88
      2   71   1 1433   37   19  312   14   29   50   0   387   7   6   3  84
      3   70   1 1451   18    1  304   13   29   49   0   317   7   5   4  84
    #
    # cfgadm -v -c configure N0.SB2
    configure N0.SB2
    configure N0.SB2 done
    notify online SUNW_cpu/cpu8
    notify online SUNW_cpu/cpu9
    notify online SUNW_cpu/cpu10
    notify online SUNW_cpu/cpu11
    notify add capacity (4 cpus)
    notify add capacity (1048576 pages)
    notify add capacity N0.SB2 done
    #
    # mpstat
    CPU minf mjf xcal intr ithr  csw icsw migr smtx srw syscl usr sys  wt idl
      0  165   1  238  296  181  358   16   34   49   0  1483   6   3   4  86
      1   70   0  573  209  192  339   15   31   55   0   993   5   4   3  88
      2   71   1 1433   37   19  312   14   29   50   0   387   7   6   3  84
      3   70   1 1451   18    1  304   13   29   49   0   317   7   5   4  84
      8  278   0 1325    4    1   56    1   12   16   0   355   3   2   5  90
      9   32   0   59   45   42   23    0   10    9   0    62   0   1   0  99
     10   65   0  304    7    5   32    0   11   12   0   136   1   1   1  97
     11   60   0  284    4    1   19    0    7   11   0    98   0   2   1  97
    #
    # cfgadm -alv N0.SB2
    Ap_Id Receptacle Occupant Condition Information
    When Type Busy Phys_Id
    N0.SB2 connected configured ok powered-on, assigned
    Apr 4 11:36 CPU_V2 n /devices/ssm@0,0:N0.SB2
    N0.SB2::cpu0 connected configured ok cpuid 8, speed 900 MHz, ecache 8 MBytes
    Apr 4 11:36 cpu n /devices/ssm@0,0:N0.SB2::cpu0
    N0.SB2::cpu1 connected configured ok cpuid 9, speed 900 MHz, ecache 8 MBytes
    Apr 4 11:36 cpu n /devices/ssm@0,0:N0.SB2::cpu1
    N0.SB2::cpu2 connected configured ok cpuid 10, speed 900 MHz, ecache 8 MBytes
    Apr 4 11:36 cpu n /devices/ssm@0,0:N0.SB2::cpu2
    N0.SB2::cpu3 connected configured ok cpuid 11, speed 900 MHz, ecache 8 MBytes
    Apr 4 11:36 cpu n /devices/ssm@0,0:N0.SB2::cpu3
    N0.SB2::memory connected configured ok base address 0x2000000000, 8388608 KBytes total
    Apr 4 11:36 memory n /devices/ssm@0,0:N0.SB2::memory
    #

Additional Information

Notes on System Controller activity during physical replacement:

    root@my-server:/tmp# telnet 192.168.0.25
    Trying 192.168.0.25...
    Connected to 192.168.0.25.
    Escape character is '^]'.
    System Controller 'my-server':
    Type 0 for Platform Shell
    Type 1 for domain A console
    Type 2 for domain B console
    Type 3 for domain C console
    Type 4 for domain D console
    Input: 0
    Enter Password:
    Platform Shell
    my-server:SC> poweron SB2
    /N0/SB2: powered on
    my-server:SC>
    my-server:SC> showboard -p version
    Component   Compatible  Version
    ---------   ----------  -------
    SSC0        Reference   5.18.1 Build_01
    /N0/IB6     Yes         5.18.1 Build_01
    /N0/SB0     Yes         5.18.1 Build_01
    /N0/SB2     Yes         5.18.1 Build_01
    /N0/IB8     Yes         5.18.1 Build_01
    /N0/SB4     Yes         5.18.1 Build_01
    my-server:SC>
    my-server:SC> showboards
    Slot     Pwr  Component Type             State      Status      Domain
    ----     ---  --------------             -----      ------      ------
    SSC0     On   System Controller          Main       Passed      -
    SSC1     On   Present                    Spare      -           -
    ID0      On   Sun Fire 4800 Centerplane  -          OK          -
    PS0      On   A153 Power Supply          -          OK          -
    PS1      On   A153 Power Supply          -          OK          -
    PS2      On   A153 Power Supply          -          OK          -
    FT0      On   Fan Tray                   Low Speed  OK          -
    FT1      On   Fan Tray                   Low Speed  OK          -
    FT2      On   Fan Tray                   Low Speed  OK          -
    RP0      On   Repeater Board             -          OK          -
    RP2      On   Repeater Board             -          OK          -
    /N0/SB0  On   CPU Board V2               Active     Passed      A
    /N0/SB2  On   CPU Board V2               Assigned   Under Test  A
    /N0/SB4  On   CPU Board V2               Active     Passed      C
    /N0/IB6  On   PCI I/O Board              Active     Passed      A
    /N0/IB8  On   PCI I/O Board              Active     Passed      C
    my-server:SC>

Product
Sun Fire 3800 Server
Sun Fire 6800 Server
Sun Fire 4810 Server
Sun Fire 4800 Server
Oracle9i Database Release 1 (9.0.1)

ISM, DR, dynamic reconfiguration, ipcs, shared memory

Previously Published As: 81331

Change History
Date: 2005-04-28 User Name: 25440 Action: Approved Comment: Publishing. Version: 6
Date: 2005-04-28 User Name: 25440 Action: Accept Comment: Version: 0

Attachments
This solution has no attachment
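The permanent-memory check from the Symptoms section can be scripted so that boards holding permanent memory are identified before a DR attempt. The following is a minimal sketch, assuming the line layout of the "cfgadm -alv | grep memory" output shown in this article. permanent_memory is a hypothetical helper name, and the sample text is copied from that listing rather than read from a live system.

```shell
#!/bin/sh
# Print "<Ap_Id> <KBytes>" for every memory attachment point that reports
# permanent memory. Reads `cfgadm -alv | grep memory` style text on stdin.
# ASSUMPTION: the Information column contains "<N> KBytes permanent",
# as in the listing captured in this article.
permanent_memory() {
    awk '/KBytes permanent/ {
        for (i = 3; i <= NF; i++)
            if ($i == "permanent" && $(i - 1) == "KBytes")
                print $1, $(i - 2)
    }'
}

# Sample input copied from the article (first line of each entry only).
sample='N0.SB0::memory connected configured ok base address 0x0, 8388608 KBytes total, 3054144 KBytes permanent
N0.SB2::memory connected configured ok base address 0x2000000000, 8388608 KBytes total'

printf '%s\n' "$sample" | permanent_memory
```

On a live domain the function would be fed with: cfgadm -alv | grep memory | permanent_memory. A board that does not appear in the output, such as N0.SB2 here, reports no permanent memory.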