Information in this document applies to any platform.
Symptoms
Normally the devices (12 physical disks) are seen at the OS level under the path /dev/sdX, with the following slot-to-device mapping:
slot# device
----------------------
0 sda
1 sdb
2 sdc
3 sdd
4 sde
5 sdf
6 sdg
7 sdh
8 sdi
9 sdj
10 sdk
11 sdl
Now, when the disk in slot 4 was replaced, the OS did not reuse the original device letter, so the mapping is now:
slot# device
----------------------
0 sda
1 sdb
2 sdc
3 sdd
4 sdac
5 sde
6 sdf
7 sdg
8 sdh
9 sdi
10 sdj
11 sdk
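The live LUN-to-device mapping can be cross-checked on the cell with CellCLI. The following is a sketch using the standard LIST LUN syntax:
cellcli> list lun attributes name, deviceName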
The sundiag report will show the incorrect mapping:-
SUNDIAG
LUN:
------------------
name: 0_4
cellDisk:
deviceName: /dev/sdac <<<< Should be /dev/sde
Cause
A disk could not be added to the diskgroup because the griddisks do not exist. The physical disk was replaced, but the new griddisks have not been created, so ASM cannot discover them.
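This can be confirmed on the cell: listing the griddisks for the replaced celldisk should return nothing. A sketch, assuming the celldisk name used in this example:
cellcli> list griddisk where celldisk=CD_04_gfsdw2cel02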
Solution
1. Drop the celldisk that is in predictive failure mode.
In this case it is the disk in slot 4.
EXAMPLE:-
cellcli> drop celldisk CD_04_gfsdw2cel02 force
Remember to substitute the correct celldisk name in the command above.
The FORCE option also drops the associated griddisks.
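The current status of the celldisk can be confirmed before the drop. A sketch using the standard LIST CELLDISK syntax:
cellcli> list celldisk CD_04_gfsdw2cel02 detail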
2. Create the celldisk
EXAMPLE:-
cellcli> create celldisk CD_04_gfsdw2cel02 lun=0_4
Note that the celldisk will be using /dev/sdac, but that is OK; it is the current path returned by the OS.
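To verify the new celldisk and the device path it picked up, a hedged check (attribute names per standard CellCLI):
cellcli> list celldisk CD_04_gfsdw2cel02 attributes name, deviceName, status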
3. Check the device paths with the lsscsi command:
[0:2:0:0] disk LSI MR9261-8i 2.90 /dev/sda
[0:2:1:0] disk LSI MR9261-8i 2.90 /dev/sdb
[0:2:2:0] disk LSI MR9261-8i 2.90 /dev/sdc
[0:2:3:0] disk LSI MR9261-8i 2.90 /dev/sdd
[0:2:4:0] disk LSI MR9261-8i 2.90 /dev/sdac <------------------------ here
[0:2:5:0] disk LSI MR9261-8i 2.90 /dev/sde
[0:2:6:0] disk LSI MR9261-8i 2.90 /dev/sdf
[0:2:7:0] disk LSI MR9261-8i 2.90 /dev/sdg
[0:2:8:0] disk LSI MR9261-8i 2.90 /dev/sdh
[0:2:9:0] disk LSI MR9261-8i 2.90 /dev/sdi
[0:2:10:0] disk LSI MR9261-8i 2.90 /dev/sdj
[0:2:11:0] disk LSI MR9261-8i 2.90 /dev/sdk
At this point the status of the celldisk will be normal.
4. Create the griddisks
EXAMPLE:-
cellcli> create griddisk DATA_GFSDW2_CD_04_gfsdw2cel02 celldisk=CD_04_gfsdw2cel02, size=423G
cellcli> create griddisk RECO_GFSDW2_CD_04_gfsdw2cel02 celldisk=CD_04_gfsdw2cel02, size=105.6875G
cellcli> create griddisk DBFS_DG_CD_04_gfsdw2cel02 celldisk=CD_04_gfsdw2cel02
Remember to substitute the correct names and sizes in the commands above.
Now the griddisks will be normal.
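Their status can be verified with LIST GRIDDISK. A sketch, assuming the names used above:
cellcli> list griddisk attributes name, status, size where celldisk=CD_04_gfsdw2cel02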
5. At this point the ASM instance will discover the new disks as candidates.
EXAMPLE:-
sql> select path, header_status, group_number from v$asm_disk where path like '%CD_04_gfsdw2cel02';
It should return 3 griddisks with header_status='CANDIDATE':
o/192.168.10.4/DATA_GFSDW2_CD_04_gfsdw2cel02
o/192.168.10.4/RECO_GFSDW2_CD_04_gfsdw2cel02
o/192.168.10.4/DBFS_DG_CD_04_gfsdw2cel02
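If the griddisks do not appear as CANDIDATE, verify that the ASM discovery string covers the cell paths (on Exadata it typically includes the pattern 'o/*/*'). A sketch from SQL*Plus:
sql> show parameter asm_diskstring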
6. Add each disk to its diskgroup
EXAMPLE:-
sql> alter diskgroup dbfs_dg add disk 'o/192.168.10.4/DBFS_DG_CD_04_gfsdw2cel02';
sql> alter diskgroup DATA_GFSDW2 add disk 'o/192.168.10.4/DATA_GFSDW2_CD_04_gfsdw2cel02';
sql> alter diskgroup reco_gfsdw2 add disk 'o/192.168.10.4/RECO_GFSDW2_CD_04_gfsdw2cel02';
Remember to substitute the correct names in the commands above.
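Optionally, the rebalance triggered by the ADD DISK can be sped up by raising the power limit on the diskgroup. A sketch, with the power value chosen only for illustration:
sql> alter diskgroup DATA_GFSDW2 rebalance power 8;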
7. Run the following queries:
sql> select * from gv$asm_operation where state='RUN';
sql> select group_number,failgroup,mode_status,count(*) from v$asm_disk group by group_number,failgroup,mode_status;
In this case it returned 12 disks for the DATA and RECO diskgroups, plus 10 for DBFS_DG, in each failgroup.
8. Wait for the rebalance to complete.
sql> select * from gv$asm_operation where state='RUN';
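The EST_MINUTES column of gv$asm_operation gives a rough estimate of the time remaining. A sketch:
sql> select inst_id, group_number, operation, state, power, est_minutes from gv$asm_operation;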
9. In case of errors, collect the ms-odl* log files from $ADR_BASE/diag/asm/cell/gfsdw2cel02/trace on the cell.
Attachments
This solution has no attachment