Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-72-1468477.1
Update Date:2012-06-18
Keywords:

Solution Type: Problem Resolution Sure Solution

Solution 1468477.1: Hard Disk Mapping Incorrect After Disk Replacement - Celldisk Was Not Dropped Before Replacing the Physical Disk


Related Items
  • Exadata Database Machine X2-2 Hardware
Related Categories
  • PLA-Support>Sun Systems>x64>Engineered Systems HW>SN-x64: EXADATA




In this Document
Symptoms
Cause
Solution


Created from <SR 3-5709766931>

Applies to:

Exadata Database Machine X2-2 Hardware - Version Not Applicable and later
Information in this document applies to any platform.
After disk replacement, the status is as follows:

CellCLI> list lun attributes name,celldisk,status

EXAMPLE:
0_3 CD_03_gfsdw2cel02 normal
0_4 normal                        <-- no celldisk associated with this LUN

Symptoms

Normally the 12 physical disks are presented at the OS level as /dev/sdX devices, with the following slot-to-device mapping:

slot#  device
----------------------
0 sda
1 sdb
2 sdc
3 sdd
4 sde
5 sdf
6 sdg
7 sdh
8 sdi
9 sdj
10 sdk
11 sdl

When the disk in slot 4 was replaced, the OS did not reuse the original device letter, so the mapping became:

slot#  device
----------------------
0 sda
1 sdb
2 sdc
3 sdd
4 sdac
5 sde
6 sdf
7 sdg
8 sdh
9 sdi
10 sdj
11 sdk
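The shift can be checked programmatically. As an illustrative sketch (not part of the official procedure), the following Python snippet compares the before/after slot maps from the two tables above and reports every slot whose device name changed:

```python
# Illustrative sketch only: compare the slot-to-device maps before and
# after the replacement (taken from the tables above) and report every
# slot whose /dev/sdX name changed.

before = {0: "sda", 1: "sdb", 2: "sdc", 3: "sdd", 4: "sde",  5: "sdf",
          6: "sdg", 7: "sdh", 8: "sdi", 9: "sdj", 10: "sdk", 11: "sdl"}
after  = {0: "sda", 1: "sdb", 2: "sdc", 3: "sdd", 4: "sdac", 5: "sde",
          6: "sdf", 7: "sdg", 8: "sdh", 9: "sdi", 10: "sdj", 11: "sdk"}

changed = {slot: (before[slot], after[slot])
           for slot in before if before[slot] != after[slot]}

# Slot 4 got a brand-new name (sdac); slots 5-11 each shifted down a letter.
print(changed)
```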

SUNDIAG will show the incorrect mapping:

SUNDIAG
LUN:
------------------
name:               0_4
cellDisk:          
deviceName:         /dev/sdac  <<<< Should be /dev/sde

Cause

The disk could not be added to the diskgroup because the griddisks do not exist. The physical disk was replaced without first dropping the celldisk, so the new griddisks were never created and ASM cannot discover them.

Solution

1. Drop the celldisk that is in predictive failure mode.

In this case it is the disk in slot 4.

EXAMPLE:
CellCLI> drop celldisk CD_04_gfsdw2cel02 force

Replace the celldisk name with the correct one for your environment.

The FORCE option also drops the associated griddisks.

2. Create the celldisk.

EXAMPLE:
CellCLI> create celldisk CD_04_gfsdw2cel02 lun=0_4

Note that the celldisk will be using /dev/sdac, but that is OK: it is the current path returned by the OS.

3. Check the mapping with the lsscsi command.

[0:2:0:0]    disk    LSI      MR9261-8i        2.90  /dev/sda
[0:2:1:0]    disk    LSI      MR9261-8i        2.90  /dev/sdb
[0:2:2:0]    disk    LSI      MR9261-8i        2.90  /dev/sdc
[0:2:3:0]    disk    LSI      MR9261-8i        2.90  /dev/sdd
[0:2:4:0]    disk    LSI      MR9261-8i        2.90  /dev/sdac <------------------------ here
[0:2:5:0]    disk    LSI      MR9261-8i        2.90  /dev/sde
[0:2:6:0]    disk    LSI      MR9261-8i        2.90  /dev/sdf
[0:2:7:0]    disk    LSI      MR9261-8i        2.90  /dev/sdg
[0:2:8:0]    disk    LSI      MR9261-8i        2.90  /dev/sdh
[0:2:9:0]    disk    LSI      MR9261-8i        2.90  /dev/sdi
[0:2:10:0]   disk    LSI      MR9261-8i        2.90  /dev/sdj
[0:2:11:0]   disk    LSI      MR9261-8i        2.90  /dev/sdk
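The renamed device can also be spotted by parsing the lsscsi output. The following is a hypothetical helper, not part of the documented procedure; it flags any device name outside the usual sda..sdl range for a 12-disk cell:

```python
import re

# Hypothetical helper (illustration only): parse lsscsi output and report
# any SCSI target whose device name is outside the expected sda..sdl range.
LSSCSI_LINE = re.compile(
    r"\[(\d+):(\d+):(\d+):(\d+)\]\s+disk\s+\S+\s+\S+\s+\S+\s+(/dev/sd\w+)")

def odd_devices(lsscsi_output):
    """Return (target, device) pairs whose device name is unexpected."""
    expected = {"/dev/sd" + chr(ord("a") + n) for n in range(12)}
    flagged = []
    for line in lsscsi_output.splitlines():
        m = LSSCSI_LINE.search(line)
        if m:
            target, dev = int(m.group(3)), m.group(5)
            if dev not in expected:
                flagged.append((target, dev))
    return flagged

sample = """\
[0:2:4:0]    disk    LSI      MR9261-8i        2.90  /dev/sdac
[0:2:5:0]    disk    LSI      MR9261-8i        2.90  /dev/sde
"""
print(odd_devices(sample))  # the replaced disk on target 4 stands out
```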

At this point the status of the celldisk will be normal.

4. Create the griddisks.

EXAMPLE:
CellCLI> create griddisk DATA_GFSDW2_CD_04_gfsdw2cel02 celldisk=CD_04_gfsdw2cel02, size=423G
CellCLI> create griddisk RECO_GFSDW2_CD_04_gfsdw2cel02 celldisk=CD_04_gfsdw2cel02, size=105.6875G
CellCLI> create griddisk DBFS_DG_CD_04_gfsdw2cel02 celldisk=CD_04_gfsdw2cel02

Replace the griddisk names and sizes with the correct values for your environment.

Now the griddisks will be normal.

5. At this point the ASM instance will discover the new disks as candidates.

EXAMPLE:
SQL> select path,header_status,group_number from v$asm_disk where path like '%CD_04_gfsdw2cel02';

It should return three griddisks with header_status='CANDIDATE':

o/192.168.10.4/DATA_GFSDW2_CD_04_gfsdw2cel02
o/192.168.10.4/RECO_GFSDW2_CD_04_gfsdw2cel02
o/192.168.10.4/DBFS_DG_CD_04_gfsdw2cel02


6. Add each disk to its diskgroup.

EXAMPLE:
SQL> alter diskgroup DBFS_DG add disk 'o/192.168.10.4/DBFS_DG_CD_04_gfsdw2cel02';
SQL> alter diskgroup DATA_GFSDW2 add disk 'o/192.168.10.4/DATA_GFSDW2_CD_04_gfsdw2cel02';
SQL> alter diskgroup RECO_GFSDW2 add disk 'o/192.168.10.4/RECO_GFSDW2_CD_04_gfsdw2cel02';

Replace the diskgroup and griddisk names with the correct values for your environment.


7. Verify the disk counts:

SQL> select * from gv$asm_operation where state='RUN';
SQL> select group_number,failgroup,mode_status,count(*) from v$asm_disk group by group_number,failgroup,mode_status;

In this case the second query returned 12 disks for the DATA and RECO diskgroups, plus 10 for DBFS_DG, in each failgroup.

8. Wait for the rebalance to complete.

SQL> select * from gv$asm_operation where state='RUN';

9. If errors occur, collect the ms-odl* log files from $ADR_BASE/diag/asm/cell/gfsdw2cel02/trace on the cell.


Attachments
This solution has no attachment
  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.