Martins Blog

Trying to explain complex things in simple terms

Do not shut down ASM when dropping a disk

Posted by Martin Bach on December 1, 2009

I was doing some maintenance today on an ASM disk group (10.2.0.4.1 64-bit on RHEL 5.3). In a nutshell, we swapped disks in the FRA to increase the capacity of the disk group. One of the nice things is that this can be done online (but not where I work), and in a single command:

SQL> alter diskgroup fra drop disk 'FRA1' add disk 'ORCL:FRA3';

The keen observer will spot the use of ASMLib here. I have to say that ASMLib is a really nice tool. Exadata doesn’t use it, which makes you wonder whether it’s deprecated internally. The download site had RPMs for RHEL 5.4 listed, so we’re fine for some time.

Back to my problem. During the rebalance operation that followed the alter diskgroup command, the 2nd node of the cluster went down. Oops, that wasn’t great: the entries in v$asm_operation were gone immediately, even though there were a good 30 minutes remaining. A quick glance at the alert.log confirmed that the rebalance had indeed been aborted:

Shutting down instance: further logons disabled
Tue Dec  1 14:32:33 2009
Shutting down instance (immediate)
License high water mark = 8
Tue Dec  1 14:32:33 2009
SQL> ALTER DISKGROUP ALL DISMOUNT
Tue Dec  1 14:32:33 2009
ERROR: ORA-1013 thrown in ARB0 for group number 3
Tue Dec  1 14:32:33 2009
Errors in file /u01/app/oracle/product/10.2.0/asm_1/admin/+ASM/bdump/+asm1_arb0_14234.trc:
ORA-01013: user requested cancel of current operation
Tue Dec  1 14:32:33 2009
NOTE: stopping process ARB0

There were some other error messages which didn’t really matter. The key thing was that the operation was canceled. When the server came back up, I looked at v$asm_disk:

SQL> select mount_status, header_status,state,library from v$asm_disk where name = 'FRA1';

MOUNT_S HEADER_STATU STATE    LIBRARY
------- ------------ -------- ----------------------------------------------------------------
CACHED  MEMBER       DROPPING ASM Library - Generic Linux, version 2.0.4 (KABI_V2)

State DROPPING? That’s where it was left when the instance went down. OK, so why not drop the disk then, if it already thinks it’s being dropped:

SQL> alter diskgroup fra drop disk 'fra1';
alter diskgroup fra drop disk 'fra1'
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15071: ASM disk "FRA1" is already being dropped

There is usually a force option, too, but the documentation said that it was not allowed for external redundancy. I tried it anyway, but:

SQL> a  force
 1* alter diskgroup fra drop disk 'fra1' force
SQL> /
alter diskgroup fra drop disk 'fra1' force
*
ERROR at line 1:
ORA-15067: command or option incompatible with diskgroup redundancy

Fair enough, the documentation proved to be correct. So I had an ASM disk in my disk group that wasn’t playing ball. What to do now? I remembered from the oracle-l mailing list that a rebalance might help. Since no one was pushing me too hard, and I had raised it with Oracle Support to buy time, I tried the manual rebalance operation:

SQL> alter diskgroup FRA rebalance power 6;

Once that completed, I undropped the disks for disk group FRA. In hindsight, I should have checked first whether the disk “FRA1” had already disappeared; it may well have.

SQL> alter diskgroup FRA undrop disks;

This also triggers a rebalance; as always, check v$asm_operation for an estimated duration. Since I like to see the output of commands I don’t run every day, I have posted it here:

SQL>  alter diskgroup FRA undrop disks;

Diskgroup altered.

SQL> select * from v$asm_operation;

GROUP_NUMBER OPERA STAT      POWER     ACTUAL      SOFAR   EST_WORK   EST_RATE EST_MINUTES
------------ ----- ---- ---------- ---------- ---------- ---------- ---------- -----------
           3 REBAL RUN           6          1          0          1          0           0

SQL> /

GROUP_NUMBER OPERA STAT      POWER     ACTUAL      SOFAR   EST_WORK   EST_RATE EST_MINUTES
------------ ----- ---- ---------- ---------- ---------- ---------- ---------- -----------
           3 EXPEL RUN           0          0          0          0          0           0

SQL> /

no rows selected

Notice the EXPEL in the operation column.
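
For following an operation like this without repeatedly running select * by hand, a query along the following lines may be handy. It is only a sketch: it assumes a SYSDBA connection to the ASM instance, and the join to v$asm_diskgroup and the column alias are illustrative additions, not part of the session above.

```sql
-- Sketch: list ASM operations in progress, with the disk group name
-- instead of just its number. "No rows selected" means nothing is running
-- on this ASM instance.
select g.name as diskgroup,
       o.operation,
       o.state,
       o.power,
       o.sofar,
       o.est_work,
       o.est_minutes
  from v$asm_operation o
  join v$asm_diskgroup g
    on g.group_number = o.group_number;
```

Re-running it with / in SQL*Plus, as in the session above, shows the operation moving through its phases until no rows are returned.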

SQL> select group_number,path,total_mb,free_mb,header_status,state from v$asm_disk
 2  where name = 'FRA1';

no rows selected

SQL> select name from v$asm_disk;

NAME
------------------------------



DATA1
DATA2
DATA3
DATA4
DATA5
DATA6
DATA7
DATA8
FRA2
FRA3
LOGCTL1

14 rows selected.

No more mention of FRA1, great! Problem solved, next one please.
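
Given how this started, a quick look before stopping an ASM instance seems worthwhile. The two queries below are a minimal sketch of such a check, not an official procedure; they assume a SYSDBA connection to the ASM instance about to be shut down, and the list of states is taken from the values documented for v$asm_disk.

```sql
-- 1) Is a rebalance (or any other ASM operation) still in flight?
--    Any row returned here is a reason to wait before shutting down.
select group_number, operation, state, est_minutes
  from v$asm_operation;

-- 2) Is any disk part-way through being added or dropped?
--    FRA1 in the DROPPING state would have shown up here.
select group_number, name, path, header_status, state
  from v$asm_disk
 where state in ('ADDING', 'DROPPING', 'HUNG', 'FORCING');
```

If both return no rows, no add or drop should be pending on that instance.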

2 Responses to “Do not shut down ASM when dropping a disk”

  1. [...] 10-How to solve interrupted drop disk from diskgroup in ASM ? Martin Bach-Do not shut down ASM when dropping a disk [...]

  2. Kevin said

    Great post! This really helped me with a similar problem.
