This is in reply to some work I was supposed to carry out over the weekend on a DR database. It could have been really simple, but it turned out it wasn’t. As part of a major migration I moved a 2TB database from OCFS to ASM, which worked really nicely with a Data Guard switchover operation. Trouble now was that the standby databases still references online redo log files on the file system rather than ASM. I wanted to get rid of these annoying messages by updating the control file which was already rectified on the primary.
Under normal (non ASM) circumstances that doesn’t cause any trouble, but since ASM has the naming convention “+DISGROUPNAME/db_unique_name” the standby controlfile isn’t that easy to replace. Well actually it is but I needed some more careful thinking.
The steps are quite simple:
- Create a standby controlfile on the primary: alter database create standby controlfile as ‘/tmp/primary.ctl’;
- Ship this file to the standby database server
- Create a backup of the existing standby controlfile just in case (rman: backup current controlfile format ‘/tmp/backup.ctl’)
- Cancel managed recovery: alter database recover managed standby database cancel;
- Shut the standby database down: shutdown immediate (if you’re a good citizen you also defer log shipping from the primary to avoid any errors poluting the alert.log and potentially triggering an alarm)
- Startup nomount
- RMAN: restore controlfile from ‘/tmp/primary.ctl’
A remark at this place: the controlfiles parameter usually uses OMF syntax when in ASM. Prior to 10.2.0.4 you couldn’t restore the controlfile if the spfile parameter was an OMF name, but it seems you can do this now. Users of other versions please reset the control_files parameter in the spfile and bounce the instance.
And here the problems began! Let’s compare the systems I set up-orcl is the primary whereas dest is the (physical) standby.
These are the data files on the primary:
SQL> select name from v$datafile; NAME -------------------------------------------------------------------------------- +DATA/orcl/datafile/system.261.697925953 +DATA/orcl/datafile/sysaux.262.697926011 +DATA/orcl/datafile/undotbs1.263.697926061 +DATA/orcl/datafile/users.265.697926105 +DATA/orcl/datafile/streams_tbs.274.703447755
… and the standby:
NAME -------------------------------------------------------------------------------- +DATA/dest/datafile/system.327.705003099 +DATA/dest/datafile/sysaux.326.705003205 +DATA/dest/datafile/undotbs1.320.705003311 +DATA/dest/datafile/users.314.705003351 +DATA/dest/datafile/streams_tbs.273.705003371
I have followed my own instructions so far, here’s proof that I am using a spfile on the standby:
SQL> select count(isspecified),isspecified from v$spparameter group by isspecified; COUNT(ISSPECIFIED) ISSPEC ------------------ ------ 311 FALSE 159 TRUE SQL> show parameter control_files NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ control_files string +DATA/dest/controlfile/current .272.705003073, +FRA/dest/cont rolfile/current.277.705003073
Now let’s back up the controlfile on the standby, just for that warm fuzzy feeling that we can get it back if needed.
RMAN> backup current controlfile format '/tmp/standby.ctl'; Starting backup at 07.12.2009 18:29 Starting implicit crosscheck backup at 07.12.2009 18:29 allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=11 device type=DISK Crosschecked 8 objects Finished implicit crosscheck backup at 07.12.2009 18:29 Starting implicit crosscheck copy at 07.12.2009 18:29 using channel ORA_DISK_1 Crosschecked 2 objects Finished implicit crosscheck copy at 07.12.2009 18:29 searching for all files in the recovery area cataloging files... cataloging done List of Cataloged Files ======================= File Name: +fra/DEST/ARCHIVELOG/2009_12_07/thread_1_seq_40.286.705001903 File Name: +fra/DEST/ARCHIVELOG/2009_11_30/thread_1_seq_2.294.704315075 File Name: +fra/DEST/ARCHIVELOG/2009_11_23/thread_1_seq_1.279.703705239 File Name: +fra/DEST/ARCHIVELOG/2009_11_20/thread_1_seq_21.268.703447517 File Name: +fra/DEST/CONTROLFILE/Current.289.705002253 using channel ORA_DISK_1 channel ORA_DISK_1: starting full datafile backup set channel ORA_DISK_1: specifying datafile(s) in backup set including current control file in backup set channel ORA_DISK_1: starting piece 1 at 07.12.2009 18:29 channel ORA_DISK_1: finished piece 1 at 07.12.2009 18:29 piece handle=/tmp/standby.ctl tag=TAG20091207T182916 comment=NONE channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01 Finished backup at 07.12.2009 18:29 Starting Control File and SPFILE Autobackup at 07.12.2009 18:29 piece handle=+FRA/dest/autobackup/2009_12_07/s_705003768.322.705004161 comment=NONE Finished Control File and SPFILE Autobackup at 07.12.2009 18:29
After putting the instance into nomount mode, I restored the controlfile:
RMAN> restore controlfile from '/tmp/primary.ctl'; Starting restore at 07.12.2009 18:30 allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=136 device type=DISK channel ORA_DISK_1: copied control file copy output file name=+DATA/dest/controlfile/current.272.705003073 output file name=+FRA/dest/controlfile/current.277.705003073 Finished restore at 07.12.2009 18:30
See-no OMF related error! But the trouble is looming, if we ask RMAN to report the controlfile contents we get:
RMAN> report schema; Starting implicit crosscheck backup at 07.12.2009 18:31 allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=11 device type=DISK Crosschecked 8 objects Finished implicit crosscheck backup at 07.12.2009 18:31 Starting implicit crosscheck copy at 07.12.2009 18:31 using channel ORA_DISK_1 Crosschecked 2 objects Finished implicit crosscheck copy at 07.12.2009 18:31 searching for all files in the recovery area cataloging files... cataloging done List of Cataloged Files ======================= File Name: +fra/DEST/AUTOBACKUP/2009_12_07/s_705003768.322.705004161 File Name: +fra/DEST/ARCHIVELOG/2009_12_07/thread_1_seq_40.286.705001903 File Name: +fra/DEST/ARCHIVELOG/2009_12_07/thread_1_seq_43.317.705003599 File Name: +fra/DEST/ARCHIVELOG/2009_12_07/thread_1_seq_45.319.705003611 File Name: +fra/DEST/ARCHIVELOG/2009_12_07/thread_1_seq_44.320.705003613 File Name: +fra/DEST/ARCHIVELOG/2009_11_30/thread_1_seq_2.294.704315075 File Name: +fra/DEST/ARCHIVELOG/2009_11_23/thread_1_seq_1.279.703705239 File Name: +fra/DEST/ARCHIVELOG/2009_11_20/thread_1_seq_21.268.703447517 File Name: +fra/DEST/CONTROLFILE/Current.289.705002253 RMAN-06139: WARNING: control file is not current for REPORT SCHEMA Report of database schema for database with db_unique_name DEST List of Permanent Datafiles =========================== File Size(MB) Tablespace RB segs Datafile Name ---- -------- -------------------- ------- ------------------------ 1 700 SYSTEM *** +DATA/orcl/datafile/system.261.697925953 2 600 SYSAUX *** +DATA/orcl/datafile/sysaux.262.697926011 3 200 UNDOTBS1 *** +DATA/orcl/datafile/undotbs1.263.697926061 4 65 USERS *** +DATA/orcl/datafile/users.265.697926105 5 50 STREAMS_TBS *** +DATA/orcl/datafile/streams_tbs.274.703447755 List of Temporary Files ======================= File Size(MB) Tablespace Maxsize(MB) Tempfile Name ---- -------- -------------------- ----------- -------------------- 1 20 TEMP 32767 +DATA/orcl/tempfile/temp.264.697926079
Not good – references the primary, but where are the standby database’s files? It gets worse when we turn on managed recovery, this is from the alert.log:
alter database recover managed standby database disconnect from session Attempt to start background Managed Standby Recovery process (dest) MRP0 started with pid=29, OS id=5916 MRP0: Background Managed Standby Recovery process started (dest) 2009-12-07 18:32:34.837000 +00:00 started logmerger process Managed Standby Recovery not using Real Time Apply WARNING! Recovering data file 1 from a fuzzy file. If not the current file it might be an online backup taken without entering the begin backup command. WARNING! Recovering data file 2 from a fuzzy file. If not the current file it might be an online backup taken without entering the begin backup command. WARNING! Recovering data file 3 from a fuzzy file. If not the current file it might be an online backup taken without entering the begin backup command. WARNING! Recovering data file 4 from a fuzzy file. If not the current file it might be an online backup taken without entering the begin backup command. WARNING! Recovering data file 5 from a fuzzy file. If not the current file it might be an online backup taken without entering the begin backup command. Parallel Media Recovery started with 2 slaves Waiting for all non-current ORLs to be archived... All non-current ORLs have been archived. Clearing online redo logfile 1 +DATA/orcl/onlinelog/group_1.258.699824793 Clearing online log 1 of thread 1 sequence number 45 Completed: alter database recover managed standby database disconnect from session 2009-12-07 18:32:41.195000 +00:00 Clearing online redo logfile 1 complete Clearing online redo logfile 2 +DATA/orcl/onlinelog/group_2.380.699824509 Clearing online log 2 of thread 1 sequence number 46
Options at this stage are:
- Restore previous controlfile and leave it
- Set db_file_name_convert and log_file_name_convert
- RMAN catalog command
- Panic :)
Of course we don’t panic, here’s the solution. First stop managed recovery. Now we are going to add the files in the +DATA/dest directory to the standby controlfile.
RMAN> catalog start with '+data/dest/'; searching for all files that match the pattern +data/dest/ List of Files Unknown to the Database ===================================== File Name: +data/DEST/ONLINELOG/group_1.279.705003387 File Name: +data/DEST/ONLINELOG/group_2.278.705003397 File Name: +data/DEST/ONLINELOG/group_3.277.705003409 File Name: +data/DEST/ONLINELOG/group_4.276.705003427 File Name: +data/DEST/DATAFILE/SYSTEM.327.705003099 File Name: +data/DEST/DATAFILE/SYSAUX.326.705003205 File Name: +data/DEST/DATAFILE/UNDOTBS1.320.705003311 File Name: +data/DEST/DATAFILE/USERS.314.705003351 File Name: +data/DEST/DATAFILE/STREAMS_TBS.273.705003371 File Name: +data/DEST/CONTROLFILE/Current.269.703447253 File Name: +data/DEST/CONTROLFILE/Current.268.705001545 File Name: +data/DEST/CONTROLFILE/Current.275.705002253 Do you really want to catalog the above files (enter YES or NO)? yes cataloging files... cataloging done List of Cataloged Files ======================= File Name: +data/DEST/DATAFILE/SYSTEM.327.705003099 File Name: +data/DEST/DATAFILE/SYSAUX.326.705003205 File Name: +data/DEST/DATAFILE/UNDOTBS1.320.705003311 File Name: +data/DEST/DATAFILE/USERS.314.705003351 File Name: +data/DEST/DATAFILE/STREAMS_TBS.273.705003371 File Name: +data/DEST/CONTROLFILE/Current.275.705002253
Now let’s shut it down and switch to the copy:
RMAN> shutdown immediate RMAN> switch database to copy; datafile 1 switched to datafile copy "+DATA/dest/datafile/system.327.705003099" datafile 2 switched to datafile copy "+DATA/dest/datafile/sysaux.326.705003205" datafile 3 switched to datafile copy "+DATA/dest/datafile/undotbs1.320.705003311" datafile 4 switched to datafile copy "+DATA/dest/datafile/users.314.705003351" datafile 5 switched to datafile copy "+DATA/dest/datafile/streams_tbs.273.705003371" RMAN> report schema; RMAN-06139: WARNING: control file is not current for REPORT SCHEMA Report of database schema for database with db_unique_name DEST List of Permanent Datafiles =========================== File Size(MB) Tablespace RB segs Datafile Name ---- -------- -------------------- ------- ------------------------ 1 700 SYSTEM *** +DATA/dest/datafile/system.327.705003099 2 600 SYSAUX *** +DATA/dest/datafile/sysaux.326.705003205 3 200 UNDOTBS1 *** +DATA/dest/datafile/undotbs1.320.705003311 4 65 USERS *** +DATA/dest/datafile/users.314.705003351 5 50 STREAMS_TBS *** +DATA/dest/datafile/streams_tbs.273.705003371 List of Temporary Files ======================= File Size(MB) Tablespace Maxsize(MB) Tempfile Name ---- -------- -------------------- ----------- -------------------- 1 20 TEMP 32767 +DATA/orcl/tempfile/temp.264.697926079
And it worked! The temp file isn’t created as part of the duplicate process.
alter database recover managed standby database disconnect from session Attempt to start background Managed Standby Recovery process (dest) MRP0 started with pid=28, OS id=6062 MRP0: Background Managed Standby Recovery process started (dest) 2009-12-07 18:37:49.494000 +00:00 started logmerger process Managed Standby Recovery not using Real Time Apply Parallel Media Recovery started with 2 slaves Waiting for all non-current ORLs to be archived... All non-current ORLs have been archived. Media Recovery Waiting for thread 1 sequence 46 Completed: alter database recover managed standby database disconnect from session 2009-12-07 18:38:00.329000 +00:00 RFS[1]: Assigned to RFS process 6076 RFS[1]: Identified database type as 'physical standby': Client is ARCH pid 5003 2009-12-07 18:38:05.785000 +00:00 RFS[2]: Assigned to RFS process 6085 RFS[2]: Identified database type as 'physical standby': Client is LGWR ASYNC pid 5527 Primary database is in MAXIMUM PERFORMANCE mode RFS[3]: Assigned to RFS process 6087 RFS[3]: Identified database type as 'physical standby': Client is ARCH pid 4999 RFS[2]: Opened log for thread 1 sequence 46 dbid 1225706675 branch 699825902 2009-12-07 18:38:09.895000 +00:00 Archived Log entry 6 added for thread 1 sequence 46 rlc 699825902 ID 0x492bb232 dest 2: RFS[4]: Assigned to RFS process 6089 RFS[4]: Identified database type as 'physical standby': Client is LGWR ASYNC pid 5527 Primary database is in MAXIMUM PERFORMANCE mode Media Recovery Log +FRA/dest/archivelog/2009_12_07/thread_1_seq_46.328.705004687 RFS[4]: Opened log for thread 1 sequence 47 dbid 1225706675 branch 699825902 2009-12-07 18:38:11.619000 +00:00 Errors in file /u01/app/oracle/diag/rdbms/dest/dest/trace/dest_rfs_6087.trc: ORA-16401: archive log rejected by Remote File Server (RFS) 2009-12-07 18:38:15.460000 +00:00 Media Recovery Waiting for thread 1 sequence 47 (in transit) 2009-12-07 18:38:42.577000 +00:00 Archived Log entry 7 added for thread 1 sequence 47 rlc 699825902 ID 0x492bb232 dest 2: RFS[5]: Assigned to RFS process 6097 RFS[5]: Identified database type as 'physical standby': Client is LGWR ASYNC pid 5527 Primary database is in MAXIMUM PERFORMANCE mode RFS[5]: Opened log for thread 1 sequence 48 dbid 1225706675 branch 699825902 2009-12-07 18:38:45.485000 +00:00 Media Recovery Log +FRA/dest/archivelog/2009_12_07/thread_1_seq_47.329.705004691 Media Recovery Waiting for thread 1 sequence 48 (in transit)
Et voila.
Pingback: Blogroll Report 04/12/2009-11/12/2009 « Coskan’s Approach to Oracle
Excellent write-up with all available options and examples. I have faced a similar situation few months ago and spent quite a long time to resolve it. The DB_UNIQUE_NAME parameter combined with ASM’s naming convention make it very difficult to troubleshoot this kind of issues. Thank you very much.
Thanks Martin – just used this and it saved me.