Another one of the cool but underrated features in 12c is the possibility to recover a physical standby over the network with one line in RMAN.
Why do you need to perform this activity? Assume someone really clever created a segment “nologging” and the database was not in force logging mode. This operation cannot be replicated by redo apply on the standby, and you are bound to have a problem. Or, in my case, I had the standby shut down in my lab environment (intentionally) and created a few PDBs on my primary. For some reason I lost an archived redo log. This would of course not happen in a production environment, but my lab VM is limited when it comes to space and I may have moved my backup to a USB disk that I didn’t bring along.
So a long story short: you either recreate the standby-which is a rather blunt-force approach to the problem-or recover the standby from the SCN where the problem originated. I was hoping the new 12c recover database from service command in RMAN could help. It did, but not as I expected at first.
The problem with my lab
My environment is Oracle Linux 6.4 x86-64 on Virtual Box with 4 GB RAM. Oracle is 12.1.0.1.2. CDB1 is primary, CDB2 is my standby database. Both are on the same VM (I only have 8 GB total memory). CDB2 complained about a missing archived redo log:
SQL> select * from v$archive_gap; THREAD# LOW_SEQUENCE# HIGH_SEQUENCE# CON_ID ---------- ------------- -------------- ---------- 1 27 27 1
Actually this is the first time I ever saw v$archive_gap return something when there was a gap! The issue is also visible in the alert.log
2014-03-12 14:12:29.361000 +00:00 FAL[client]: Failed to request gap sequence GAP - thread 1 sequence 27-27 DBID 793624397 branch 819906960 FAL[client]: All defined FAL servers have been attempted. ------------------------------------------------------------ Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization parameter is defined to a value that's sufficiently large enough to maintain adequate log switch information to resolve archivelog gaps. ------------------------------------------------------------
And to cut a long story short, I couldn’t get that back as I forgot my USB drive at home. Now another useful 12c feature will warn you that you cannot switch over!
In dgmgrl you can validate the database:
DGMGRL> validate database cdb2; Database Role: Physical standby database Primary Database: cdb1 Ready for Switchover: No Ready for Failover: Yes (Primary Running) Temporary Tablespace File Information: cdb1 TEMP Files: 5 cdb2 TEMP Files: 3 Flashback Database Status: cdb1: Off cdb2: On Standby Apply-Related Information: Apply State: Running Apply Lag: 205 days 3 hours 44 minutes 32 seconds Apply Delay: 0 minutes
You can fail over, but not switch over. And indeed, 205 days are rather a long delay… The delay is calculated using sysdate-first_time in v$archived_log for the missing archived redo log.
Also notice that I have fewer temp files in CDB1 compared to CDB2, that’s because I have 2 additional PDBs in CDB1 that are not (yet) in CDB2. OK, so here is my first try to get things back into shape (note that I’m connecting to the standby database as the target):
[oracle@server1 CDB2]$ rman target sys/xxx@cdb2 auxiliary sys/xxx@cdb1 Recovery Manager: Release 12.1.0.1.0 - Production on Wed Mar 12 14:19:24 2014 Copyright (c) 1982, 2013, Oracle and/or its affiliates. All rights reserved. connected to target database: CDB1 (DBID=793624397, not open) connected to auxiliary database: CDB1 (DBID=793624397) RMAN> recover database from service cdb1; ... Starting recover at 12.03.2014 14:19:57 using channel ORA_DISK_1 using channel ORA_DISK_2 skipping datafile 5; already restored to SCN 1735641 skipping datafile 7; already restored to SCN 1735641 channel ORA_DISK_1: starting incremental datafile backup set restore channel ORA_DISK_1: using network backup set from service cdb1 destination for restore of datafile 00001: /u01/oradata/CDB2/system01.dbf channel ORA_DISK_2: starting incremental datafile backup set restore channel ORA_DISK_2: using network backup set from service cdb1 destination for restore of datafile 00003: /u01/oradata/CDB2/sysaux01.dbf channel ORA_DISK_1: restore complete, elapsed time: 00:01:15 channel ORA_DISK_1: starting incremental datafile backup set restore channel ORA_DISK_1: using network backup set from service cdb1 destination for restore of datafile 00004: /u01/oradata/CDB2/undotbs01.dbf channel ORA_DISK_2: restore complete, elapsed time: 00:01:15 channel ORA_DISK_2: starting incremental datafile backup set restore channel ORA_DISK_2: using network backup set from service cdb1 destination for restore of datafile 00006: /u01/oradata/CDB2/users01.dbf channel ORA_DISK_2: restore complete, elapsed time: 00:00:03 channel ORA_DISK_2: starting incremental datafile backup set restore channel ORA_DISK_2: using network backup set from service cdb1 destination for restore of datafile 00008: /u01/oradata/CDB2/MASTER/system01.dbf channel ORA_DISK_1: restore complete, elapsed time: 00:00:10 channel ORA_DISK_1: starting incremental datafile backup set restore channel ORA_DISK_1: using network backup set from service cdb1 destination for restore of datafile 00009: /u01/oradata/CDB2/MASTER/sysaux01.dbf channel ORA_DISK_2: restore complete, elapsed time: 00:00:08 channel ORA_DISK_2: starting incremental datafile backup set restore channel ORA_DISK_2: using network backup set from service cdb1 destination for restore of datafile 00010: /u01/oradata/CDB2/MASTER/MASTER_users01.dbf channel ORA_DISK_2: restore complete, elapsed time: 00:00:07 channel ORA_DISK_1: restore complete, elapsed time: 00:00:16 starting media recovery archived log for thread 1 with sequence 28 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_28_9l0tjztf_.arc archived log for thread 1 with sequence 29 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_29_9l0tk008_.arc archived log for thread 1 with sequence 30 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_30_9l0tk2tq_.arc archived log for thread 1 with sequence 31 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_31_9l0tk73b_.arc archived log for thread 1 with sequence 32 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_32_9l0tk7mt_.arc archived log for thread 1 with sequence 33 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_33_9l0tkrd1_.arc archived log for thread 1 with sequence 34 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_34_9l0tkrgv_.arc archived log for thread 1 with sequence 35 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_35_9l0tkzvo_.arc archived log for thread 1 with sequence 36 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_36_9l0tkyrm_.arc archived log for thread 1 with sequence 37 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_37_9l0tkwf6_.arc archived log for thread 1 with sequence 38 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_38_9l0tl281_.arc archived log for thread 1 with sequence 39 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_39_9l0tl24w_.arc archived log for thread 1 with sequence 40 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_40_9l0tl3m0_.arc archived log for thread 1 with sequence 41 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_41_9l0tl7o2_.arc archived log for thread 1 with sequence 42 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_42_9l0tlos0_.arc archived log for thread 1 with sequence 43 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_43_9l0tlng5_.arc archived log for thread 1 with sequence 44 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_44_9l0trto7_.arc archived log for thread 1 with sequence 45 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_45_9l0v5ho6_.arc media recovery complete, elapsed time: 00:00:00 Finished recover at 12.03.2014 14:21:41 RMAN>
How very nice! It did it, or let’s say, it didn’t report any errors. Except that the operation had no effect.
Digging deeper
Here is the true problem. Connected to CDB2 I see these data files:
RMAN> report schema; using target database control file instead of recovery catalog RMAN-06139: WARNING: control file is not current for REPORT SCHEMA Report of database schema for database with db_unique_name CDB2 List of Permanent Datafiles =========================== File Size(MB) Tablespace RB segs Datafile Name ---- -------- -------------------- ------- ------------------------ 1 790 SYSTEM *** /u01/oradata/CDB2/system01.dbf 3 750 SYSAUX *** /u01/oradata/CDB2/sysaux01.dbf 4 90 UNDOTBS1 *** /u01/oradata/CDB2/undotbs01.dbf 5 250 PDB$SEED:SYSTEM *** /u01/oradata/CDB2/pdbseed/system01.dbf 6 5 USERS *** /u01/oradata/CDB2/users01.dbf 7 590 PDB$SEED:SYSAUX *** /u01/oradata/CDB2/pdbseed/sysaux01.dbf 8 260 MASTER:SYSTEM *** /u01/oradata/CDB2/MASTER/system01.dbf 9 620 MASTER:SYSAUX *** /u01/oradata/CDB2/MASTER/sysaux01.dbf 10 26 MASTER:USERS *** /u01/oradata/CDB2/MASTER/MASTER_users01.dbf List of Temporary Files ======================= File Size(MB) Tablespace Maxsize(MB) Tempfile Name ---- -------- -------------------- ----------- -------------------- 1 60 TEMP 32767 /u01/oradata/CDB2/temp01.dbf 2 20 PDB$SEED:TEMP 32767 /u01/oradata/CDB2/pdbseed/pdbseed_temp01.dbf 3 20 MASTER:TEMP 32767 /u01/oradata/CDB2/MASTER/temp01.dbf
While for CDB1 I get those. Note there are data files with file# > 10
RMAN> report schema; using target database control file instead of recovery catalog Report of database schema for database with db_unique_name CDB1 List of Permanent Datafiles =========================== File Size(MB) Tablespace RB segs Datafile Name ---- -------- -------------------- ------- ------------------------ 1 790 SYSTEM *** /u01/oradata/CDB1/system01.dbf 3 750 SYSAUX *** /u01/oradata/CDB1/sysaux01.dbf 4 90 UNDOTBS1 *** /u01/oradata/CDB1/undotbs01.dbf 5 250 PDB$SEED:SYSTEM *** /u01/oradata/CDB1/pdbseed/system01.dbf 6 5 USERS *** /u01/oradata/CDB1/users01.dbf 7 590 PDB$SEED:SYSAUX *** /u01/oradata/CDB1/pdbseed/sysaux01.dbf 8 260 MASTER:SYSTEM *** /u01/oradata/CDB1/MASTER/system01.dbf 9 620 MASTER:SYSAUX *** /u01/oradata/CDB1/MASTER/sysaux01.dbf 10 26 MASTER:USERS *** /u01/oradata/CDB1/MASTER/MASTER_users01.dbf 11 270 VALID:SYSTEM *** /u01/oradata/CDB1/valid/system01.dbf 12 610 VALID:SYSAUX *** /u01/oradata/CDB1/valid/sysaux01.dbf 13 260 RCAT:SYSTEM *** /u01/oradata/CDB1/rcat/system01.dbf 14 610 RCAT:SYSAUX *** /u01/oradata/CDB1/rcat/sysaux01.dbf 15 100 RCAT:RMAN_CATALOG *** /u01/oradata/CDB1/rcat/rman_catalog01.dbf 16 50 VALID:USERS *** /u01/oradata/CDB1/valid/users01.dbf List of Temporary Files ======================= File Size(MB) Tablespace Maxsize(MB) Tempfile Name ---- -------- -------------------- ----------- -------------------- 1 60 TEMP 32767 /u01/oradata/CDB1/temp01.dbf 2 20 PDB$SEED:TEMP 32767 /u01/oradata/CDB1/pdbseed/pdbseed_temp01.dbf 3 20 MASTER:TEMP 32767 /u01/oradata/CDB1/MASTER/temp01.dbf 4 20 VALID:TEMP 32767 /u01/oradata/CDB1/valid/pdbseed_temp01.dbf 5 20 RCAT:TEMP 32767 /u01/oradata/CDB1/rcat/pdbseed_temp01.dbf
In hindsight it makes sense: the “recover … from service” command did indeed recover the data files it found. But it didn’t create those it didn’t see in the standby database’s controlfile! It was surprising to me at first.
Getting things back to working order
Using the old fashioned way I first have to get those additional data files to my standby database. This requires a backup of the files that didn’t exist at v$archived_log.begin_scn# for sequence# = 27.
In this scenario it is obvious which files are missing but the scientifically correct method of working that out is this command:
SQL> SELECT FILE#, NAME FROM V$DATAFILE WHERE CREATION_CHANGE# > 1893106; FILE# NAME ---------- ------------------------------------------------------------ 11 /u01/oradata/CDB1/valid/system01.dbf 12 /u01/oradata/CDB1/valid/sysaux01.dbf 13 /u01/oradata/CDB1/rcat/system01.dbf 14 /u01/oradata/CDB1/rcat/sysaux01.dbf 15 /u01/oradata/CDB1/rcat/rman_catalog01.dbf 16 /u01/oradata/CDB1/valid/users01.dbf
1893106 is the first change number for archived log 27. Now in the next step you have to create a backup of these files (backup datafile 11,12,13,14,15,16 format … tag …) and make this backup available on the standby host. You also need a new standby controlfile. Back the current controlfile up and ship it over too. Then shut the standby down, and restore the controlfile. Use the RMAN “catalog start with” command to register the backup of your missing data files and restore them. In my case I’m using a file system for the data files so the restore can be scripted:
SQL> select 'set newname for datafile ' || file# || ' to ''' || replace(name,'CDB1','CDB2') || ''';' 2 from v$datafile where file# between 11 and 16;
Resulting in this run block + restore commands, executed on the standby:
run { set newname for datafile 11 to '/u01/oradata/CDB2/valid/system01.dbf'; set newname for datafile 12 to '/u01/oradata/CDB2/valid/sysaux01.dbf'; set newname for datafile 13 to '/u01/oradata/CDB2/rcat/system01.dbf'; set newname for datafile 14 to '/u01/oradata/CDB2/rcat/sysaux01.dbf'; set newname for datafile 15 to '/u01/oradata/CDB2/rcat/rman_catalog01.dbf'; set newname for datafile 16 to '/u01/oradata/CDB2/valid/users01.dbf'; restore datafile 11; restore datafile 12; restore datafile 13; restore datafile 14; restore datafile 15; restore datafile 16; }
The directories referenced here obviously have to exist. For exactly this reason it is probably not a good idea to use a directory on the file system per PDB: if it doesn’t exist on the standby your MRP0 process will fall over, complaining about the missing directory. When using the “report schema” command you should now have an equivalent number of data files on each database.
Getting the standby into sync with the primary the easy way
And NOW it’s time for the recover from service command just to be safe and have the standby on the same level as the primary. For this to work managed recovery has to be switched off. Here is the output in all its glory (the standby database is the target):
[oracle@server1 CDB2]$ rman target sys/xxx@cdb2 auxiliary sys/xxx@cdb1 Recovery Manager: Release 12.1.0.1.0 - Production on Wed Mar 12 15:00:12 2014 Copyright (c) 1982, 2013, Oracle and/or its affiliates. All rights reserved. connected to target database: CDB1 (DBID=793624397, not open) connected to auxiliary database: CDB1 (DBID=793624397) RMAN> recover database from service cdb1; Starting recover at 12.03.2014 15:00:27 using target database control file instead of recovery catalog allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=237 device type=DISK skipping datafile 5; already restored to SCN 1735641 skipping datafile 7; already restored to SCN 1735641 skipping datafile 8; already restored to SCN 2054839 skipping datafile 9; already restored to SCN 2054839 skipping datafile 10; already restored to SCN 2054839 skipping datafile 13; already restored to SCN 2068617 skipping datafile 14; already restored to SCN 2068617 skipping datafile 15; already restored to SCN 2068617 channel ORA_DISK_1: starting incremental datafile backup set restore channel ORA_DISK_1: using network backup set from service cdb1 destination for restore of datafile 00001: /u01/oradata/CDB2/system01.dbf channel ORA_DISK_1: restore complete, elapsed time: 00:00:35 channel ORA_DISK_1: starting incremental datafile backup set restore channel ORA_DISK_1: using network backup set from service cdb1 destination for restore of datafile 00003: /u01/oradata/CDB2/sysaux01.dbf channel ORA_DISK_1: restore complete, elapsed time: 00:00:25 channel ORA_DISK_1: starting incremental datafile backup set restore channel ORA_DISK_1: using network backup set from service cdb1 destination for restore of datafile 00004: /u01/oradata/CDB2/undotbs01.dbf channel ORA_DISK_1: restore complete, elapsed time: 00:00:03 channel ORA_DISK_1: starting incremental datafile backup set restore channel ORA_DISK_1: using network backup set from service cdb1 destination for restore of datafile 00006: /u01/oradata/CDB2/users01.dbf channel ORA_DISK_1: restore complete, elapsed time: 00:00:01 channel ORA_DISK_1: starting incremental datafile backup set restore channel ORA_DISK_1: using network backup set from service cdb1 destination for restore of datafile 00011: /u01/oradata/CDB2/valid/system01.dbf channel ORA_DISK_1: restore complete, elapsed time: 00:00:15 channel ORA_DISK_1: starting incremental datafile backup set restore channel ORA_DISK_1: using network backup set from service cdb1 destination for restore of datafile 00012: /u01/oradata/CDB2/valid/sysaux01.dbf channel ORA_DISK_1: restore complete, elapsed time: 00:00:15 channel ORA_DISK_1: starting incremental datafile backup set restore channel ORA_DISK_1: using network backup set from service cdb1 destination for restore of datafile 00016: /u01/oradata/CDB2/valid/users01.dbf channel ORA_DISK_1: restore complete, elapsed time: 00:00:01 starting media recovery archived log for thread 1 with sequence 47 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_47_9l0wzq03_.arc archived log file name=/u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_47_9l0wzq03_.arc thread=1 sequence=47 media recovery complete, elapsed time: 00:00:01 Finished recover at 12.03.2014 15:02:06
Now everything looks a lot better:
DGMGRL> edit database cdb2 set state=apply-on; DGMGRL> validate database cdb2; Database Role: Physical standby database Primary Database: cdb1 Ready for Switchover: Yes Ready for Failover: Yes (Primary Running) Flashback Database Status: cdb1: Off cdb2: Off Future Log File Groups Configuration: Thread # Online Redo Log Groups Standby Redo Log Groups (cdb2) (cdb1) 1 3 0 Warning: standby redo logs not configured for thread 3 on cdb1 DGMGRL>
This is confirmed from the standby as well:
SQL> select process,status,thread#,sequence#,block#,delay_mins from v$managed_standby; PROCESS STATUS THREAD# SEQUENCE# BLOCK# DELAY_MINS --------- ------------ ---------- ---------- ---------- ---------- ARCH CLOSING 1 47 26624 0 ARCH CONNECTED 0 0 0 0 ARCH CONNECTED 0 0 0 0 ARCH CONNECTED 0 0 0 0 RFS IDLE 0 0 0 0 RFS IDLE 0 0 0 0 RFS IDLE 1 48 7673 0 MRP0 APPLYING_LOG 1 48 7673 0 8 rows selected.
Excellent! We are back in sync.
Hi Martin,
I think you could have used just “restore database from service” if you had the db_file_name_convert set correctly… Did you test it?
Hi Magnus,
that’s what I wanted to try next. In theory it should work, but I skipped it initially because of this excerpt from the 12.1 documentation:
“RMAN restores database files, over the network, from a physical standby database by using the FROM SERVICE clause of the RESTORE command. The FROM SERVICE clause provides the service name of the physical standby database from which the files must be restored. During the restore operation, RMAN creates backup sets, on the physical standby database, of the files that need to be restored and then transfers these backup sets to the target database over the network.”
(http://docs.oracle.com/cd/E16655_01/backup.121/e17630/rcmadvre.htm#BRADV680)
So to me that sounds that RESTORE … from service is intended to restore the PRIMARY, where RECOVER … from service is for the STANDBY. Could be unlucky wording, but that’s how I understood it.
Martin
I think that’s just unlucky wording, as you say. I’ve used it from primary to standby in lab environments, so it works.
Time to raise a documentation bug then…