Martins Blog

Trying to explain complex things in simple terms

Recovering a standby over the network in 12c

Posted by Martin Bach on March 17, 2014

Another one of the cool but underrated features in 12c is the possibility to recover a physical standby over the network with one line in RMAN.

Why do you need to perform this activity? Assume someone really clever created a segment “nologging” and the database was not in force logging mode. This operation cannot be replicated by redo apply on the standby, and you are bound to have a problem. Or, in my case, I had the standby shut down in my lab environment (intentionally) and created a few PDBs on my primary. For some reason I lost an archived redo log. This would of course not happen in a production environment, but my lab VM is limited when it comes to space and I may have moved my backup to a USB disk that I didn’t bring along.

So a long story short: you either recreate the standby-which is a rather blunt-force approach to the problem-or recover the standby from the SCN where the problem originated. I was hoping the new 12c recover database from service command in RMAN could help. It did, but not as I expected at first.

The problem with my lab

My environment is Oracle Linux 6.4 x86-64 on Virtual Box with 4 GB RAM. Oracle is 12.1.0.1.2. CDB1 is primary, CDB2 is my standby database. Both are on the same VM (I only have 8 GB total memory). CDB2 complained about a missing archived redo log:

SQL> select * from v$archive_gap;

   THREAD# LOW_SEQUENCE# HIGH_SEQUENCE#     CON_ID
---------- ------------- -------------- ----------
         1            27             27          1

Actually this is the first time I ever saw v$archive_gap return something when there was a gap! The issue is also visible in the alert.log

2014-03-12 14:12:29.361000 +00:00
FAL[client]: Failed to request gap sequence
 GAP - thread 1 sequence 27-27
 DBID 793624397 branch 819906960
FAL[client]: All defined FAL servers have been attempted.
------------------------------------------------------------
Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization
parameter is defined to a value that's sufficiently large
enough to maintain adequate log switch information to resolve
archivelog gaps.
------------------------------------------------------------

And to cut a long story short, I couldn’t get that back as I forgot my USB drive at home. Now another useful 12c feature will warn you that you cannot switch over!

In dgmgrl you can validate the database:

DGMGRL> validate database cdb2;

  Database Role:     Physical standby database
  Primary Database:  cdb1

  Ready for Switchover:  No
  Ready for Failover:    Yes (Primary Running)

  Temporary Tablespace File Information:
    cdb1 TEMP Files:  5
    cdb2 TEMP Files:  3

  Flashback Database Status:
    cdb1:  Off
    cdb2:  On

  Standby Apply-Related Information:
    Apply State:      Running
    Apply Lag:        205 days 3 hours 44 minutes 32 seconds
    Apply Delay:      0 minutes

You can fail over, but not switch over. And indeed, 205 days are rather a long delay… The delay is calculated using sysdate-first_time in v$archived_log for the missing archived redo log.

Also notice that I have fewer temp files in CDB1 compared to CDB2, that’s because I have 2 additional PDBs in CDB1 that are not (yet) in CDB2. OK, so here is my first try to get things back into shape (note that I’m connecting to the standby database as the target):

[oracle@server1 CDB2]$ rman target sys/xxx@cdb2 auxiliary sys/xxx@cdb1
Recovery Manager: Release 12.1.0.1.0 - Production on Wed Mar 12 14:19:24 2014

Copyright (c) 1982, 2013, Oracle and/or its affiliates.  All rights reserved.

connected to target database: CDB1 (DBID=793624397, not open)
connected to auxiliary database: CDB1 (DBID=793624397)

RMAN> recover database from service cdb1;

...

Starting recover at 12.03.2014 14:19:57
using channel ORA_DISK_1
using channel ORA_DISK_2
skipping datafile 5; already restored to SCN 1735641
skipping datafile 7; already restored to SCN 1735641
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: using network backup set from service cdb1
destination for restore of datafile 00001: /u01/oradata/CDB2/system01.dbf
channel ORA_DISK_2: starting incremental datafile backup set restore
channel ORA_DISK_2: using network backup set from service cdb1
destination for restore of datafile 00003: /u01/oradata/CDB2/sysaux01.dbf
channel ORA_DISK_1: restore complete, elapsed time: 00:01:15
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: using network backup set from service cdb1
destination for restore of datafile 00004: /u01/oradata/CDB2/undotbs01.dbf
channel ORA_DISK_2: restore complete, elapsed time: 00:01:15
channel ORA_DISK_2: starting incremental datafile backup set restore
channel ORA_DISK_2: using network backup set from service cdb1
destination for restore of datafile 00006: /u01/oradata/CDB2/users01.dbf
channel ORA_DISK_2: restore complete, elapsed time: 00:00:03
channel ORA_DISK_2: starting incremental datafile backup set restore
channel ORA_DISK_2: using network backup set from service cdb1
destination for restore of datafile 00008: /u01/oradata/CDB2/MASTER/system01.dbf
channel ORA_DISK_1: restore complete, elapsed time: 00:00:10
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: using network backup set from service cdb1
destination for restore of datafile 00009: /u01/oradata/CDB2/MASTER/sysaux01.dbf
channel ORA_DISK_2: restore complete, elapsed time: 00:00:08
channel ORA_DISK_2: starting incremental datafile backup set restore
channel ORA_DISK_2: using network backup set from service cdb1
destination for restore of datafile 00010: /u01/oradata/CDB2/MASTER/MASTER_users01.dbf
channel ORA_DISK_2: restore complete, elapsed time: 00:00:07
channel ORA_DISK_1: restore complete, elapsed time: 00:00:16

starting media recovery

archived log for thread 1 with sequence 28 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_28_9l0tjztf_.arc
archived log for thread 1 with sequence 29 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_29_9l0tk008_.arc
archived log for thread 1 with sequence 30 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_30_9l0tk2tq_.arc
archived log for thread 1 with sequence 31 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_31_9l0tk73b_.arc
archived log for thread 1 with sequence 32 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_32_9l0tk7mt_.arc
archived log for thread 1 with sequence 33 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_33_9l0tkrd1_.arc
archived log for thread 1 with sequence 34 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_34_9l0tkrgv_.arc
archived log for thread 1 with sequence 35 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_35_9l0tkzvo_.arc
archived log for thread 1 with sequence 36 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_36_9l0tkyrm_.arc
archived log for thread 1 with sequence 37 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_37_9l0tkwf6_.arc
archived log for thread 1 with sequence 38 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_38_9l0tl281_.arc
archived log for thread 1 with sequence 39 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_39_9l0tl24w_.arc
archived log for thread 1 with sequence 40 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_40_9l0tl3m0_.arc
archived log for thread 1 with sequence 41 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_41_9l0tl7o2_.arc
archived log for thread 1 with sequence 42 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_42_9l0tlos0_.arc
archived log for thread 1 with sequence 43 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_43_9l0tlng5_.arc
archived log for thread 1 with sequence 44 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_44_9l0trto7_.arc
archived log for thread 1 with sequence 45 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_45_9l0v5ho6_.arc
media recovery complete, elapsed time: 00:00:00
Finished recover at 12.03.2014 14:21:41

RMAN>

How very nice! It did it, or let’s say, it didn’t report any errors. Except that the operation had no effect.

Digging deeper

Here is the true problem. Connected to CDB2 I see these data files:

RMAN> report schema;

using target database control file instead of recovery catalog
RMAN-06139: WARNING: control file is not current for REPORT SCHEMA
Report of database schema for database with db_unique_name CDB2

List of Permanent Datafiles
===========================
File Size(MB) Tablespace           RB segs Datafile Name
---- -------- -------------------- ------- ------------------------
1    790      SYSTEM               ***     /u01/oradata/CDB2/system01.dbf
3    750      SYSAUX               ***     /u01/oradata/CDB2/sysaux01.dbf
4    90       UNDOTBS1             ***     /u01/oradata/CDB2/undotbs01.dbf
5    250      PDB$SEED:SYSTEM      ***     /u01/oradata/CDB2/pdbseed/system01.dbf
6    5        USERS                ***     /u01/oradata/CDB2/users01.dbf
7    590      PDB$SEED:SYSAUX      ***     /u01/oradata/CDB2/pdbseed/sysaux01.dbf
8    260      MASTER:SYSTEM        ***     /u01/oradata/CDB2/MASTER/system01.dbf
9    620      MASTER:SYSAUX        ***     /u01/oradata/CDB2/MASTER/sysaux01.dbf
10   26       MASTER:USERS         ***     /u01/oradata/CDB2/MASTER/MASTER_users01.dbf

List of Temporary Files
=======================
File Size(MB) Tablespace           Maxsize(MB) Tempfile Name
---- -------- -------------------- ----------- --------------------
1    60       TEMP                 32767       /u01/oradata/CDB2/temp01.dbf
2    20       PDB$SEED:TEMP        32767       /u01/oradata/CDB2/pdbseed/pdbseed_temp01.dbf
3    20       MASTER:TEMP          32767       /u01/oradata/CDB2/MASTER/temp01.dbf

While for CDB1 I get those. Note there are data files with file# > 10

RMAN> report schema;

using target database control file instead of recovery catalog
Report of database schema for database with db_unique_name CDB1

List of Permanent Datafiles
===========================
File Size(MB) Tablespace           RB segs Datafile Name
---- -------- -------------------- ------- ------------------------
1    790      SYSTEM               ***     /u01/oradata/CDB1/system01.dbf
3    750      SYSAUX               ***     /u01/oradata/CDB1/sysaux01.dbf
4    90       UNDOTBS1             ***     /u01/oradata/CDB1/undotbs01.dbf
5    250      PDB$SEED:SYSTEM      ***     /u01/oradata/CDB1/pdbseed/system01.dbf
6    5        USERS                ***     /u01/oradata/CDB1/users01.dbf
7    590      PDB$SEED:SYSAUX      ***     /u01/oradata/CDB1/pdbseed/sysaux01.dbf
8    260      MASTER:SYSTEM        ***     /u01/oradata/CDB1/MASTER/system01.dbf
9    620      MASTER:SYSAUX        ***     /u01/oradata/CDB1/MASTER/sysaux01.dbf
10   26       MASTER:USERS         ***     /u01/oradata/CDB1/MASTER/MASTER_users01.dbf
11   270      VALID:SYSTEM         ***     /u01/oradata/CDB1/valid/system01.dbf
12   610      VALID:SYSAUX         ***     /u01/oradata/CDB1/valid/sysaux01.dbf
13   260      RCAT:SYSTEM          ***     /u01/oradata/CDB1/rcat/system01.dbf
14   610      RCAT:SYSAUX          ***     /u01/oradata/CDB1/rcat/sysaux01.dbf
15   100      RCAT:RMAN_CATALOG    ***     /u01/oradata/CDB1/rcat/rman_catalog01.dbf
16   50       VALID:USERS          ***     /u01/oradata/CDB1/valid/users01.dbf

List of Temporary Files
=======================
File Size(MB) Tablespace           Maxsize(MB) Tempfile Name
---- -------- -------------------- ----------- --------------------
1    60       TEMP                 32767       /u01/oradata/CDB1/temp01.dbf
2    20       PDB$SEED:TEMP        32767       /u01/oradata/CDB1/pdbseed/pdbseed_temp01.dbf
3    20       MASTER:TEMP          32767       /u01/oradata/CDB1/MASTER/temp01.dbf
4    20       VALID:TEMP           32767       /u01/oradata/CDB1/valid/pdbseed_temp01.dbf
5    20       RCAT:TEMP            32767       /u01/oradata/CDB1/rcat/pdbseed_temp01.dbf

In hindsight it makes sense: the “recover … from service” command did indeed recover the data files it found. But it didn’t create those it didn’t see in the standby database’s controlfile! It was surprising to me at first.

Getting things back to working order

Using the old fashioned way I first have to get those additional data files to my standby database. This requires a backup of the files that didn’t exist at v$archived_log.begin_scn# for sequence# = 27.

In this scenario it is obvious which files are missing but the scientifically correct method of working that out is this command:

SQL> SELECT FILE#, NAME FROM V$DATAFILE WHERE CREATION_CHANGE# > 1893106;

     FILE# NAME
---------- ------------------------------------------------------------
        11 /u01/oradata/CDB1/valid/system01.dbf
        12 /u01/oradata/CDB1/valid/sysaux01.dbf
        13 /u01/oradata/CDB1/rcat/system01.dbf
        14 /u01/oradata/CDB1/rcat/sysaux01.dbf
        15 /u01/oradata/CDB1/rcat/rman_catalog01.dbf
        16 /u01/oradata/CDB1/valid/users01.dbf

1893106 is the first change number for archived log 27. Now in the next step you have to create a backup of these files (backup datafile 11,12,13,14,15,16 format … tag …) and make this backup available on the standby host. You also need a new standby controlfile. Back the current controlfile up and ship it over too. Then shut the standby down, and restore the controlfile. Use the RMAN “catalog start with” command to register the backup of your missing data files and restore them. In my case I’m using a file system for the data files so the restore can be scripted:

SQL>  select 'set newname for datafile ' || file# || ' to ''' || replace(name,'CDB1','CDB2') || ''';'
  2   from v$datafile where file# between 11 and 16;

Resulting in this run block + restore commands, executed on the standby:

run {
set newname for datafile 11 to '/u01/oradata/CDB2/valid/system01.dbf';
set newname for datafile 12 to '/u01/oradata/CDB2/valid/sysaux01.dbf';
set newname for datafile 13 to '/u01/oradata/CDB2/rcat/system01.dbf';
set newname for datafile 14 to '/u01/oradata/CDB2/rcat/sysaux01.dbf';
set newname for datafile 15 to '/u01/oradata/CDB2/rcat/rman_catalog01.dbf';
set newname for datafile 16 to '/u01/oradata/CDB2/valid/users01.dbf';
restore datafile 11;
restore datafile 12;
restore datafile 13;
restore datafile 14;
restore datafile 15;
restore datafile 16;
}

The directories referenced here obviously have to exist. For exactly this reason it is probably not a good idea to use a directory on the file system per PDB: if it doesn’t exist on the standby your MRP0 process will fall over, complaining about the missing directory. When using the “report schema” command you should now have an equivalent number of data files on each database.

Getting the standby into sync with the primary the easy way

And NOW it’s time for the recover from service command just to be safe and have the standby on the same level as the primary. For this to work managed recovery has to be switched off. Here is the output in all its glory (the standby database is the target):

[oracle@server1 CDB2]$ rman target sys/xxx@cdb2 auxiliary sys/xxx@cdb1

Recovery Manager: Release 12.1.0.1.0 - Production on Wed Mar 12 15:00:12 2014

Copyright (c) 1982, 2013, Oracle and/or its affiliates.  All rights reserved.

connected to target database: CDB1 (DBID=793624397, not open)
connected to auxiliary database: CDB1 (DBID=793624397)

RMAN> recover database from service cdb1;

Starting recover at 12.03.2014 15:00:27
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=237 device type=DISK
skipping datafile 5; already restored to SCN 1735641
skipping datafile 7; already restored to SCN 1735641
skipping datafile 8; already restored to SCN 2054839
skipping datafile 9; already restored to SCN 2054839
skipping datafile 10; already restored to SCN 2054839
skipping datafile 13; already restored to SCN 2068617
skipping datafile 14; already restored to SCN 2068617
skipping datafile 15; already restored to SCN 2068617
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: using network backup set from service cdb1
destination for restore of datafile 00001: /u01/oradata/CDB2/system01.dbf
channel ORA_DISK_1: restore complete, elapsed time: 00:00:35
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: using network backup set from service cdb1
destination for restore of datafile 00003: /u01/oradata/CDB2/sysaux01.dbf
channel ORA_DISK_1: restore complete, elapsed time: 00:00:25
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: using network backup set from service cdb1
destination for restore of datafile 00004: /u01/oradata/CDB2/undotbs01.dbf
channel ORA_DISK_1: restore complete, elapsed time: 00:00:03
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: using network backup set from service cdb1
destination for restore of datafile 00006: /u01/oradata/CDB2/users01.dbf
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: using network backup set from service cdb1
destination for restore of datafile 00011: /u01/oradata/CDB2/valid/system01.dbf
channel ORA_DISK_1: restore complete, elapsed time: 00:00:15
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: using network backup set from service cdb1
destination for restore of datafile 00012: /u01/oradata/CDB2/valid/sysaux01.dbf
channel ORA_DISK_1: restore complete, elapsed time: 00:00:15
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: using network backup set from service cdb1
destination for restore of datafile 00016: /u01/oradata/CDB2/valid/users01.dbf
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01

starting media recovery

archived log for thread 1 with sequence 47 is already on disk as file /u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_47_9l0wzq03_.arc
archived log file name=/u01/fra/CDB2/archivelog/2014_03_12/o1_mf_1_47_9l0wzq03_.arc thread=1 sequence=47
media recovery complete, elapsed time: 00:00:01
Finished recover at 12.03.2014 15:02:06

Now everything looks a lot better:

DGMGRL> edit database cdb2 set state=apply-on;

DGMGRL> validate database cdb2;

  Database Role:     Physical standby database
  Primary Database:  cdb1

  Ready for Switchover:  Yes
  Ready for Failover:    Yes (Primary Running)

  Flashback Database Status:
    cdb1:  Off
    cdb2:  Off

  Future Log File Groups Configuration:
    Thread #  Online Redo Log Groups   Standby Redo Log Groups
              (cdb2)                   (cdb1)
    1         3                        0
    Warning: standby redo logs not configured for thread 3 on cdb1

DGMGRL>

This is confirmed from the standby as well:

SQL> select process,status,thread#,sequence#,block#,delay_mins from v$managed_standby;

PROCESS   STATUS          THREAD#  SEQUENCE#     BLOCK# DELAY_MINS
--------- ------------ ---------- ---------- ---------- ----------
ARCH      CLOSING               1         47      26624          0
ARCH      CONNECTED             0          0          0          0
ARCH      CONNECTED             0          0          0          0
ARCH      CONNECTED             0          0          0          0
RFS       IDLE                  0          0          0          0
RFS       IDLE                  0          0          0          0
RFS       IDLE                  1         48       7673          0
MRP0      APPLYING_LOG          1         48       7673          0

8 rows selected.

Excellent! We are back in sync.

About these ads

4 Responses to “Recovering a standby over the network in 12c”

  1. Hi Martin,
    I think you could have used just “restore database from service” if you had the db_file_name_convert set correctly… Did you test it?

    • Martin Bach said

      Hi Magnus,

      that’s what I wanted to try next. In theory it should work, but I skipped it initially because of this excerpt from the 12.1 documentation:

      “RMAN restores database files, over the network, from a physical standby database by using the FROM SERVICE clause of the RESTORE command. The FROM SERVICE clause provides the service name of the physical standby database from which the files must be restored. During the restore operation, RMAN creates backup sets, on the physical standby database, of the files that need to be restored and then transfers these backup sets to the target database over the network.”

      (http://docs.oracle.com/cd/E16655_01/backup.121/e17630/rcmadvre.htm#BRADV680)

      So to me that sounds that RESTORE … from service is intended to restore the PRIMARY, where RECOVER … from service is for the STANDBY. Could be unlucky wording, but that’s how I understood it.

      Martin

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
Follow

Get every new post delivered to your Inbox.

Join 2,368 other followers

%d bloggers like this: