Martins Blog

Trying to explain complex things in simple terms

RAC 12c enhancements: adding an additional SCAN-part 4

Posted by Martin Bach on May 26, 2014

This is going to be the last part of this series, however long it might end up being in the end. In the previous articles you read how to create a physical standby database from a RAC One database.

Networks (refresher)

To make it easier to follow without going back to the previous articles, here are the networks I’m using, listed for your convenience.

  • 192.168.100/24: Client network
  • 192.168.102/24: Dedicated Data Guard network

Data Guard Broker Configuration

I said it before and I say it again: I like the Data Guard broker interface for managing standby databases. It’s the ultimate simplification and so easy to implement that it’s quite safe to operate even in large organisations. As the added bonus you get OEM integration as well. OEM relies on a Broker configuration. The first step in managing the standby database therefore is to create the Data Guard configuration. And by the way, the documentation was correct, and the parameter “listener_networks” is already set! When the standby database starts you can see it being set by the Oracle agent process. The alert.log reads:

Starting background process DMON
DMON started with pid=34, OS id=11023
ORACLE_BASE from environment = /u01/app/oracle
Using default pga_aggregate_limit of 2048 MB
ALTER SYSTEM SET local_listener=' (ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.58)(PORT=1521))' SCOPE=MEMORY SID='sby_2';
2014-04-28 05:11:47.874000 -04:00
ALTER SYSTEM SET remote_listener=' ron12csby-scan.example.com:1521' SCOPE=MEMORY SID='sby_2';
ALTER SYSTEM SET listener_networks='(( NAME=net2)(LOCAL_LISTENER=(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.102.58)
(PORT=1522)))))','((NAME=net2)(REMOTE_LISTENER=ron12csby-dgscan.dg.example.com:1521))' SCOPE=MEMORY SID='sby_2';
2014-04-28 05:12:45.107000 -04:00
Decreasing number of real time LMS from 1 to 0

So no more need to worry about local listener (unless you have many local listeners and are not on 12c), remote_listener and listener_networks. That’s a big relief, to see that it does work. I can now create the configuration. I like the broker command line interface more than the OEM interface, especially since it is a lot easier to show here in the post.

Before you can work with the broker you need to enabled it. Set the following init.ora parameters to values similar to these below. It is important that these values are set identical on all instances. You also need to ensure the broker configuration files are in ASM.

SQL> show parameter dg_broker

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
dg_broker_config_file1               string      +DATA/ron/d1.dat
dg_broker_config_file2               string      +RECO/ron/d2.dat
dg_broker_start                      boolean     TRUE
SQL>

Make sure to make the settings on both clusters! Now connect using the command line interface and create the configuration. Note that I’m connecting using ronprinet2, which is the TNS name for the primary database using the Broker Network (192.168.102/24)

[oracle@ron12cprinode1 ~]$ dgmgrl sys/xxx@ronprinet2
DGMGRL for Linux: Version 12.1.0.1.0 - 64bit Production

Copyright (c) 2000, 2012, Oracle. All rights reserved.

Welcome to DGMGRL, type "help" for information.
Connected as SYSDBA.
DGMGRL> help create configuraiton

Creates a broker configuration

Syntax:

  CREATE CONFIGURATION <configuration name> AS
    PRIMARY DATABASE IS <database name>
    CONNECT IDENTIFIER IS <connect identifier>;

DGMGRL> create configuration dgtest as primary database is "RON" connect identifier is "ronprinet2";
Configuration "dgtest" created with primary database "RON"

DGMGRL> show database verbose "RON"

Database - RON

  Role:              PRIMARY
  Intended State:    OFFLINE
  Instance(s):
    pri_2

  Properties:
    DGConnectIdentifier             = 'ronprinet2'
    ObserverConnectIdentifier       = ''
    LogXptMode                      = 'ASYNC'
    RedoRoutes                      = ''
    DelayMins                       = '0'
    Binding                         = 'optional'
    MaxFailure                      = '0'
    MaxConnections                  = '1'
    ReopenSecs                      = '300'
    NetTimeout                      = '30'
    RedoCompression                 = 'DISABLE'
    LogShipping                     = 'ON'
    PreferredApplyInstance          = ''
    ApplyInstanceTimeout            = '0'
    ApplyLagThreshold               = '0'
    TransportLagThreshold           = '0'
    TransportDisconnectedThreshold  = '30'
    ArchiveLagTarget                = '0'
    LogArchiveMaxProcesses          = '4'
    LogArchiveMinSucceedDest        = '1'
    FastStartFailoverTarget         = ''
    InconsistentProperties          = '(monitor)'
    InconsistentLogXptProps         = '(monitor)'
    SendQEntries                    = '(monitor)'
    LogXptStatus                    = '(monitor)'
    RecvQEntries                    = '(monitor)'
    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.51)
(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=ron_DGMGRL)(INSTANCE_NAME=pri_2)(SERVER=DEDICATED)))'
    StandbyArchiveLocation          = 'USE_DB_RECOVERY_FILE_DEST'
    AlternateLocation               = ''
    LogArchiveTrace                 = '0'
    LogArchiveFormat                = '%t_%s_%r.dbf'
    TopWaitEvents                   = '(monitor)'

Database Status:
DISABLED

DGMGRL>

A few more things worth mentioning here: first of all the database does not know about the other instance. That’s not to worry about: you will see how this is addressed during the online relocation. The StaticConnectIdentifier is also pointing to the wrong network. Let’s have a look at the only instance:

DGMGRL> show instance verbose pri_2

Instance 'pri_2' of database 'RON'

  Host Name: ron12cprinode1.example.com
  PFILE:
  Properties:
    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.51)(PORT=1521))
(CONNECT_DATA=(SERVICE_NAME=ron_DGMGRL)(INSTANCE_NAME=pri_2)(SERVER=DEDICATED)))'
    StandbyArchiveLocation          = 'USE_DB_RECOVERY_FILE_DEST'
    AlternateLocation               = ''
    LogArchiveTrace                 = '0'
    LogArchiveFormat                = '%t_%s_%r.dbf'
    TopWaitEvents                   = '(monitor)'

Instance Status:
DISABLED

Unsurprisingly the StaticConnectIdentifier is pointing to the wrong interface again. Let’s correct this.

DGMGRL> edit instance pri_2 set property StaticConnectIdentifier=
> '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.102.51)(PORT=1522))
> (CONNECT_DATA=(SERVICE_NAME=ron_DGMGRL)(INSTANCE_NAME=pri_2)(SERVER=DEDICATED)))';
Property "staticconnectidentifier" updated

This command updated the setting on the database and instance level. OK, let’s move on. I need to add the standby database.

DGMGRL> add database "RONDG" AS CONNECT IDENTIFIER IS ronsbynet2 MAINTAINED AS PHYSICAL;
Database "RONDG" added

DGMGRL> show database verbose "RONDG"

Database - RONDG

  Role:              PHYSICAL STANDBY
  Intended State:    OFFLINE
  Transport Lag:     (unknown)
  Apply Lag:         (unknown)
  Apply Rate:        (unknown)
  Real Time Query:   OFF
  Instance(s):
    sby_2

  Properties:
    DGConnectIdentifier             = 'ronsbynet2'
    ObserverConnectIdentifier       = ''
    LogXptMode                      = 'ASYNC'
    RedoRoutes                      = ''
    DelayMins                       = '0'
    Binding                         = 'optional'
    MaxFailure                      = '0'
    MaxConnections                  = '1'
    ReopenSecs                      = '300'
    NetTimeout                      = '30'
    RedoCompression                 = 'DISABLE'
    LogShipping                     = 'ON'
    PreferredApplyInstance          = ''
    ApplyInstanceTimeout            = '0'
    ApplyLagThreshold               = '0'
    TransportLagThreshold           = '0'
    TransportDisconnectedThreshold  = '30'
    ApplyParallel                   = 'AUTO'
    StandbyFileManagement           = 'AUTO'
    ArchiveLagTarget                = '0'
    LogArchiveMaxProcesses          = '4'
    LogArchiveMinSucceedDest        = '1'
    DbFileNameConvert               = ''
    LogFileNameConvert              = ''
    FastStartFailoverTarget         = ''
    InconsistentProperties          = '(monitor)'
    InconsistentLogXptProps         = '(monitor)'
    SendQEntries                    = '(monitor)'
    LogXptStatus                    = '(monitor)'
    RecvQEntries                    = '(monitor)'
    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.58)
(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rondg_DGMGRL)(INSTANCE_NAME=sby_2)(SERVER=DEDICATED)))'
    StandbyArchiveLocation          = 'USE_DB_RECOVERY_FILE_DEST'
    AlternateLocation               = ''
    LogArchiveTrace                 = '0'
    LogArchiveFormat                = '%t_%s_%r.dbf'
    TopWaitEvents                   = '(monitor)'

Database Status:
DISABLED

DGMGRL> show instance verbose sby_2

Instance 'sby_2' of database 'RONDG'

  Host Name: ron12csbynode1.example.com
  PFILE:
  Properties:
    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.58)(PORT=1521))
(CONNECT_DATA=(SERVICE_NAME=rondg_DGMGRL)(INSTANCE_NAME=sby_2)(SERVER=DEDICATED)))'
    StandbyArchiveLocation          = 'USE_DB_RECOVERY_FILE_DEST'
    AlternateLocation               = ''
    LogArchiveTrace                 = '0'
    LogArchiveFormat                = '%t_%s_%r.dbf'
    TopWaitEvents                   = '(monitor)'

Instance Status:
DISABLED

Again let’s fix the static connection identifier.

DGMGRL> edit instance sby_2 set property StaticConnectIdentifier =
> '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.102.58)(PORT=1522))(CONNECT_DATA=(SERVICE_NAME=rondg_DGMGRL)
(INSTANCE_NAME=sby_2)(SERVER=DEDICATED)))';
Property "staticconnectidentifier" updated

Now everything is in place, I can enable the configuration.

DGMGRL> enable configuration
Enabled.
DGMGRL> show configuration

Configuration - dgtest

  Protection Mode: MaxPerformance
  Databases:
  RON   - Primary database
    RONDG - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

DGMGRL>

If you haven’t set the standby file management property to auto yet, please do so. I really don’t like MRP0 aborting because a datafile UNNAMED0000010 has been added. This causes confusion that’s better avoided.

If you get errors about missing standby redo logs (SRLs), please add them. I have dones so on the standby and primary, but didn’t include the output here.

Switchover

That should be all that’s needed in preparation. Since I’m impatient I wanted to see if the switchover works. First I check the configuration:

DGMGRL> show configuration

Configuration - dgtest

  Protection Mode: MaxPerformance
  Databases:
  RON   - Primary database
    RONDG - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

Success means: ready to switch over. Let’s try:

DGMGRL> switchover to "RONDG"
Performing switchover NOW, please wait...
Operation requires a connection to instance "sby_2" on database "RONDG"
Connecting to instance "sby_2"...
Connected as SYSDBA.
New primary database "RONDG" is opening...
Operation requires startup of instance "pri_2" on database "RON"
Starting instance "pri_2"...
ORACLE instance started.
Database mounted.
Switchover succeeded, new primary is "RONDG"
DGMGRL> show configuration

Configuration - dgtest

  Protection Mode: MaxPerformance
  Databases:
  RONDG - Primary database
    RON   - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

Great so that worked. It was important that the StaticConnectIdentifiers are set correctly. During testing I found that relying on DGConnectIdentifier did not work, when it pointed to the second SCAN listener. The SCAN listeners do not “know” about the statically registered %_DGMGRL services, and you receive the dreaded “listener does not know about service name in connect descriptor” error. I also had to pick a listener for the StaticConnectIdentifier. I wanted to have both instances as potential targets for starting the instance, but that failed too. So in a way I map node 1 to instance sby_2 by pointing the broker to its listener.

Online Relocation

Now on to part 2: what happens to the standby when I perform an online relocation on the primary. Let’s try.

[oracle@ron12csbynode1 admin]$ srvctl status database -d rondg
Instance sby_2 is running on node ron12csbynode1
Online relocation: INACTIVE
[oracle@ron12csbynode1 admin]$ srvctl status service -d rondg
Service ron12c is running on instance(s) sby_2

[oracle@ron12csbynode1 admin]$ srvctl relocate database -d rondg -timeout 2 -verbose -node ron12csbynode2
Configuration updated to two instances
Instance sby_1 started
Services relocated
Waiting for up to 2 minutes for instance sby_2 to stop ...
Instance sby_2 stopped
Configuration updated to one instance
[oracle@ron12csbynode1 admin]$

In the meantime I connected to the primary (while it was relocating) and switched a few logs. This works, as you can see (RON is the standby database now)

2014-05-23 10:30:16.943000 -04:00
Recovery of Online Redo Log: Thread 1 Group 6 Seq 119 Reading mem 0
  Mem# 0: +DATA/RON/ONLINELOG/group_6.280.846066567
  Mem# 1: +RECO/RON/ONLINELOG/group_6.330.846066593
2014-05-23 10:30:42.897000 -04:00
Archived Log entry 226 added for thread 1 sequence 119 ID 0x64c27dbf dest 1:
2014-05-23 10:30:49.482000 -04:00
Media Recovery Waiting for thread 1 sequence 120 (in transit)
2014-05-23 10:30:52.043000 -04:00
Primary database is in MAXIMUM PERFORMANCE mode
Re-archiving standby log 8 thread 2 sequence 97
2014-05-23 10:31:02.879000 -04:00
Recovery of Online Redo Log: Thread 1 Group 5 Seq 120 Reading mem 0
  Mem# 0: +DATA/RON/ONLINELOG/group_5.279.846066497
  Mem# 1: +RECO/RON/ONLINELOG/group_5.329.846066535
2014-05-23 10:31:12.400000 -04:00
RFS[8]: Assigned to RFS process (PID:13182)
RFS[8]: Selected log 9 for thread 2 sequence 98 dbid 1681008808 branch 838874793
2014-05-23 10:31:14.987000 -04:00
Archived Log entry 227 added for thread 2 sequence 97 ID 0x64c27dbf dest 1:
2014-05-23 10:31:50.408000 -04:00
Media Recovery Waiting for thread 2 sequence 98 (in transit)

2014-05-23 10:32:02.193000 -04:00
RFS[8]: Selected log 8 for thread 2 sequence 99 dbid 1681008808 branch 838874793
2014-05-23 10:32:05.872000 -04:00
Recovery of Online Redo Log: Thread 2 Group 9 Seq 98 Reading mem 0
  Mem# 0: +DATA/RON/ONLINELOG/group_9.283.846066895
  Mem# 1: +RECO/RON/ONLINELOG/group_9.333.846066947

The logs are from 2 threads-the relocation temporarily changes the single instance database to a cluster database by starting instances on both hosts.

Eventually instance sby_2 stops and causes TNS errors in the standby’s alert.log:

***********************************************************************

Fatal NI connect error 12528, connecting to:
 (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=ron12csby-dgscan.dg.example.com)(PORT=1521))
(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=RONDG_DGB)(INSTANCE_NAME=sby_2)(CID=(PROGRAM=oracle)
(HOST=ron12cprinode1.example.com)(USER=oracle))))

  VERSION INFORMATION:
TNS for Linux: Version 12.1.0.1.0 - Production
Oracle Bequeath NT Protocol Adapter for Linux: Version 12.1.0.1.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 12.1.0.1.0 - Production
  Time: 23-MAY-2014 10:34:25
  Tracing not turned on.
  Tns error struct:
    ns main err code: 12564
...

Nothing to be alarmed about, this is normal, and shown in the output of the relocate command.

The good news is that the primary receives more redo:

2014-05-23 10:35:40.756000 -04:00
RFS[9]: Assigned to RFS process (PID:13329)
RFS[9]: Selected log 9 for thread 2 sequence 100 dbid 1681008808 branch 838874793
2014-05-23 10:36:08.256000 -04:00
Archived Log entry 231 added for thread 2 sequence 100 ID 0x64c27dbf dest 1:
2014-05-23 10:36:14.754000 -04:00
Media Recovery Waiting for thread 2 sequence 101 (in transit)
2014-05-23 10:36:15.972000 -04:00
RFS[7]: Opened log for thread 2 sequence 101 dbid 1681008808 branch 838874793
2014-05-23 10:36:20.162000 -04:00
Archived Log entry 232 added for thread 2 sequence 101 rlc 838874793 ID 0x64c27dbf dest 2:
2014-05-23 10:36:29.656000 -04:00
Media Recovery Log +RECO/RON/ARCHIVELOG/2014_05_23/thread_2_seq_101.393.848313375

And after instance 2 is shut down, you can see redo from thread 1 being transferred.

2014-05-23 10:46:58.257000 -04:00
RFS[6]: Selected log 5 for thread 1 sequence 122 dbid 1681008808 branch 838874793
Archived Log entry 233 added for thread 1 sequence 121 ID 0x64c27dbf dest 1:
2014-05-23 10:47:00.479000 -04:00
Media Recovery Waiting for thread 1 sequence 122 (in transit)
2014-05-23 10:47:03.686000 -04:00
Recovery of Online Redo Log: Thread 1 Group 5 Seq 122 Reading mem 0
  Mem# 0: +DATA/RON/ONLINELOG/group_5.279.846066497
  Mem# 1: +RECO/RON/ONLINELOG/group_5.329.846066535

If you are unsure about the thread to instance mapping, check v$thread, this example is from the primary:

SQL> select thread#, status, enabled, instance from v$thread;

   THREAD# STATUS ENABLED  INSTANCE
---------- ------ -------- ------------------------------
         1 OPEN   PUBLIC   sby_1
         2 CLOSED PUBLIC   sby_2

In summary my test showed that an online relocation does not seem to cause trouble for the standby database. I need to run a benchmark like Swingbench against it to see how the online relocation behaves when the system is under load though, the database was pretty much idle during the online relocation.

Amending the Broker configuration

Circling back to the introduction: what happens to the DG broker configuration after the database has been a RAC database (albeit briefly)? Here is the output from the configuration after the first relocation:

DGMGRL> show database verbose "RONDG"

Database - RONDG

  Role:              PRIMARY
  Intended State:    TRANSPORT-ON
  Instance(s):
    sby_1
    sby_2

  Properties:
    DGConnectIdentifier             = 'ronsbynet2'
    ObserverConnectIdentifier       = ''
    LogXptMode                      = 'ASYNC'
    RedoRoutes                      = ''
    DelayMins                       = '0'
    Binding                         = 'optional'
    MaxFailure                      = '0'
    MaxConnections                  = '1'
    ReopenSecs                      = '300'
    NetTimeout                      = '30'
    RedoCompression                 = 'DISABLE'
    LogShipping                     = 'ON'
    PreferredApplyInstance          = ''
    ApplyInstanceTimeout            = '0'
    ApplyLagThreshold               = '0'
    TransportLagThreshold           = '0'
    TransportDisconnectedThreshold  = '30'
    ApplyParallel                   = 'AUTO'
    StandbyFileManagement           = 'AUTO'
    ArchiveLagTarget                = '0'
    LogArchiveMaxProcesses          = '4'
    LogArchiveMinSucceedDest        = '1'
    DbFileNameConvert               = ''
    LogFileNameConvert              = ''
    FastStartFailoverTarget         = ''
    InconsistentProperties          = '(monitor)'
    InconsistentLogXptProps         = '(monitor)'
    SendQEntries                    = '(monitor)'
    LogXptStatus                    = '(monitor)'
    RecvQEntries                    = '(monitor)'
    StaticConnectIdentifier(*)
    StandbyArchiveLocation(*)
    AlternateLocation(*)
    LogArchiveTrace(*)
    LogArchiveFormat(*)
    TopWaitEvents(*)
    (*) - Please check specific instance for the property value

Database Status:
SUCCESS

Notice how the second instance (sby_1) has been added. The first question I had: what’s it done to the connection identifiers? As expected it’s set to the wrong network for the newly added instance:

DGMGRL> show instance verbose sby_1;

Instance 'sby_1' of database 'RONDG'

  Host Name: ron12csbynode2.example.com
  PFILE:
  Properties:
    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.60)(PORT=1521))
(CONNECT_DATA=(SERVICE_NAME=rondg_DGMGRL)(INSTANCE_NAME=sby_1)(SERVER=DEDICATED)))'
    StandbyArchiveLocation          = 'USE_DB_RECOVERY_FILE_DEST'
    AlternateLocation               = ''
    LogArchiveTrace                 = '0'
    LogArchiveFormat                = '%t_%s_%r.dbf'
    TopWaitEvents                   = '(monitor)'

Instance Status:
SUCCESS

So that needs to be changed as well. It would make sense to systematically perform a relocation as part of the build of the Data Guard environment on primary and standby to register all instances and update them. Not that it’s a big issue, the switchover is most likely not going to fail, but you have to manually start the database if the connection information is wrong.

Summary

It’s been quite a journey! This article series explained how to add a second SCAN to the environment including all the other infrastructure such as VIPs and listeners. You then saw how to duplicate a database for Data Guard and finally

About these ads

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
Follow

Get every new post delivered to your Inbox.

Join 2,417 other followers

%d bloggers like this: