Category Archives: 12c Release 1

A quick look at Oracle 12.2 performance instrumentation

Thanks to the prep-work done by Andy Colvin, who continuously and boldly goes where no man has gone before, one of our Exadata systems in the lab is now fully upgraded to 12.2.0.1. It comes fully equipped with the matching storage cell software to support all the cool new features. Exciting times!

The reason for this post is simple: I have started working on our talk for @Enkitec’s E4 conference in June but thought I’d share this little article with you as a teaser :) There might be one or two more of these posts but if you want the full story make sure you catch us (online) during the conference.

More diagnostic information in 12.2

The Oracle database truly champions diagnosability in a way I have not seen with any other system, and it does so out of the box. Granted, some of that requires an investment in extra-cost options, but the value one gets in the form of Active Session History (ASH) and Automatic Workload Repository (AWR) is real. After I read the chapter on instrumentation in “Oracle Insights: Tales of the Oak Table” (Apress) a long time ago, I started to realise the inherent beauty of having insight into code. That code can be your own, if you instrumented it properly, or the Oracle code path externalised as wait events. Sadly most application developers do not adhere to the principle of instrumenting code (or maybe don’t even know about the possibility?) and therefore complicate troubleshooting unnecessarily. The latter is not so much an issue on platforms that don’t have an equivalent of the Oracle Wait Interface and session counters anyway, but on Oracle it’s a real wasted opportunity, as others have pointed out before me.

I’ll now take my idealist hat off and focus on the real world :) In the far more common case where the application isn’t instrumented by the developer, you can still come to some conclusions by using the Wait Interface and session counters. In most scenarios I am involved in, the first step is to use Tanel Poder’s session snapper script, which gives me insight into both.

So what has changed in this respect with 12.2? The test environment I am using is an Exadata quarter rack as mentioned before. The findings should be comparable with other Oracle software offerings, in the cloud and on premises.

Wait Interface

The Wait Interface is one of the most useful features for the performance engineer, and one of the building blocks for Statspack, ASH and AWR. I was curious whether new events have been introduced in 12.2, hence this investigation. The sheer number of events tracked in the database prevents them from being listed individually in this post. If you want to, you can use Tanel’s “sed.sql” to find out more, or simply query v$event_name.

An interesting tidbit for 12.2 has been covered by @FranckPachot: some of the more misleading event names, such as db file scattered read and db file sequential read, have been clarified in 12.2. Search for events where the name is not equal to the display_name, or read Franck’s post on the DBI blog.
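If you want to see those for yourself, a query along the following lines should do (a quick sketch; name and display_name are both documented columns of v$event_name in 12c):

select name, display_name, wait_class
from v$event_name
where name <> display_name
order by wait_class, name;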

Wait events in 11.2.0.3

I like to start comparisons with a benchmark, and 11.2.0.3 seems to be a good candidate. Just looking at the wait_classes and counting events per class should be a good starting point:

SQL> select count(*), wait_class 
  2  from v$event_name 
  3  group by rollup(wait_class) 
  4  order by wait_class;

   COUNT(*) WAIT_CLASS
----------- ----------------------------------------------------------------
         55 Administrative
         17 Application
         50 Cluster
          2 Commit
         33 Concurrency
         24 Configuration
         95 Idle
         35 Network
        745 Other
          9 Queueing
          8 Scheduler
         31 System I/O
         48 User I/O
       1152

14 rows selected.

So there are 1152 events in total in 11.2.0.3; keep that number in mind.

Wait events in 12.1.0.2

In my opinion 12.1 is a major step ahead, and I have said it many times: I haven’t seen so many fundamental changes to the Oracle database in a long time. For various reasons, though, 12.1 hasn’t seen the traction in the field it probably deserved. Note how the diagnosability has been greatly enhanced:

SQL> select count(*), wait_class 
  2  from v$event_name 
  3  group by rollup(wait_class) 
  4  order by wait_class;

   COUNT(*) WAIT_CLASS
----------- ----------------------------------------------------------------
         57 Administrative
         17 Application
         64 Cluster
          4 Commit
         38 Concurrency
         26 Configuration
        121 Idle
         28 Network
       1186 Other
          9 Queueing
          9 Scheduler
         35 System I/O
         56 User I/O
       1650

14 rows selected.

A quick calculation reveals that 12.1 features 498 additional events compared to 11.2.0.3. Not too bad in my opinion. Unfortunately most of these additional events ended up in the “Other” wait class. This is a real shame, and I have remarked on this before when blogging about the new Data Guard wait events: they probably should have ended up in the Network class instead. I am sure there are other examples like this.

Wait events in 12.2.0.1

Now what about 12.2? First let’s start with the list:

SQL> select count(*), wait_class
  2  from v$event_name
  3  group by rollup(wait_class)
  4  order by wait_class;

  COUNT(*) WAIT_CLASS
---------- ----------------------------------------------------------------
        57 Administrative
        17 Application
        68 Cluster
         4 Commit
        49 Concurrency
        27 Configuration
       135 Idle
        29 Network
      1314 Other
         9 Queueing
        10 Scheduler
        35 System I/O
        57 User I/O
      1811

14 rows selected.

There are indeed some new events, and most of them can be found in the Other wait class. Again, this is quite unfortunate as it prevents the performance architect from identifying unknown wait events quickly.

I have decided to keep this post short-ish and will spend more time some other day investigating the exact differences between 12.1.0.2 and 12.2.0.1. Most likely after E4 this year.

Session Counters

With the wait interface covered, it’s time to move on to the session counters. Continuing the approach I took with wait events, I will group all session counters by class. Instead of re-inventing the wheel I am using a slightly adapted version of Tanel Poder’s “mys.sql” script to group counters by class. Most of them fall into just one class, but there are others where more than one class is applicable. The 12.2 Reference Guide explains v$statname.class in more detail.

Session counters in 11.2.0.3

Before investigating 12.1 and 12.2 I’ll look at 11.2.0.3 first, as in the first section of the article. The SQL statement I used is this:

with classification as (
select name, TRIM(
  CASE WHEN BITAND(class,  1) =   1 THEN 'USER  ' END ||
  CASE WHEN BITAND(class,  2) =   2 THEN 'REDO  ' END ||
  CASE WHEN BITAND(class,  4) =   4 THEN 'ENQ   ' END ||
  CASE WHEN BITAND(class,  8) =   8 THEN 'CACHE ' END ||
  CASE WHEN BITAND(class, 16) =  16 THEN 'OS    ' END ||
  CASE WHEN BITAND(class, 32) =  32 THEN 'RAC   ' END ||
  CASE WHEN BITAND(class, 64) =  64 THEN 'SQL   ' END ||
  CASE WHEN BITAND(class,128) = 128 THEN 'DEBUG ' END
) class_name 
from v$statname
) select count(*), class_name from classification 
group by rollup(class_name)
order by class_name;

Executed on an 11.2.0.3 database this reveals the following numbers:

   COUNT(*) CLASS_NAME
----------- ------------------------------------------------
        121 CACHE
         27 CACHE RAC
         15 CACHE SQL
        188 DEBUG
          9 ENQ
         16 OS
         25 RAC
         32 REDO
         93 SQL
          2 SQL   DEBUG
        107 USER
          3 USER  RAC
        638

13 rows selected.

638 of them altogether. Keep the number in mind when moving to 12.1.

Session counters in 12.1.0.2

There was a major increase in the number of counters between 11.2.0.3 and 12.1.0.2. Consider the numbers:

  COUNT(*) CLASS_NAME
---------- ------------------------------------------------
       151 CACHE
        53 CACHE RAC
        15 CACHE SQL
       565 DEBUG
         9 ENQ
        16 OS
        35 RAC
        68 REDO
         1 REDO  RAC
       130 SQL
         2 SQL   DEBUG
       130 USER
         3 USER  RAC
      1178

14 rows selected.

That nearly doubles the number of counters available. Note that quite a few of the new counters fall into the DEBUG section. Looking a bit more closely, you can see they seem to be used by the In-Memory (IM) option:

SQL>  SELECT
  2      regexp_substr(name,'\w+ \w+') AS short_name,
  3      COUNT(*)
  4  FROM
  5      v$statname
  6  WHERE
  7      class = 128
  8  GROUP BY
  9      regexp_substr(name,'\w+ \w+')
 10  ORDER BY
 11      2 DESC
 12  FETCH FIRST 5 ROWS ONLY;

SHORT_NAME                        COUNT(*)
------------------------------ -----------
IM repopulate                           49
IM scan                                 47
IM populate                             37
spare statistic                         35
IM space                                26

5 rows selected.

There are 198 session counters beginning with ‘IM %’. I can feel another post about DB In Memory coming …
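The 198 quoted above is simply a count of the statistics whose name starts with ‘IM ’, something along these lines:

select count(*)
from v$statname
where name like 'IM %';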

Session counters in 12.2.0.1

Finally, here is the list of statistics in 12.2.0.1:

  COUNT(*) CLASS_NAME
---------- ------------------------------------------------
       174 CACHE
        73 CACHE RAC
        15 CACHE SQL
      1067 DEBUG
         9 ENQ
        16 OS
        35 RAC
        75 REDO
         1 REDO  RAC
       190 SQL
         2 SQL   DEBUG
       144 USER
         3 USER  RAC
      1804

Another 626 additional counters; that’s almost as many as were available in total on the 11.2.0.3 system! Running my previous query again, it seems that IM-related statistics dominate, but there are lots of others of interest.

As with the 12.2 wait events I don’t want to give too much away at this point (and the post is long enough anyway) so stay tuned for an update at a later time.

Summary

Oracle has been one of the best database engines around, and with 12.2.0.1 instrumentation is further improved. This post has again become too long, so I’ll stop here and defer the write-up of my investigation into those stats relevant for Exadata to a later point.

Happy troubleshooting!


Little things worth knowing: Executing RDA on RAC

Result! I have finally been able to gather a complete RDA (Oracle Remote Diagnostic Agent) output on my 2-node RAC system. After consulting the relevant documentation on MOS (which is spread over at least 42 Doc IDs) I found it not to be very helpful, to the degree that some of what I read is actually wrong or contradictory. I put together a short note, primarily to myself, to remind me of the process. I hope you find it useful, too.

The RDA version I used for this post is 8.14.x, downloaded from MOS on March 4th, 2017. My RAC nodes are based on Oracle Linux 7.3/UEK 4.

Starting the data collection

RDA is one of the standard tools I use, and I have previously blogged about it. RDA started off as a simple-to-use tool. After having used it for some years I started to run into issues with the interactive configuration, which simply took too long to complete. As soon as I learned about profiles I fell in love with them. RDA profiles only prompt you for relevant information about the product you want to collect data for.

Running RDA on a RAC system is similar to single instance, and profiles are used automatically, which is a nice touch. It appears to me that RAC data collection is triggered via SSH on the remote nodes, with the results being transferred to the local node.

I believe there are two methods of data gathering on RAC: one for Grid Infrastructure, and another one for the RDBMS layer. Depending on your settings of ORACLE_HOME and ORACLE_SID different parts of the stack are analysed. In this post I was primarily interested in running RDA for the RDBMS part of my cluster.

I started off by downloading the current RDA version to the system I want to run it on. This is a 2-node RAC, with nodes named rac12pri1 and rac12pri2. Both are based on Oracle Linux 7.3 with the most current UEK4 at the time of writing. My RDBMS homes are 12.1.0.2 patched with the January 2017 proactive bundle patch + OJVM.

The first caveat looms right at the beginning of the entire process: I didn’t find it stated explicitly in the documentation on MOS, but it seems that you need to deploy RDA on all nodes in the cluster, and in the exact same location, before you start the data collection. I use /home/oracle/rda for that purpose. Make sure you have plenty of space in the location you choose; sometimes /home doesn’t provide enough for larger, more active systems.
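Staging the software yourself is simple enough. Here is a minimal sketch of what I mean, assuming passwordless SSH as the oracle user, an RDA archive named rda_8.14.zip (a placeholder for whatever you downloaded from MOS) in /home/oracle, and that the archive extracts into a directory named rda:

[oracle@rac12pri1 ~]$ unzip -q rda_8.14.zip -d /home/oracle
[oracle@rac12pri1 ~]$ scp -rpq /home/oracle/rda rac12pri2:/home/oracle/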

During my first unsuccessful attempts I didn’t deploy RDA on all nodes before starting the data collection, only to see it not gather any information from the remote node. This is somewhat confusing, because the output (when collecting data) states this:

NOD002: Installing RDA software

NOD002 is short for the second node in my cluster.

Also, when configuring the data gathering process, you will see a prompt like this one:

Enter the absolute path of the directory where RDA will be installed on the remote nodes.

To the casual observer like me these messages suggest that RDA is actually installed on the remote nodes as part of the data collection process, but the installation is only partial. Comparing directory sizes between remote and local node revealed /home/oracle/rda to be greater than 200 MB locally, while on the remote nodes it was only 37 MB in size. Something seems to be missing…

Once deployed, you can change to the RDA directory and prepare for the data collection. You actually execute RDA on the first node; the rest of the work is done programmatically. The MOS note seems to be correct this time. Here is an example of my configuration session:

[oracle@rac12pri1 rda]$ ./rda.pl -vXRda start CLOUD -pRac_Assessment
Creating collection "output" ...
        - Trying to identify the domain ... (can take time)
 Inside DFT scenario, define the input and profile targets
 Inside DFT scenario, check Oracle home or Middleware presence
        - RDA:DCbegin ...
------------------------------------------------------------------------------
RDA.BEGIN: Initializes the Data Collection
------------------------------------------------------------------------------
Enter the Oracle home to be used for data analysis
Press Return to accept the default (/u01/app/oracle/product/12.1.0.2/dbhome_1)
>

        - RDA:DCconfig ...
------------------------------------------------------------------------------
RDA.CONFIG: Collects Key Configuration Information
------------------------------------------------------------------------------
        - RDA:DCocm ...
------------------------------------------------------------------------------
RDA.OCM: Set up the Configuration Manager Interface
------------------------------------------------------------------------------
        - RDA:DCstatus ...
------------------------------------------------------------------------------
RDA.STATUS: Produces the Remote Data Collection Reports
------------------------------------------------------------------------------
        - RDA:DCload ...
------------------------------------------------------------------------------
RDA.LOAD: Produces the External Collection Reports
------------------------------------------------------------------------------
        - RDA:DCfilter ...
------------------------------------------------------------------------------
RDA.FILTER: Controls Report Content Filtering
------------------------------------------------------------------------------
        - RDA:DCend ...
------------------------------------------------------------------------------
RDA.END: Finalizes the Data Collection
------------------------------------------------------------------------------
In CRS start scenario, getting nodes using /u01/app/12.1.0.2/grid/bin/olsnodes
2>/dev/null
------------------------------------------------------------------------------
Requesting common cluster information
------------------------------------------------------------------------------
Enter the absolute path of the directory where RDA will be installed on the
remote nodes.
Press Return to accept the default (/home/oracle/rda)
>

Do you want RDA to use an alternative login to execute remote requests (Y/N)?
Press Return to accept the default (N)
>

In CRS start scenario, getting local node information
In CRS start scenario, pinging node rac12pri2
------------------------------------------------------------------------------
Requesting information for node rac12pri1
------------------------------------------------------------------------------
Enter the Oracle home to be analyzed on the node rac12pri1
Press Return to accept the default (/u01/app/oracle/product/12.1.0.2/dbhome_1)
>

Enter the Oracle SID to be analyzed on the node rac12pri1
Press Return to accept the default (NCDBA1)
>

------------------------------------------------------------------------------
Requesting information for node rac12pri2
------------------------------------------------------------------------------
Enter the Oracle home to be analyzed on the node rac12pri2
Press Return to accept the default (/u01/app/oracle/product/12.1.0.2/dbhome_1)
>

Enter the Oracle SID to be analyzed on the node rac12pri2
> NCDBA2

        - RDA:DCremote ...
------------------------------------------------------------------------------
RAC Setup Summary
------------------------------------------------------------------------------
Nodes:
. NOD001  rac12pri1/NCDBA1
. NOD002  rac12pri2/NCDBA2
2 nodes found

So RDA understands the RAC scenario, and it gathers data for each node that is part of the cluster as reported by the olsnodes tool. There was nothing really surprising about the prompts, except that I haven’t found a way to analyse more than one database, or ASM and a database together.

Be careful to enter the correct ORACLE_SID for your (remote) RAC nodes. RDA picks up your ORACLE_HOME and ORACLE_SID if they are set.
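In other words, set your environment before starting the configuration so that RDA offers the defaults you actually want. In my lab that looks roughly like this (paths and SID taken from the session shown above):

[oracle@rac12pri1 rda]$ export ORACLE_HOME=/u01/app/oracle/product/12.1.0.2/dbhome_1
[oracle@rac12pri1 rda]$ export ORACLE_SID=NCDBA1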

Optionally verify the correct number of nodes was detected

I am only mentioning this for the sake of completeness; the RAC Setup Summary you saw in the previous step was more than adequate for me. If you really want to find out about the nodes you are about to collect information from, be careful: MOS Doc ID 359395.1 is wrong. The command ./rda.pl -vC RDA.REMOTE list will not only list the nodes, it will also start the data collection. Use this one instead, which I found in MOS Doc ID 1682909.1:

[oracle@rac12pri1 rda]$ ./rda.pl -XRemote list
Defined nodes:
  NOD001  rac12pri1  NCDBA1
  NOD002  rac12pri2  NCDBA2
[oracle@rac12pri1 rda]$

It merely lists the nodes, without actually starting to do any work.

Initiate data collection

When you are happy with the output and configuration, start collecting data. Here is an example of my session:

[oracle@rac12pri1 rda]$ ./rda.pl -v -e TRC/TRACE=1
Collecting diagnostic data ...
------------------------------------------------------------------------------
RDA Data Collection Started 04-Mar-2017 16:10:15
------------------------------------------------------------------------------
Processing RDA.BEGIN module ...
 Inside BEGIN module, testing the RDA engine code build
 Inside BEGIN module, testing the report directory
 Inside BEGIN module, testing the module targets
 Inside BEGIN module, launching parallel executions
Processing RDA.CONFIG module ...
 Inside CONFIG module, listing Oracle homes
 Inside CONFIG module, getting Oracle home inventory (can take time)
Processing RDA.REMOTE module ...
NOD001: Detecting storage type
NOD002: Detecting storage type
NOD001: Running RDA command
NOD002: Installing RDA software
NOD002: Running RDA command
NOD002: Transfering report package
NOD001: Transfering report package
Processing RDA.END module ...
 Inside END module, gathering system information
 Inside END module, getting CPU information (linux)
 Inside END module, getting memory information (linux)
 Inside END module, producing the file catalog
 Inside END module, producing target overview
 Inside END module, waiting for parallel execution completion
 Inside END module, producing setting overview
------------------------------------------------------------------------------
RDA Data Collection Ended 04-Mar-2017 16:17:44
------------------------------------------------------------------------------
Generating the reports ...
        - collect/RDA_CONFIG_homes.txt ...
        - collect/RDA_CONFIG_oh_inv.txt ...
        - collect/RDA_END_system.txt ...
        - collect/RDA_END_files.txt ...
        - collect/RDA_END_target.txt ...
        - collect/RDA_END_report.txt ...
        - Report index ...
Packaging collection results ...

[...skipping a lot of text...]

  You can review the reports by transferring the /home/oracle/rda/output
  directory structure to a location where you have web-browser access.
Then, point your browser at this file to display the reports:
    RDA__start.htm

[...skipping a lot of text...]

It is crucially important to see these lines:

Processing RDA.REMOTE module ...
NOD001: Detecting storage type
NOD002: Detecting storage type
NOD001: Running RDA command
NOD002: Installing RDA software
NOD002: Running RDA command
NOD002: Transfering report package
NOD001: Transfering report package

In my first attempts, when I didn’t deploy RDA on all nodes myself, the lines “NOD002: Running RDA command” and “NOD002: Transfering report package” weren’t shown. Unsurprisingly no data was gathered on the remote nodes.

Viewing the output

At the end of the data collection you should see a *.tar.gz file per node. In my 2 node cluster setup, there are two:

[oracle@rac12pri1 rda]$ ls output/remote/
RDA_nod001_output.tar.gz  RDA_nod002_output.tar.gz

You can view these in your browser after extracting them to a temporary location; start with the file named RDA__start.htm, which presents the various parts of the report.
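Extracting one of them could look like this (the destination directory is arbitrary, pick whatever suits you):

[oracle@rac12pri1 rda]$ mkdir -p /tmp/rda_nod001
[oracle@rac12pri1 rda]$ tar -xzf output/remote/RDA_nod001_output.tar.gz -C /tmp/rda_nod001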

Further reading: Testing user equivalence prior to gathering data

While trying to troubleshoot my remote data gathering problems (I suspected a problem with SSH at first) I noticed that RDA offers test modules as well (see MOS Doc ID 330760.1). What’s missing from the Doc ID is an example of how to invoke the SSH test module; or rather, the command in the RAC-specific instructions seems not to work. But it isn’t too hard to figure out the proper call to the RDA executable. The following worked for me:

[oracle@rac12pri1 rda]$ ./rda.pl -T ssh
Processing remote operation test module ...

Command Availability:
  rcp          (Not found)
  remsh        (Not found)
  rsh          (Not found)
  scp          /bin/scp
  ssh          /bin/ssh
  ssh-add      /bin/ssh-add
  ssh-agent    /bin/ssh-agent

Related files:
  /etc/ssh/moduli
  /etc/ssh/ssh_config
  /etc/ssh/ssh_host_ecdsa_key
  /etc/ssh/ssh_host_ecdsa_key.pub
  /etc/ssh/ssh_host_ed25519_key
  /etc/ssh/ssh_host_ed25519_key.pub
  /etc/ssh/ssh_host_rsa_key
  /etc/ssh/ssh_host_rsa_key.pub
  /etc/ssh/sshd_config
  /home/oracle/.ssh/authorized_keys
  /home/oracle/.ssh/config
  /home/oracle/.ssh/config.backup
  /home/oracle/.ssh/id_rsa
  /home/oracle/.ssh/id_rsa.pub
  /home/oracle/.ssh/known_hosts

Check if an authentication agent has been started
  SSH_AGENT_PID=1
  SSH_AUTH_SOCK=1
Agent identities:
  1024 some:columns:separated:by:colons:indicating:the:ID
  /home/oracle/.ssh/id_rsa (RSA)

Driver Availability:
    da Available
  jsch Available
   ssh Available
   rsh -

Settings modified by the remote operation initialization:
  REMOTE.F_SCP_COMMAND=/bin/scp
  REMOTE.F_SSH_COMMAND=/bin/ssh
  REMOTE.T_SCP_OPTIONS=-BCpq -o ConnectTimeout=30
  REMOTE.T_SSH_OPTIONS=-Cnq -o ConnectTimeout=30

------------------------------------------------------------------------------
Test a remote command
------------------------------------------------------------------------------
Enter the remote host name
Press Return to accept the default (localhost)
> rac12pri2

Do you want RDA to use an alternative login to execute remote requests (Y/N)?
Press Return to accept the default (N)
>

Check remote command execution with best driver
Exit code: 0

Check remote command execution using DA
Exit code: 0

Check remote command execution using JSCH
Exit code: 0

Check remote command execution using SSH
Exit code: 0

Check remote command execution using SSH (no timeout)
Exit code: 0
[oracle@rac12pri1 rda]$

Looking at the above output led me to believe that there isn’t an SSH-related problem with my cluster.

New Events for Data Guard and Synchronous Redo Transport in 12c (2)

With the baseline established in the first part of this series, it’s time to measure the effect of the network. The second test will introduce an interesting feature: using Linux’s own traffic shaping/Quality of Service module, I will add a delay of 100 ms to the Data Guard network interface card (NIC) to slow things down a little.

WARNING: this is of course a lab or VM-only situation. I can’t exert control over wire quality in my (own) switches, hence some software magic is needed on my virtual ones. This post is intended to be merely for educational purposes, not for use at work.

I am continuing to use the 2 node RAC 12.1.0.2.170117 primary database on Oracle Linux 7 with UEK 4 and an identical RAC to host my standby database.

All database traffic still uses eth0 – my public network (192.168.100.0/24). In order not to affect my SSH session with the planned network changes, I created a new out-of-band management interface (eth2) on the Data Guard standby database servers. I love KVM and virtio drivers for the NICs: a few commands in virsh later and every RAC node has a new NIC, without rebooting of course. Add some nmcli magic and each new interface is configured in one line!
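For the curious, the nmcli one-liner looked roughly like this; connection name, device and IP address are specific to my lab (the address shown is made up for this sketch), so adjust to taste:

[root@rac12sec1 ~]# nmcli connection add type ethernet con-name mgmt ifname eth2 ip4 192.168.102.20/24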

Here’s what it looks like from NetworkManager’s point of view (the system is Oracle Linux 7.3):

[root@rac12sec1 ~]# nmcli c show
NAME  UUID                                  TYPE            DEVICE 
eth0  3e8915f1-10ce-4da7-99b8-d2a6947f9ff7  802-3-ethernet  eth0   
eth1  0ebe1279-40e1-4faf-a0be-49588d347fcf  802-3-ethernet  eth1   
mgmt  51054729-19ae-49e7-ba18-c6b41475ec1b  802-3-ethernet  eth2   

My eth0 device maps to the public network, i.e. all database traffic including Data Guard. The private interconnect uses eth1, with eth2 as the management interface. I have deliberately kept this simple; usually I’d request a dedicated set of interfaces to route Data Guard traffic (I blogged about that earlier in the context of RAC: please read past part I of the “Adding an additional SCAN” series to get the full story).

Change protection mode

Following the scenario I outlined in part I, I first have to change the protection mode to Maximum Availability. Again, please see part I of this series for an explanation of the protection levels, my choice of protection level, and why “Maximum Availability” might or might not work for you.

Since I’m using the Data Guard Broker, I need to head over and change the mode in dgmgrl.

DGMGRL> show configuration

Configuration - ractest

  Protection Mode: MaxPerformance
  Members:
  NCDBA - Primary database
    NCDBB - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS   (status updated 40 seconds ago)

As with many other maintenance tasks, you use the edit command to change the configuration.

DGMGRL> EDIT CONFIGURATION SET PROTECTION MODE AS MaxAvailability;
Error: ORA-16627: operation disallowed since no standby databases would 
remain to support protection mode

Failed.

I seem to have gotten a bit ahead of myself – changing to the more serious protection modes requires changing the transport mode from async to sync. Let’s do that, and then change the protection mode:

DGMGRL> edit database 'NCDBA' set property LogXptMode = 'SYNC';
Property "logxptmode" updated
DGMGRL> edit database 'NCDBB' set property LogXptMode = 'SYNC';
Property "logxptmode" updated
DGMGRL> EDIT CONFIGURATION SET PROTECTION MODE AS MaxAvailability;
Succeeded.
DGMGRL> show configuration

Configuration - ractest

  Protection Mode: MaxAvailability
  Members:
  NCDBA - Primary database
    NCDBB - Physical standby database 

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS   (status updated 19 seconds ago)

DGMGRL> 

That looks better now; the first half of the job is done. The second half of this article details how to make the network look a lot worse.

Net Emulator

Using the Linux netem kernel module I can simulate many complex network situations – I found https://wiki.linuxfoundation.org/networking/netem to be a good reference.

I should probably say at this point that no one would use a link with a 100 ms delay between sites for (sync) redo transport. That simply doesn’t make sense. I used 100 ms to have a factor large enough that I can be sure any deviation in the performance figures is actually a result of my fiddling with the network latency. Large numbers make spotting outliers easier.

How do you mess with the network? That is simple when you are on Linux! On my standby cluster I use the tc command, part of the iproute package. This is the “normal” state:

[root@rac12sec1 ~]# tc qdisc
qdisc pfifo_fast 0: dev eth0 root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc pfifo_fast 0: dev eth1 root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc pfifo_fast 0: dev eth2 root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
[root@rac12sec1 ~]# 

I am merely using these tools; if you want to understand them, have a look at the Linux Advanced Routing & Traffic Control HOWTO, chapter 9, and refer to the link to the Linux Foundation wiki I mentioned earlier. Then take it from there.

The syntax of tc qdisc is complex at first sight:

[root@rac12sec1 ~]# tc qdisc help
Usage: tc qdisc [ add | del | replace | change | show ] dev STRING
       [ handle QHANDLE ] [ root | ingress | parent CLASSID ]
       [ estimator INTERVAL TIME_CONSTANT ]
       [ stab [ help | STAB_OPTIONS] ]
       [ [ QDISC_KIND ] [ help | OPTIONS ] ]

       tc qdisc show [ dev STRING ] [ingress]
Where:
QDISC_KIND := { [p|b]fifo | tbf | prio | cbq | red | etc. }
OPTIONS := ... try tc qdisc add <desired QDISC_KIND> help
STAB_OPTIONS := ... try tc qdisc add stab help

The above allows us to introduce NetEM to the mix. On my Linux 7.3/UEK4 system it offers the following options:

[root@rac12sec1 ~]# tc qdisc add netem help
Usage: ... netem [ limit PACKETS ] 
                 [ delay TIME [ JITTER [CORRELATION]]]
                 [ distribution {uniform|normal|pareto|paretonormal} ]
                 [ corrupt PERCENT [CORRELATION]] 
                 [ duplicate PERCENT [CORRELATION]]
                 [ loss random PERCENT [CORRELATION]]
                 [ loss state P13 [P31 [P32 [P23 P14]]]
                 [ loss gemodel PERCENT [R [1-H [1-K]]]
                 [ ecn ]
                 [ reorder PRECENT [CORRELATION] [ gap DISTANCE ]]
                 [ rate RATE [PACKETOVERHEAD] [CELLSIZE] [CELLOVERHEAD]]

That’s a wealth of options. I can simulate somewhat unreliable WAN links, even with variations in delay! Refer to the Linux Foundation wiki for more details on how to simulate networks more realistically.
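Just to illustrate the syntax (I am not using this particular setting anywhere in this post), a slightly less friendly WAN could be emulated with something along these lines: a 50 ms delay with 10 ms of jitter plus a little random packet loss:

[root@rac12sec1 ~]# tc qdisc add dev eth0 root netem delay 50ms 10ms loss random 0.5%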

Introducing Delay

I prefer to keep it simple for this post, so I’ll stick with the most basic setting for network delay. I am introducing this to the Data Guard network on both of my standby nodes while connected to the management interface eth2:

[root@rac12sec1 ~]# tc qdisc add dev eth0 root netem delay 100ms

[root@rac12sec1 ~]# tc qdisc
qdisc netem 8001: dev eth0 root refcnt 2 limit 1000 delay 100.0ms
qdisc pfifo_fast 0: dev eth1 root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc pfifo_fast 0: dev eth2 root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1

[root@rac12sec2 ~]# tc qdisc
qdisc netem 8001: dev eth0 root refcnt 2 limit 1000 delay 100.0ms
qdisc pfifo_fast 0: dev eth1 root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc pfifo_fast 0: dev eth2 root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1

[root@rac12sec2 ~]# tc qdisc show dev eth0 
qdisc netem 8001: root refcnt 2 limit 1000 delay 100.0ms

The effect can be seen immediately by pinging one of the standby nodes:

[oracle@rac12pri1 ~]$ ping rac12sec1
PING rac12sec1.example.com (192.168.100.20) 56(84) bytes of data.
64 bytes from rac12sec1.example.com (192.168.100.20): icmp_seq=1 ttl=64 time=100 ms
64 bytes from rac12sec1.example.com (192.168.100.20): icmp_seq=2 ttl=64 time=100 ms
64 bytes from rac12sec1.example.com (192.168.100.20): icmp_seq=3 ttl=64 time=100 ms
64 bytes from rac12sec1.example.com (192.168.100.20): icmp_seq=4 ttl=64 time=100 ms
^C
--- rac12sec1.example.com ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3003ms
rtt min/avg/max/mdev = 100.368/100.415/100.478/0.229 ms
[oracle@rac12pri1 ~]$ 

So 100ms delay it is!
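For completeness: once the testing is finished, the delay is removed again by deleting the netem queueing discipline from the device, along these lines:

[root@rac12sec1 ~]# tc qdisc del dev eth0 root netem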

Starting the Benchmark

With the scene set, it’s time to start the benchmark. The command to start it resembles the one I used for part I, except for the name of the results file.

[oracle@loadgen1 bin]$ ./charbench -cs //rac12pri-scan/swingbench_both \
> -uc 20 -r results_20_maxavail_netem.xml -rt 00:10
Author  :	 Dominic Giles
Version :	 2.5.0.971

Results will be written to results_20_maxavail_netem.xml.
Hit Return to Terminate Run...

Time		Users	TPM	TPS

1:08:56 PM      20      5188    94          

What effect does this have on the database? Let’s check the Data Guard broker first:

DGMGRL> show database 'NCDBB'

Database - NCDBB

  Role:               PHYSICAL STANDBY
  Intended State:     APPLY-ON
  Transport Lag:      0 seconds (computed 0 seconds ago)
  Apply Lag:          10 seconds (computed 0 seconds ago)
  Average Apply Rate: 207.00 KByte/s
  Real Time Query:    OFF
  Instance(s):
    NCDBB1
    NCDBB2 (apply instance)

Database Status:
SUCCESS

DGMGRL> show database 'NCDBB'

Database - NCDBB

  Role:               PHYSICAL STANDBY
  Intended State:     APPLY-ON
  Transport Lag:      0 seconds (computed 0 seconds ago)
  Apply Lag:          0 seconds (computed 0 seconds ago)
  Average Apply Rate: 211.00 KByte/s
  Real Time Query:    OFF
  Instance(s):
    NCDBB1
    NCDBB2 (apply instance)

Database Status:
SUCCESS

So it looks like there isn’t any lag visible with the benchmark underway; however, it took a little while for it to start, as visible in the first example. Here is the output from Tanel Poder’s ashtop script again (which requires you to have the Diagnostic Pack license as it accesses v$active_session_history):

SQL> @ashtop program,event,wait_class,session_state session_type='BACKGROUND' "to_date('17.02.2017 08:10','dd.mm.yyyy hh24:mi')" "to_date('17.02.2017 08:15','dd.mm.yyyy hh24:mi')"

    Total
  Seconds     AAS %This   PROGRAM                                          EVENT                                    WAIT_CLASS      SESSION FIRST_SEEN          LAST_SEEN           DIST_SQLEXEC_SEEN
--------- ------- ------- ------------------------------------------------ ---------------------------------------- --------------- ------- ------------------- ------------------- -----------------
      298     1.0   24% | oracle@rac12pri1 (LGWR)                          SYNC Remote Write                        Other           WAITING 2017-02-17 08:10:00 2017-02-17 08:14:59                 1
      298     1.0   24% | oracle@rac12pri1 (NSS2)                          Redo Transport MISC                      Other           WAITING 2017-02-17 08:10:00 2017-02-17 08:14:59                 1
      276      .9   23% | oracle@rac12pri2 (LGWR)                          SYNC Remote Write                        Other           WAITING 2017-02-17 08:10:00 2017-02-17 08:14:59                 1
      274      .9   23% | oracle@rac12pri2 (NSS2)                          Redo Transport MISC                      Other           WAITING 2017-02-17 08:10:00 2017-02-17 08:14:59                 1
        8      .0    1% | oracle@rac12pri2 (LMS1)                                                                                   ON CPU  2017-02-17 08:10:16 2017-02-17 08:14:59                 1
        7      .0    1% | oracle@rac12pri2 (LMS0)                          gcs log flush sync                       Other           WAITING 2017-02-17 08:10:43 2017-02-17 08:14:51                 1
        5      .0    0% | oracle@rac12pri1 (LMS0)                          gcs log flush sync                       Other           WAITING 2017-02-17 08:10:23 2017-02-17 08:13:48                 1
        5      .0    0% | oracle@rac12pri2 (LMS0)                                                                                   ON CPU  2017-02-17 08:10:00 2017-02-17 08:13:24                 1
        5      .0    0% | oracle@rac12pri2 (NSV1)                                                                                   ON CPU  2017-02-17 08:12:43 2017-02-17 08:14:44                 1
        4      .0    0% | oracle@rac12pri1 (DBW0)                                                                                   ON CPU  2017-02-17 08:11:16 2017-02-17 08:13:35                 1
        3      .0    0% | oracle@rac12pri2 (CJQ0)                                                                                   ON CPU  2017-02-17 08:12:39 2017-02-17 08:12:51                 2
        3      .0    0% | oracle@rac12pri2 (RSM0)                                                                                   ON CPU  2017-02-17 08:11:44 2017-02-17 08:14:44                 3
        2      .0    0% | oracle@rac12pri1 (LMON)                          control file sequential read             System I/O      WAITING 2017-02-17 08:11:51 2017-02-17 08:14:01                 1
        2      .0    0% | oracle@rac12pri1 (LMS1)                          gcs log flush sync                       Other           WAITING 2017-02-17 08:11:14 2017-02-17 08:11:44                 1
        2      .0    0% | oracle@rac12pri2 (DBW0)                          db file parallel write                   System I/O      WAITING 2017-02-17 08:12:14 2017-02-17 08:14:24                 1
        2      .0    0% | oracle@rac12pri2 (LGWR)                          log file parallel write                  System I/O      WAITING 2017-02-17 08:12:59 2017-02-17 08:14:42                 1
        1      .0    0% | oracle@rac12pri1 (CKPT)                          control file sequential read             System I/O      WAITING 2017-02-17 08:13:46 2017-02-17 08:13:46                 1
        1      .0    0% | oracle@rac12pri1 (DBW0)                          db file parallel write                   System I/O      WAITING 2017-02-17 08:10:28 2017-02-17 08:10:28                 1
        1      .0    0% | oracle@rac12pri1 (LGWR)                          Redo Transport Close                     Other           WAITING 2017-02-17 08:12:22 2017-02-17 08:12:22                 1
        1      .0    0% | oracle@rac12pri1 (LGWR)                          log file sequential read                 System I/O      WAITING 2017-02-17 08:14:16 2017-02-17 08:14:16                 1

20 rows selected.

As you can see, the top waits are in wait class “Other”: “SYNC Remote Write” and “Redo Transport MISC”. I have pulled an AWR diff report that shows an interesting picture. The first snapshot set represents the full network performance, while the second one shows the effect of the 100 ms network delay. Only relevant information is shown.

WORKLOAD REPOSITORY RAC Compare Period Report for

Database Summary
~~~~~~~~~~~~~~~~
                            Database                   Snapshot Ids     Number of Instances    Number of Hosts    Report Total (minutes)
               ----------------------------------   -----------------   -----------------   -----------------   ---------------------
Snapshot Set            Id Name      RAC Block Si      Begin      End   In Repor    Total   In Repor    Total      DB time Elapsed ti
------------   ----------- --------- --- --------   -------- --------   -------- --------   -------- --------   ---------- ----------
First (1st)      481582958 NCDBA     YES     8192        303      304          2        2          2        2         22.1       10.2
Second (2nd)     481582958 NCDBA     YES     8192        301      302          2        2          2        2        141.9       10.6
Database Instances Included In Report
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                                                                                             Elapsed              Instance Avg Activ
Set       DB Id Inst Instance  Release         Host       Startup         Begin Snap Time End Snap Time   Time (min) DB time (m Up Time (h  Sessions Platform
--- ----------- ---- --------- --------------- ---------- --------------- --------------- --------------- ---------- ---------- ---------- --------- ---------------
1st   481582958    1 NCDBA1    12.1.0.2.0      rac12pri1  17-Feb-17 04:00 17-Feb-17 08:59 17-Feb-17 09:09       10.0        2.0        5.2       0.2 Linux x86 64-bi
                   2 NCDBA2    12.1.0.2.0      rac12pri2  17-Feb-17 04:00 17-Feb-17 08:59 17-Feb-17 09:09       10.0       20.1        5.2       2.0 Linux x86 64-bi

2nd   481582958    1 NCDBA1    12.1.0.2.0      rac12pri1  17-Feb-17 04:00 17-Feb-17 08:06 17-Feb-17 08:17       10.5      134.0        4.3      12.8 Linux x86 64-bi
                   2 NCDBA2    12.1.0.2.0      rac12pri2  17-Feb-17 04:00 17-Feb-17 08:07 17-Feb-17 08:17       10.5        7.9        4.3       0.8 Linux x86 64-bi

...

Workload Comparison
~~~~~~~~~~~~~~~~~~~                      1st Per Sec          2nd Per Sec      %Diff              1st Per Txn          2nd Per Txn      %Diff
                                     ---------------      ---------------     ------          ---------------      ---------------     ------
                       DB time:                  2.2                 13.3      514.3                      0.0                  0.2    1,600.0
                      CPU time:                  1.5                  0.5      -66.0                      0.0                  0.0        0.0
           Background CPU time:                  0.5                  0.2      -56.9                      0.0                  0.0        0.0
             Redo size (bytes):            983,488.1            322,026.1      -67.3                  4,164.1              4,192.4        0.7
         Logical read (blocks):             30,773.7              8,299.3      -73.0                    130.3                108.0      -17.1
                 Block changes:              5,665.8              1,886.0      -66.7                     24.0                 24.6        2.3
        Physical read (blocks):                 10.5                 18.3       74.4                      0.0                  0.2      500.0
       Physical write (blocks):                654.9                196.8      -69.9                      2.8                  2.6       -7.6
              Read IO requests:                 10.1                 12.9       27.3                      0.0                  0.2      325.0
             Write IO requests:                503.8                110.4      -78.1                      2.1                  1.4      -32.4
                  Read IO (MB):                  0.1                  0.1       75.0                      0.0                  0.0        0.0
                 Write IO (MB):                  5.1                  1.5      -69.9                      0.0                  0.0        0.0
                  IM scan rows:                  0.0                  0.0        0.0                      0.0                  0.0        0.0
       Session Logical Read IM:
                    User calls:                515.8                188.2      -63.5                      2.2                  2.5       12.4
                  Parses (SQL):                 24.9                 17.7      -28.8                      0.1                  0.2      109.1
             Hard parses (SQL):                  0.1                  0.2       45.5                      0.0                  0.0        0.0
            SQL Work Area (MB):                  0.1                  0.5      253.8                      0.0                  0.0      253.8
                        Logons:                  1.3                  3.9      191.7                      0.0                  0.1      400.0
                Executes (SQL):              2,587.7                853.0      -67.0                     11.0                 11.1        1.4
                  Transactions:                236.2                 76.8      -67.5

...

Top Timed Events   First DB/Inst: NCDBA/NCDBA1 Snaps: 303-304 (Elapsed time: 612.266 sec  DB time: 1326.09 sec),  Second DB/Inst: NCDBA/NCDBA1 Snaps: 301-302 (Elapsed time: 638.691 sec  DB time: 8514.42 sec)
-> Events with a "-" did not make the Top list in this set of snapshots, but are displayed for comparison purposes
-> Summary for all instances

                                               1st                                                                                                2nd
------------------------------------------------------------------------------------------------   ------------------------------------------------------------------------------------------------
Event                          Wait Class           Waits      Time(s)  Avg Time(ms)    %DB time   Event                          Wait Class           Waits      Time(s)  Avg Time(ms)    %DB time
------------------------------ ------------- ------------ ------------ ------------- -----------   ------------------------------ ------------- ------------ ------------ ------------- -----------
 CPU time                                             N/A        919.5           N/A        69.3    log file sync                 Commit              49,010      7,976.7         162.8        93.7
 log file sync                 Commit             144,284        270.4           1.9        20.4    Redo Transport MISC           Other               11,392      1,192.0         104.6        14.0
 log file parallel write       System I/O         141,581        156.6           1.1        11.8    SYNC Remote Write             Other               11,392      1,179.1         103.5        13.8
 gc current block 2-way        Cluster             98,483         71.9           0.7         5.4    CPU time                                             N/A        328.7           N/A         3.9
 gc cr block 2-way             Cluster             71,469         47.9           0.7         3.6    gcs log flush sync            Other                4,480         56.0          12.5         0.7
 target log write size         Other               22,284         26.5           1.2         2.0    gc current block 2-way        Cluster             62,843         43.3           0.7         0.5
 gc current grant busy         Cluster             22,561         16.8           0.7         1.3    gc current block busy         Cluster                284         39.3         138.4         0.5
 control file sequential read  System I/O           8,497          6.0           0.7         0.5    log file switch (checkpoint i Configuration           75         37.1         495.1         0.4
 db file sequential read       User I/O             5,862          5.4           0.9         0.4    rdbms ipc reply               Other                   32         27.1         845.4         0.3
 log file sequential read      System I/O          14,895          4.8           0.3         0.4    gc cr block 2-way             Cluster             39,015         26.0           0.7         0.3
-log file switch (checkpoint i Configuration          212          3.6          17.2         0.3   -log file parallel write       System I/O          11,267         20.3           1.8         0.2
-gcs log flush sync            Other                  461          0.4           0.9         0.0   -gc current grant busy         Cluster             12,427          8.2           0.7         0.1
-gc current block busy         Cluster                126          0.3           2.7         0.0   -control file sequential read  System I/O           7,326          5.0           0.7         0.1
-                                                     N/A          N/A           N/A         N/A   -db file sequential read       User I/O             6,615          4.7           0.7         0.1
-                                                     N/A          N/A           N/A         N/A   -log file sequential read      System I/O             174          0.4           2.4         0.0
                          --------------------------------------------------------------------------------------------------------------------

This doesn’t look too good, does it? The throughput is down substantially, which is visible in Swingbench itself: in the first example I managed to achieve 255 TPS while I’m averaging 90-95 in the sync + delay example. The numbers from the AWR excerpt shown above confirm this, and allow you to work out where the problem is.

Here is a screenshot taken from OEM (that’s a different benchmark execution), showing the same situation. Make sure you are licensed appropriately to use this view.

[Figure: OEM Express performance page showing the benchmark run with the added network latency]

Summary Part II

So yes, there is a very noticeable effect of the network delay, and the AWR report shows the relevant wait events in 12c indicating where time is spent.

In my opinion it’s a bit unfortunate to have the new waits in the “Other” wait class, but that’s how it is.

References

http://www.oracle.com/technetwork/database/availability/sync-2437177.pdf

New Events for Data Guard and Synchronous Redo Transport in 12c (1)

I may have said it before, but I consider presenting and teaching a great way to expand one’s knowledge: first of all, it requires me to really understand a subject. Secondly, when presenting, you get lots of interesting questions that can turn into blog posts like this one.

Lately I have been asked about the impact of synchronous log shipping to a physical standby database. I was sure there was an effect to be observed, most likely depending on the network latency between systems, but I didn’t have any evidence I could pull out of the hat to back up my thoughts. So what better than to try it out! I also read that some of the events have changed in 12c, and wanted to make them visible. My environment is based on the 2-node RAC primary/2-node RAC standby configuration I wrote about in my previous posts.

Since their initial setup I upgraded the cluster to 12.1.0.2.170117 for Clusterware and RDBMS.

I am planning to share my findings in two posts: the first one (the one you are reading) presents the baseline, while the second one shows how performance can be impacted by poor network performance. I guess the second one will be a bit longer …

Background

Data Guard allows the administrator to define the databases in a Data Guard configuration to be in one of three possible modes. The first one is Maximum Performance, the second Maximum Availability and finally Maximum Protection. All three modes are covered in the official documentation set and MOS. Without saying it very scientifically:

  • Maximum Performance uses asynchronous log shipping and is the default. The performance of the standby database (or rather lack thereof) should not impact the performance of the primary. At a cost: it is not guaranteed that redo is actually written to the standby when the commit command returns.
  • Maximum Protection allows you to enforce a Zero Data Loss (ZDL) strategy but also at a cost: The primary will shut down if it can’t transmit redo information to the standby. That is how you prevent data loss!
  • Maximum Availability is the middle ground and I’ll focus on that protection mode here.

Make sure to read the documentation and understand the implications of each protection mode when assessing the most suitable mode for your environment!
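If you are ever unsure which mode and level a database is actually running with, a quick query against v$database tells you (protection_mode and protection_level are documented columns):

select protection_mode, protection_level
from v$database;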

Background: New Events

Before going into details I should probably point out that there are some new Data Guard events. Hurray! More events and more meaningful names, great. Just one caveat: have a look at the output from my 12c database:

select name, wait_class from v$event_name
where name in (
'ARCH Remote Write','ASYNC Remote Write','Redo Transport Attach',
'Redo Transport Close','Redo Transport Detach','Redo Transport Open',
'Redo Transport Ping','Redo Transport Slave Shutdown','Redo Transport Slave Startup',
'Redo Writer Remote Sync Complete', 'Redo Writer Remote Sync Notify',
'Remote SYNC Ping','SYNC Remote Write');

NAME                                               WAIT_CLASS
-------------------------------------------------- ----------------------------------------
ASYNC Remote Write                                 Other
SYNC Remote Write                                  Other
ARCH Remote Write                                  Other
Redo Transport Attach                              Other
Redo Transport Detach                              Other
Redo Transport Open                                Other
Redo Transport Close                               Other
Redo Transport Ping                                Other
Remote SYNC Ping                                   Other
Redo Transport Slave Startup                       Other
Redo Transport Slave Shutdown                      Other
Redo Writer Remote Sync Notify                     Other
Redo Writer Remote Sync Complete                   Other

13 rows selected.

Compare this to the pre-12c events (taken from an 11.2.0.3 database since they are mostly gone in 12c):

select name, wait_class from v$event_name
where name in (
'ARCH wait on ATTACH','ARCH wait on SENDREQ','ARCH wait on DETACH',
'LNS wait on ATTACH', 'LNS wait on SENDREQ','LNS wait on DETACH',
'LGWR wait on LNS','LNS wait on LGWR','LGWR-LNS wait on channel');

NAME                                               WAIT_CLASS
-------------------------------------------------- --------------------
ARCH wait on ATTACH                                Network
LNS wait on ATTACH                                 Network
LNS wait on SENDREQ                                Network
LNS wait on DETACH                                 Network
ARCH wait on SENDREQ                               Network
ARCH wait on DETACH                                Network
LNS wait on LGWR                                   Network
LGWR wait on LNS                                   Network
LGWR-LNS wait on channel                           Other

9 rows selected.

For the sake of completeness here are those events that remained in 12c:

select name, wait_class from v$event_name
where name in (
'ARCH wait on ATTACH','ARCH wait on SENDREQ','ARCH wait on DETACH',
'LNS wait on ATTACH', 'LNS wait on SENDREQ','LNS wait on DETACH',
'LGWR wait on LNS','LNS wait on LGWR','LGWR-LNS wait on channel');

NAME                           WAIT_CLASS
------------------------------ ------------------------------
LNS wait on LGWR               Network
LGWR wait on LNS               Network
LGWR-LNS wait on channel       Other

Bugger. The new events are all in the “Other” wait class. My initial attempts at filtering information from v$event_histogram based on events in the “Network” class had little success. I guess friends of the OEM performance pages might also find that interesting. If you wonder where I pulled these events from: I used a white paper named “Best Practices for Synchronous Redo Transport” as reference.
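Filtering by event name rather than wait class does work, of course. Something along these lines gives you the latency distribution for a couple of the new events (a sketch, output omitted):

select event, wait_time_milli, wait_count
from v$event_histogram
where event in ('SYNC Remote Write', 'Redo Transport Open')
order by event, wait_time_milli;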

Testing: The Environment

As you read in the introduction, this environment is based on a two node RAC primary -> two node RAC standby configuration:

DGMGRL> show configuration verbose

Configuration - ractest

  Protection Mode: MaxPerformance
  Members:
  NCDBA - Primary database
    NCDBB - Physical standby database

  Properties:
    FastStartFailoverThreshold      = '30'
    OperationTimeout                = '30'
    TraceLevel                      = 'USER'
    FastStartFailoverLagLimit       = '30'
    CommunicationTimeout            = '180'
    ObserverReconnect               = '0'
    FastStartFailoverAutoReinstate  = 'TRUE'
    FastStartFailoverPmyShutdown    = 'TRUE'
    BystandersFollowRoleChange      = 'ALL'
    ObserverOverride                = 'FALSE'
    ExternalDestination1            = ''
    ExternalDestination2            = ''
    PrimaryLostWriteAction          = 'CONTINUE'

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS
DGMGRL> show database 'NCDBA';

Database - NCDBA

  Role:               PRIMARY
  Intended State:     TRANSPORT-ON
  Instance(s):
    NCDBA1
    NCDBA2

Database Status:
SUCCESS

DGMGRL> show database 'NCDBB'

Database - NCDBB

  Role:               PHYSICAL STANDBY
  Intended State:     APPLY-ON
  Transport Lag:      0 seconds (computed 0 seconds ago)
  Apply Lag:          0 seconds (computed 0 seconds ago)
  Average Apply Rate: 177.00 KByte/s
  Real Time Query:    OFF
  Instance(s):
    NCDBB1
    NCDBB2 (apply instance)

Database Status:
SUCCESS

DGMGRL> show database 'NCDBA' LogXptMode
  LogXptMode = 'async'
DGMGRL> show database 'NCDBB' LogXptMode
  LogXptMode = 'async'

The configuration is currently set to “Maximum Performance”; the databases ship redo asynchronously.

All systems are patched to the January 2017 Proactive Bundle Patch:

[oracle@rac12pri1 ~]$ $ORACLE_HOME/OPatch/opatch lspatches
24917972;Database PSU 12.1.0.2.170117, Oracle JavaVM Component (JAN2017)
24732082;Database Patch Set Update : 12.1.0.2.170117 (24732082)
24828633;OCW Patch Set Update : 12.1.0.2.170117 (24828633)

OPatch succeeded.

Testing part I: The Baseline

I am using Swingbench’s Order Entry benchmark to generate a little bit of load. For the avoidance of doubt: I am not interested in a comparison between asynchronous and synchronous log shipping. I would like to see the effect the difference in latency has, in the form of hugely increased waits. In my first test I am running with the defaults of my VM (the NIC uses virtio drivers); there is no change to SDU or the send/receive buffer sizes.
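Just for illustration: if I did want to deviate from those defaults, SDU and the socket buffer sizes can be set in the net service name definition. The following is a sketch only; the sizes are made-up example values, not recommendations:

NCDBB =
  (DESCRIPTION =
    (SDU = 65535)
    (SEND_BUF_SIZE = 4194304)
    (RECV_BUF_SIZE = 4194304)
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac12sec-scan)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = NCDBB)
    )
  )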

$ date
Fri Feb 17 13:58:58 UTC 2017
$ ./charbench -cs //rac12pri-scan/swingbench_both \
> -uc 20 -r results_20_maxperf.xml -rt 00:10
Author  :        Dominic Giles
Version :        2.5.0.971

Results will be written to results_20_maxperf.xml.
Hit Return to Terminate Run...

Time            Users   TPM     TPS

1:59:55 PM      20      3852    255

On the apply side this translates to the following figures (the average apply rate is misleading and should be ignored; it captures all redo applied since MRP0 was started):

DGMGRL> show database 'NCDBB'

Database - NCDBB

  Role:               PHYSICAL STANDBY
  Intended State:     APPLY-ON
  Transport Lag:      0 seconds (computed 0 seconds ago)
  Apply Lag:          0 seconds (computed 0 seconds ago)
  Average Apply Rate: 713.00 KByte/s
  Real Time Query:    OFF
  Instance(s):
    NCDBB1
    NCDBB2 (apply instance)

Database Status:
SUCCESS

There is no transport lag, and no apply lag either. When checking for waits on background processes (using Tanel Poder’s ashtop – which requires you to have the Diagnostic Pack license as it accesses v$active_session_history) I noticed the following:

SQL> @ashtop program,event,wait_class,session_state session_type='BACKGROUND' "to_date('17.02.2017 09:02','dd.mm.yyyy hh24:mi')" "to_date('17.02.2017 09:07','dd.mm.yyyy hh24:mi')"

    Total
  Seconds     AAS %This   PROGRAM                                          EVENT                                    WAIT_CLASS      SESSION FIRST_SEEN          LAST_SEEN           DIST_SQLEXEC_SEEN
--------- ------- ------- ------------------------------------------------ ---------------------------------------- --------------- ------- ------------------- ------------------- -----------------
       53      .2   29% | oracle@rac12pri2 (LGWR)                          log file parallel write                  System I/O      WAITING 2017-02-17 09:03:42 2017-02-17 09:06:53                 1
       23      .1   13% | oracle@rac12pri2 (LG00)                          log file parallel write                  System I/O      WAITING 2017-02-17 09:02:03 2017-02-17 09:03:26                 1
       19      .1   10% | oracle@rac12pri2 (DBW0)                                                                                   ON CPU  2017-02-17 09:02:07 2017-02-17 09:06:39                 1
       12      .0    7% | oracle@rac12pri2 (LGWR)                          target log write size                    Other           WAITING 2017-02-17 09:02:03 2017-02-17 09:03:27                 1
        9      .0    5% | oracle@rac12pri1 (LMS1)                                                                                   ON CPU  2017-02-17 09:04:37 2017-02-17 09:06:58                 1
        8      .0    4% | oracle@rac12pri2 (TT00)                                                                                   ON CPU  2017-02-17 09:02:15 2017-02-17 09:06:56                 1
        6      .0    3% | oracle@rac12pri1 (LG00)                          log file parallel write                  System I/O      WAITING 2017-02-17 09:03:08 2017-02-17 09:06:44                 1
        6      .0    3% | oracle@rac12pri2 (LGWR)                                                                                   ON CPU  2017-02-17 09:04:05 2017-02-17 09:06:41                 1
        5      .0    3% | oracle@rac12pri2 (LG00)                                                                                   ON CPU  2017-02-17 09:02:02 2017-02-17 09:03:27                 1
        5      .0    3% | oracle@rac12pri2 (LMS0)                                                                                   ON CPU  2017-02-17 09:02:07 2017-02-17 09:06:14                 1
        5      .0    3% | oracle@rac12pri2 (LMS1)                                                                                   ON CPU  2017-02-17 09:02:24 2017-02-17 09:06:28                 1
        4      .0    2% | oracle@rac12pri1 (LMS0)                                                                                   ON CPU  2017-02-17 09:04:21 2017-02-17 09:06:05                 1
        3      .0    2% | oracle@rac12pri1 (LGWR)                          LGWR all worker groups                   Other           WAITING 2017-02-17 09:03:08 2017-02-17 09:06:13                 1
        2      .0    1% | oracle@rac12pri1 (DBW0)                          db file parallel write                   System I/O      WAITING 2017-02-17 09:03:08 2017-02-17 09:03:42                 1
        2      .0    1% | oracle@rac12pri2 (CKPT)                                                                                   ON CPU  2017-02-17 09:05:10 2017-02-17 09:05:13                 1
        2      .0    1% | oracle@rac12pri2 (TT00)                          log file sequential read                 System I/O      WAITING 2017-02-17 09:03:17 2017-02-17 09:03:46                 1
        1      .0    1% | oracle@rac12pri1 (CJQ0)                                                                                   ON CPU  2017-02-17 09:04:44 2017-02-17 09:04:44                 1
        1      .0    1% | oracle@rac12pri1 (DBW0)                                                                                   ON CPU  2017-02-17 09:03:05 2017-02-17 09:03:05                 1
        1      .0    1% | oracle@rac12pri1 (LG00)                                                                                   ON CPU  2017-02-17 09:02:02 2017-02-17 09:02:02                 1
        1      .0    1% | oracle@rac12pri1 (LGWR)                          target log write size                    Other           WAITING 2017-02-17 09:05:47 2017-02-17 09:05:47                 1

The time difference between charbench and the database can be explained by time zones: my load generator is set to use UTC whereas the database servers are on EST, resulting in a five-hour offset. This is a minor oversight on my part and has nothing to do with the location of the machines. In fact, they are all virtualised.
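A quick sanity check on the database server makes the offset obvious; a simple sketch:

select to_char(sysdate, 'dd.mm.yyyy hh24:mi:ss') as db_time,
       to_char(systimestamp at time zone 'UTC', 'dd.mm.yyyy hh24:mi:ss') as utc_time
from dual;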

Summary Part I

In this part I tried to lay the foundation for the next one where I’m using my lab environment to simulate an artificially high network latency. I have to admit again that my setup is somewhat over-simplified: using Data Guard with asynchronous log shipping and Maximum Performance mode is not something I’d like to support for anything more important than my toy databases in the lab.

References

http://www.oracle.com/technetwork/database/availability/sync-2437177.pdf

Creating a RAC 12.1 Data Guard Physical Standby environment (4)

In the previous three parts of this series a lot of preparation work, needed for the configuration of Data Guard, was performed. In this part of the mini-series they all come to fruition: using the Data Guard Broker I will perform a switchover operation. A couple of new features in 12c make this easier. According to the “Changes in This Release for Oracle Data Guard Concepts and Administration” chapter of the 12.1 Data Guard Concepts and Administration guide:

When [you, ed.] perform a switchover from an Oracle RAC primary database to a physical standby database, it is no longer necessary to shut down all but one primary database instance.

I have always wanted to test that in a quiet moment…

I have previously blogged about another useful change that should make my life easier: the static registration of the *_DGMGRL services in the listener.ora file is no longer needed. Have a look at my Data Guard Broker Setup Changes post for more details and reference to the documentation.

NOTE: As always, this is just a demonstration using VMs in my lab, based on my notes. Your system is most likely different, so in real-life you might take a different approach. The techniques I am using here were suitable for me, and my own small scale testing. I tried to make sure they are valid, but you may want to allocate more resources in your environment. Test, test, test on your own environment on test kit first!

Now let’s get to it.

Step 1: Check the status of the configuration

In the first step I always check the configuration and make sure I can switch over. Data Guard 12c has a nifty automatic check that helps, but I always have a list of tasks I perform prior to a switchover (not shown in this blog post).

The following commands are somewhat sensitive to the availability of the network – you should protect your sessions against any type of network failure! I am using screen(1) for that purpose; there are other tools out there doing similar things. Network glitches are too common to ignore, and I have come to appreciate the ability to resume work without too many problems after having seen the dreaded “broken pipe” message in my terminal window…
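As a minimal example (the session name is entirely made up), I start a named screen session before connecting to the broker and re-attach to it after a dropped connection:

[oracle@rac12sec1 ~]$ screen -S dg_switchover
# ... run dgmgrl inside the screen session; after a dropped connection, log back in and re-attach:
[oracle@rac12sec1 ~]$ screen -dr dg_switchover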

[oracle@rac12sec1 ~]$ dgmgrl
DGMGRL for Linux: Version 12.1.0.2.0 - 64bit Production

Copyright (c) 2000, 2013, Oracle. All rights reserved.

Welcome to DGMGRL, type "help" for information.
DGMGRL> connect sys@ncdbb
Password:
Connected as SYSDBA.
DGMGRL> show configuration

Configuration - ractest

  Protection Mode: MaxPerformance
  Members:
  NCDBA - Primary database
    NCDBB - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS   (status updated 55 seconds ago)

DGMGRL> validate database 'NCDBB'
...

The command to check for switchover readiness is new to 12c as well and is called “validate database”. I don’t have screen output from the situation at this point, so just take my word for it that I was ready :) Don’t switch over if you have any concerns or doubts that the operation will succeed! “Validate database” does not relieve you from your duties to check for switchover readiness – follow your own procedures.

Step 2: Switch Over

Finally, the big moment has come! It takes just one line to perform the switchover:

DGMGRL> switchover to 'NCDBB'
Performing switchover NOW, please wait...
New primary database "NCDBB" is opening...
Oracle Clusterware is restarting database "NCDBA" ...
Switchover succeeded, new primary is "NCDBB"
DGMGRL> 

DGMGRL> show database 'NCDBA';

Database - NCDBA

  Role:               PHYSICAL STANDBY
  Intended State:     APPLY-ON
  Transport Lag:      0 seconds (computed 1 second ago)
  Apply Lag:          0 seconds (computed 1 second ago)
  Average Apply Rate: 9.00 KByte/s
  Real Time Query:    ON
  Instance(s):
    NCDBA1
    NCDBA2 (apply instance)

Database Status:
SUCCESS

DGMGRL> show database 'NCDBB';

Database - NCDBB

  Role:               PRIMARY
  Intended State:     TRANSPORT-ON
  Instance(s):
    NCDBB1
    NCDBB2

Database Status:
SUCCESS

DGMGRL> 

Well that was easy! Did you notice Data Guard Broker telling us that ‘Oracle Clusterware is restarting database “NCDBA” …’ ? I like it.

If you get stuck at this point something has gone wrong with the database registration in the OCR. You shouldn’t run into problems though, because you tested every aspect of the RAC system before handing the system over to its intended users, didn’t you?
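If you do need to dig in, the usual srvctl commands are a reasonable starting point; just a sketch using my lab names:

[oracle@rac12pri1 ~]$ srvctl config database -db NCDBA
[oracle@rac12pri1 ~]$ srvctl status database -db NCDBA -verbose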

Validating the new standby database shows no issues. I hadn’t noticed it before, but “validate database” allows you to get more verbose output:

DGMGRL> validate database verbose 'NCDBA';

  Database Role:     Physical standby database
  Primary Database:  NCDBB

  Ready for Switchover:  Yes
  Ready for Failover:    Yes (Primary Running)

  Capacity Information:
    Database  Instances        Threads
    NCDBB     2                2
    NCDBA     2                2

  Temporary Tablespace File Information:
    NCDBB TEMP Files:  1
    NCDBA TEMP Files:  1

  Flashback Database Status:
    NCDBB:  On
    NCDBA:  Off

  Data file Online Move in Progress:
    NCDBB:  No
    NCDBA:  No

  Standby Apply-Related Information:
    Apply State:      Running
    Apply Lag:        0 seconds (computed 1 second ago)
    Apply Delay:      0 minutes

  Transport-Related Information:
    Transport On:      Yes
    Gap Status:        No Gap
    Transport Lag:     0 seconds (computed 1 second ago)
    Transport Status:  Success

  Log Files Cleared:
    NCDBB Standby Redo Log Files:  Cleared
    NCDBA Online Redo Log Files:   Cleared
    NCDBA Standby Redo Log Files:  Available

  Current Log File Groups Configuration:
    Thread #  Online Redo Log Groups  Standby Redo Log Groups Status
              (NCDBB)                 (NCDBA)
    1         2                       3                       Sufficient SRLs
    2         2                       3                       Sufficient SRLs

  Future Log File Groups Configuration:
    Thread #  Online Redo Log Groups  Standby Redo Log Groups Status
              (NCDBA)                 (NCDBB)
    1         2                       3                       Sufficient SRLs
    2         2                       3                       Sufficient SRLs

  Current Configuration Log File Sizes:
    Thread #   Smallest Online Redo      Smallest Standby Redo
               Log File Size             Log File Size
               (NCDBB)                   (NCDBA)
    1          50 MBytes                 50 MBytes
    2          50 MBytes                 50 MBytes

  Future Configuration Log File Sizes:
    Thread #   Smallest Online Redo      Smallest Standby Redo
               Log File Size             Log File Size
               (NCDBA)                   (NCDBB)
    1          50 MBytes                 50 MBytes
    2          50 MBytes                 50 MBytes

  Apply-Related Property Settings:
    Property                        NCDBB Value              NCDBA Value
    DelayMins                       0                        0
    ApplyParallel                   AUTO                     AUTO

  Transport-Related Property Settings:
    Property                        NCDBB Value              NCDBA Value
    LogXptMode                      ASYNC                    ASYNC
    RedoRoutes                      <empty>                  <empty>
    Dependency                      <empty>                  <empty>
    DelayMins                       0                        0
    Binding                         optional                 optional
    MaxFailure                      0                        0
    MaxConnections                  1                        1
    ReopenSecs                      300                      300
    NetTimeout                      30                       30
    RedoCompression                 DISABLE                  DISABLE
    LogShipping                     ON                       ON

  Automatic Diagnostic Repository Errors:
    Error                       NCDBB    NCDBA
    No logging operation        NO       NO
    Control file corruptions    NO       NO
    SRL Group Unavailable       NO       NO
    System data file missing    NO       NO
    System data file corrupted  NO       NO
    System data file offline    NO       NO
    User data file missing      NO       NO
    User data file corrupted    NO       NO
    User data file offline      NO       NO
    Block Corruptions found     NO       NO

DGMGRL> 

Isn’t that cool? That’s more information at my fingertips than I can shake a stick at! It’s also a lot more than I could have come up with myself (e.g. online datafile move!).

Interestingly the Broker reports that I have “Sufficient SRLs”. I have seen it complain about the number of Standby Redo Logs in the past and blogged about it in Interesting observation about standby redo logs in Data Guard.

Summary

After 4 (!) posts about the matter I have finally been able to perform a switchover operation. Role reversals are a much-neglected operation every DBA should be comfortable with. In a crisis situation everyone needs to be clear about what needs to be done to restore service to the users. The database is usually the easier part … The success of Data Guard switchover operations also depends on the quality of change management: it is easy to “forget” applying configuration changes on the DR site.

In today’s busy times only few of us are lucky enough to intimately know each and every database we look after. What’s more common (sadly!) is that a DBA looks after 42 or more databases. This really only works without too many issues if procedures and standards are rock solid, and enforced.

Creating a RAC 12.1 Data Guard Physical Standby environment (3)

In the previous two parts of this series you read about my lab environment and the preparations on the network side as well as the database. In this part I’ll cover the database duplication. Again, this won’t be a short post …

NOTE: As always, this is just a demonstration using VMs in my lab, based on my notes. Your system is most likely different, so in real-life you might take a different approach. The techniques I am using here were suitable for me, and my own small scale testing. I tried to make sure they are valid, but you may want to allocate more resources in your environment. Test, test, test on your own environment on test kit first!

Step 1: Create an initialisation file

The first step is the preparation of an initialisation file. I take NCDBA’s parameter file as the template and transfer it over to rac12sec1:

SQL> create pfile='/tmp/initNCDBA1.ora' from spfile;

File created.

[oracle@rac12pri1 ~]$ scp /tmp/initNCDBA1.ora rac12sec1:/tmp
oracle@rac12sec1's password: 
initNCDBA1.ora                                            100% 1515     1.5KB/s   00:00    
[oracle@rac12pri1 ~]$ 

The file is a bit verbose and needs some editing. I appreciate that RMAN replaces the file with a copy anyway, but I like to keep it tidy, at least at first … The end result is shown here; as always, make sure that your version of the pfile matches your site’s requirements and standards. The important part is to get the directory paths right, and to ensure that db_name and db_unique_name are set correctly.

*.audit_file_dest='/u01/app/oracle/admin/NCDBB/adump'
*.audit_trail='db'
*.cluster_database=true
*.compatible='12.1.0.2.0'
*.db_block_size=8192
*.db_create_file_dest='+DATA'
*.db_domain=''
*.db_name='NCDBA'
*.db_unique_name='NCDBB'
*.db_recovery_file_dest='+RECO'
*.db_recovery_file_dest_size=4785m
*.diagnostic_dest='/u01/app/oracle'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=NCDBBXDB)'
NCDBB1.instance_number=1
NCDBB2.instance_number=2
*.open_cursors=300
*.pga_aggregate_target=1024m
*.processes=300
*.remote_login_passwordfile='exclusive'
*.sga_target=3072m
NCDBB2.thread=2
NCDBB1.thread=1
NCDBB1.undo_tablespace='UNDOTBS1'
NCDBB2.undo_tablespace='UNDOTBS2'

The file needs to be transferred to $ORACLE_HOME/dbs as initNCDBB1.ora.
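In my case that is a simple copy on rac12sec1 (the source path is the one used for the scp earlier):

[oracle@rac12sec1 ~]$ cp /tmp/initNCDBA1.ora $ORACLE_HOME/dbs/initNCDBB1.ora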

Step 2: Create a temporary password file

You undoubtedly spotted that remote_login_passwordfile is set, which is a prerequisite for connecting to the database as SYS via the network. A password file is also required for active database duplication; without one the duplicate wouldn’t work. Since Oracle takes care of the password file during the RMAN duplicate it appears to be sufficient to simply create one in $ORACLE_HOME/dbs on rac12sec1:

[oracle@rac12sec1 dbs]$ orapwd file=orapwNCDBB1

Enter password for SYS:
[oracle@rac12sec1 dbs]$

During my testing this was enough to proceed to the next step.

Step 3: Duplicate the database “for standby”

The system should now be ready for the standby database’s creation. It’s been a bit of a task to get there, but from now on things will be easier … Using the *_DUP entries in the system’s tnsnames.ora file I connect to the target and auxiliary database, followed by the command to create the standby. In my case the database is more than tiny and the RMAN defaults are enough. Larger databases would probably need different settings and commands.


[oracle@rac12sec1 dbs]$ rman target sys@ncdba_dup auxiliary sys@ncdbb_dup

Recovery Manager: Release 12.1.0.2.0 - Production on Wed Dec 14 10:25:47 2016

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

target database Password: 
connected to target database: NCDBA (DBID=479005270)
auxiliary database Password: 
connected to auxiliary database: NCDBA (DBID=479005270, not open)

RMAN> startup clone nomount

connected to auxiliary database (not started)
Oracle instance started

Total System Global Area    3221225472 bytes

Fixed Size                     2929552 bytes
Variable Size                838863984 bytes
Database Buffers            2365587456 bytes
Redo Buffers                  13844480 bytes

RMAN> duplicate target database for standby from active database;

Starting Duplicate Db at 14-DEC-16
allocated channel: ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: SID=240 instance=NCDBB1 device type=DISK

contents of Memory Script:
{
   backup as copy reuse
   targetfile  '+DATA/NCDBA/PASSWORD/pwdncdba.256.930561891' auxiliary format
 '/u01/app/oracle/product/12.1.0.2/dbhome_1/dbs/orapwNCDBB1'   ;
}
executing Memory Script

Starting backup at 14-DEC-16
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=276 instance=NCDBA1 device type=DISK
Finished backup at 14-DEC-16

contents of Memory Script:
{
   sql clone "create spfile from memory";
   shutdown clone immediate;
   startup clone nomount;
   restore clone from service  'ncdba_dup' standby controlfile;
}  
executing Memory Script

sql statement: create spfile from memory

Oracle instance shut down

sql statement: alter system archive log current

contents of Memory Script:
{  
   switch clone datafile all;
}
executing Memory Script

datafile 1 switched to datafile copy
input datafile copy RECID=7 STAMP=930565247 file name=+DATA/NCDBB/DATAFILE/system.259.930565213
datafile 2 switched to datafile copy
input datafile copy RECID=8 STAMP=930565247 file name=+DATA/NCDBB/DATAFILE/undotbs2.260.930565221
datafile 3 switched to datafile copy
input datafile copy RECID=9 STAMP=930565247 file name=+DATA/NCDBB/DATAFILE/sysaux.261.930565223
datafile 4 switched to datafile copy
input datafile copy RECID=10 STAMP=930565247 file name=+DATA/NCDBB/DATAFILE/undotbs1.262.930565231
datafile 5 switched to datafile copy
input datafile copy RECID=11 STAMP=930565247 file name=+DATA/NCDBB/DATAFILE/example.263.930565233
datafile 6 switched to datafile copy
input datafile copy RECID=12 STAMP=930565247 file name=+DATA/NCDBB/DATAFILE/users.264.930565241
Finished Duplicate Db at 14-DEC-16

RMAN>

Perfect – the duplicate finished ok and all 6 datafiles have been transferred to my standby cluster. There are a few things worth noting here:

  • The password file of the primary (residing in ASM) has been copied to ‘/u01/app/oracle/product/12.1.0.2/dbhome_1/dbs/orapwNCDBB1’ (not in ASM)
  • The standby database’s spfile has been created from memory – a lot of unnecessary and some obsolete parameters are now in the standby database’s spfile

Step 4: Finishing Touches

There are a few things still outstanding before declaring the environment ready. First of all you need to register the standby database in the Cluster Registry (OCR). Use srvctl for this task. One way to do this is shown here:

[oracle@rac12sec1 dbs]$ srvctl add database -db NCDBB -oraclehome $ORACLE_HOME -dbtype RAC \
> -role PHYSICAL_STANDBY -startoption MOUNT -dbname NCDBA -diskgroup "DATA,RECO"

[oracle@rac12sec1 dbs]$ srvctl add instance -db NCDBB -instance NCDBB1 -node rac12sec1
[oracle@rac12sec1 dbs]$ srvctl add instance -db NCDBB -instance NCDBB2 -node rac12sec2

When registering your standby database make sure you are license compliant!

Next I move the password file to ASM. The file currently resides on rac12sec1 in the $ORACLE_HOME/dbs directory. You can use the orapwd command to move it. If you registered the database in the OCR as I did in the previous step you will get an updated configuration as a result – the password file configuration is changed automatically.
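One way of doing this, a sketch only using the file names from my lab, is to let orapwd copy the existing file into ASM; the dbuniquename parameter ties the new location to the registered database, which is how the configuration ends up being updated:

[oracle@rac12sec1 dbs]$ orapwd input_file='/u01/app/oracle/product/12.1.0.2/dbhome_1/dbs/orapwNCDBB1' \
> file='+DATA/NCDBB/orapwncdbb' dbuniquename='NCDBB'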

[oracle@rac12sec1 dbs]$ srvctl config database -db NCDBB | grep -i password
Password file: +DATA/NCDBB/orapwncdbb

Since I like error messages when the database starts even less than unnecessary parameters in my spfile I tend to remove these from the standby database. And besides – in its current form the spfile wouldn’t allow me to mount both RAC instances anyway. For some odd reason the mapping between SID and instances is lost during the RMAN duplicate.

So I start off by creating a pfile from the spfile, removing all unneeded underscore parameters and deprecated ones. In the resulting file I ensure that all the necessary RAC parameters are present. The minimalistic end result looks like this for my database:

*.audit_file_dest='/u01/app/oracle/admin/NCDBB/adump'
*.audit_trail='DB'
*.cluster_database=TRUE
*.compatible='12.1.0.2.0'
*.control_files='+DATA/NCDBB/CONTROLFILE/current.258.930565201',
   '+RECO/NCDBB/CONTROLFILE/current.258.930565201'#Restore Controlfile
*.db_block_size=8192
*.db_create_file_dest='+DATA'
*.db_domain=''
*.db_name='NCDBA'
*.db_recovery_file_dest='+RECO'
*.db_recovery_file_dest_size=4785M
*.db_unique_name='NCDBB'
*.diagnostic_dest='/u01/app/oracle'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=NCDBBXDB)'
NCDBB1.instance_number=1
NCDBB2.instance_number=2
*.open_cursors=300
*.pga_aggregate_target=1G
*.processes=300
*.remote_login_passwordfile='EXCLUSIVE'
*.sga_target=3G
NCDBB2.thread=2
NCDBB1.thread=1
NCDBB1.undo_tablespace='UNDOTBS1'
NCDBB2.undo_tablespace='UNDOTBS2'

Next I can move the spfile to ASM and update the OCR profile. After these steps, the OCR profile reflects the new location of the spfile. Be sure to check all $ORACLE_HOME/dbs directories on the cluster nodes to prevent local pfiles/spfiles taking precedence over the one referred to in the OCR.
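Sketched out with my lab names, and assuming the cleaned-up pfile from the previous step was saved as /tmp/initNCDBB.ora, those two steps could look like this:

SQL> create spfile='+DATA/NCDBB/spfileNCDBB' from pfile='/tmp/initNCDBB.ora';

[oracle@rac12sec1 dbs]$ srvctl modify database -db NCDBB -spfile +DATA/NCDBB/spfileNCDBB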

[oracle@rac12sec1 dbs]$ srvctl config database -d NCDBB | grep spfile
Spfile: +DATA/NCDBB/spfileNCDBB

Everything is awesome!

It’s been a bit of a journey, but it was worth it. Before continuing with the remaining configuration steps, here is a little bit of detail about the system as it is:

[oracle@rac12sec1 dbs]$ srvctl status database -db NCDBB -verbose
Instance NCDBB1 is running on node rac12sec1. Instance status: Mounted (Closed).
Instance NCDBB2 is running on node rac12sec2. Instance status: Mounted (Closed).
[oracle@rac12sec1 dbs]$ 

SQL> select name,db_unique_name,database_role,log_mode,force_logging,flashback_on from v$database;

NAME      DB_UNIQUE_NAME   DATABASE_ROLE    LOG_MODE     FORCE_LOGGING   FLASHBACK_ON
--------- ---------------- ---------------- ------------ --------------- ------------------
NCDBA     NCDBB            PHYSICAL STANDBY ARCHIVELOG   YES             NO

SQL> select * from v$active_instances;

INST_NUMBER INST_NAME                                                        CON_ID
----------- ------------------------------------------------------------ ----------
          1 rac12sec1:NCDBB1                                                      0
          2 rac12sec2:NCDBB2                                                      0

In my initial drafts I intended to continue this article with the Data Guard configuration, but having looked at the WordPress-formatted part 3 up to here I decided to split it into 2 pieces. Next up is the Data Guard Broker configuration, followed by more posts about switching over and application design considerations. Stay tuned!

Creating a RAC 12.1 Data Guard Physical Standby environment (2)

In the first part of this mini-series you saw me define the environment as well as creating a primary database. With that out of the way it’s time to think about the standby. Before the standby can be created, a few preparations are necessary both on the primary as well as the standby cluster.

NOTE: As always, this is just a demonstration using VMs in my lab, based on my notes. Your system is most likely different, so in real-life you might take a different approach. The techniques I am using here were suitable for me, and my own small scale testing. I tried to make sure they are valid, but you may want to allocate more resources in your environment. Test, test, test on your own environment on test kit first!

Preparing the Creation of the Standby Database

It’s easier to follow the discussion if you keep in mind that my primary database is NCDBA and the standby is NCDBB.

Step 1: Listener Configuration

The first step to be completed is the network setup. I am planning on using active database duplication, which requires the node listeners on the standby cluster to “know” about my database instances. Thankfully, since RAC 12.1 we don’t need to register the *_DGMGRL service (used for switchover operations by the Broker) anymore, making life a bit easier. The static registration of databases may not be necessary if you are using a different RMAN duplicate approach.

By default Oracle does not statically register databases, so it’s up to you to do so. On rac12sec1 I added a SID_LIST_LISTENER section (note: only the node listener!) to /u01/app/12.1.0.2/grid/network/admin/listener.ora. Remember that the node listeners reside in the GRID_HOME.

...
SID_LIST_LISTENER =
  (SID_LIST =
...
    (SID_DESC =
      (GLOBAL_DBNAME = NCDBB)
      (ORACLE_HOME = /u01/app/oracle/product/12.1.0.2/dbhome_1)
      (SID_NAME = NCDBB1)
    )
...
  )
...

Likewise, I added a static registration to rac12sec2, identical to the previous example except for the SID_NAME, which is NCDBB2 of course. I then reloaded the listeners on both nodes.

Step 2: Service Name Configuration

The next file to change is the RDBMS home’s tnsnames.ora. To make sure I cover all eventualities I created the following tnsnames.ora on all cluster nodes, primary and standby RAC:

[oracle@rac12pri1 ~]$ cat $ORACLE_HOME/network/admin/tnsnames.ora 
# tnsnames.ora Network Configuration File: /u01/app/oracle/product/12.1.0.2/dbhome_1/network/admin/tnsnames.ora
# Generated by Oracle configuration tools.

NCDBA =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac12pri-scan)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = NCDBA)
    )
  )

NCDBB =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac12sec-scan)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = NCDBB)
    )
  )

# only used for RMAN duplicate ... from active database
NCDBA_DUP =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac12pri1-vip)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac12pri2-vip)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = NCDBA)
    )
  )

NCDBB_DUP =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac12sec1-vip)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac12sec2-vip)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = NCDBB)
    )
  )

The first two entries are self-explanatory. The second pair requires a little more explanation. Since I have statically registered my standby database with the node listeners only (i.e. not with the SCAN listeners; I never tried that and don’t like the thought of it), the duplicate would fail trying to restart the auxiliary instance. I haven’t captured the output of the failed RMAN duplicate, but it’s possible to show you the effect of using the “wrong” TNS name with a simple test (the database I want to connect to is of course down):

[oracle@rac12pri1 ~]$ sqlplus sys@NCDBB as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on Wed Dec 14 14:36:49 2016

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Enter password: 
ERROR:
ORA-12514: TNS:listener does not currently know of service requested in connect
descriptor


Enter user-name: 
[oracle@rac12pri1 ~]$ sqlplus sys@NCDBB_DUP as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on Wed Dec 14 14:36:59 2016

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Enter password: 
Connected to an idle instance.

SQL> 

So it seems to be better to use the net service name where I can be sure the database can be controlled from a remote session :)

NB: this test required a password file for the standby database which doesn’t exist yet and will be created in part 3.

Step 3: Modify the oratab file

You should also change the oratab file and add NCDBB on both standby hosts, as shown here for rac12sec2.

[oracle@rac12sec2 ~]$ grep NCDBB /etc/oratab
NCDBB:/u01/app/oracle/product/12.1.0.2/dbhome_1:N

Step 4: Prepare the Primary Database: Archivelog mode, Force Logging, and maybe even Flashback Database

There are a number of tasks to be completed on the primary database before I can start duplicating it for a standby environment. I need to make sure it’s in Archivelog mode, and that force logging is enabled. If I can afford the overhead, I enable Flashback Database as well to help reinstate a failed primary should the need arise. Of course, you don’t just enable Flashback Database: it can have a negative impact on performance, and the flashback logs will use space. Your FRA space usage calculation can be derailed if you are not careful. Thorough testing is needed to assess whether it is ok to operate with Flashback Database enabled.
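Should you decide to enable it, it boils down to very little; a quick sketch (db_recovery_file_dest is already set in my environment):

SQL> alter database flashback on;

SQL> select flashback_on from v$database;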

The first task in my scenario is to enable Archivelog mode and to enforce logging.

[oracle@rac12pri1 ~]$ sq

SQL*Plus: Release 12.1.0.2.0 Production on Wed Dec 14 09:52:11 2016

Copyright (c) 1982, 2014, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL> archive log list
Database log mode              No Archive Mode
Automatic archival             Disabled
Archive destination            USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence     15
Current log sequence           16
SQL> exit

...

[oracle@rac12pri1 ~]$ srvctl stop database -d NCDBA
[oracle@rac12pri1 ~]$ sq

SQL*Plus: Release 12.1.0.2.0 Production on Wed Dec 14 10:00:34 2016

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup mount
ORACLE instance started.

Total System Global Area 3221225472 bytes
Fixed Size                  2929552 bytes
Variable Size             838863984 bytes
Database Buffers         2365587456 bytes
Redo Buffers               13844480 bytes
Database mounted.
SQL> alter database archivelog;

Database altered.

SQL> alter database force logging;

Database altered.

SQL>  alter database open;

Database altered.

SQL> exit

It is easy to check if that was a success:

[oracle@rac12pri1 ~]$ srvctl start instance -db NCDBA -node rac12pri2

[oracle@rac12pri1 ~]$ sq

SQL*Plus: Release 12.1.0.2.0 Production on Wed Dec 14 15:04:28 2016

Copyright (c) 1982, 2014, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL> archive log list
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence     15
Next log sequence to archive   16
Current log sequence           16

Step 5: Add Standby Redo Logs

In order to use Real Time Apply you need Standby Redo Logs. If you create them now (i.e. before the RMAN duplicate) you save time. First of all, you don’t need to create Standby Redo Logs (SRLs) post switchover when NCDBA operates as a standby database. Secondly, if there are SRLs in place, they are automatically created on the standby database as part of the RMAN duplicate.

You need n+1 SRLs per thread, where n is the number of online redo log groups in that thread. You can work the number out by looking at v$log. My tiny database has 2 threads with 2 groups of 2 members each – ok for the lab, almost certainly not ok for anything serious. It helps me keep the storage footprint in the lab low though.

SQL> select * from v$log;

    GROUP#    THREAD#  SEQUENCE#      BYTES  BLOCKSIZE    MEMBERS ARC STATUS           FIRST_CHANGE# FIRST_TIM NEXT_CHANGE# NEXT_TIME     CON_ID
---------- ---------- ---------- ---------- ---------- ---------- --- ---------------- ------------- --------- ------------ --------- ----------
         1          1         15   52428800        512          2 YES INACTIVE               1621242 14-DEC-16      1621618 14-DEC-16          0
         2          1         16   52428800        512          2 NO  CURRENT                1621618 14-DEC-16   2.8147E+14                    0
         3          2          1   52428800        512          2 YES INACTIVE               1620677 14-DEC-16      1628674 14-DEC-16          0
         4          2          2   52428800        512          2 NO  CURRENT                1629454 14-DEC-16   2.8147E+14 14-DEC-16          0

SQL> 
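If you prefer to let the database do the counting, here is a quick sketch of a query that derives the n+1 figure per thread:

select thread#, count(*) + 1 as recommended_srls
from v$log
group by thread#
order by thread#;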

With that information I can create 3 SRLs per thread. Did I mention that I love Oracle Managed Files?

SQL> alter database add standby logfile thread 1 size 52428800;

Database altered.

SQL> /

Database altered.

SQL> /

Database altered.

SQL> alter database add standby logfile thread 2 size 52428800;

Database altered.

SQL> /

Database altered.

SQL> /

Database altered.

SQL> select group#,thread#,sequence#,bytes,status from v$standby_log;

    GROUP#    THREAD#  SEQUENCE#      BYTES STATUS
---------- ---------- ---------- ---------- ----------
         5          1          0   52428800 UNASSIGNED
         6          1          0   52428800 UNASSIGNED
         7          1          0   52428800 UNASSIGNED
         8          2          0   52428800 UNASSIGNED
         9          2          0   52428800 UNASSIGNED
        10          2          0   52428800 UNASSIGNED

6 rows selected.

The primary database should be in relatively good shape now – Flashback Database is off to simulate the case where my fictional load is considered to impact performance too much.

SQL> select name,db_unique_name,database_role,log_mode,force_logging,flashback_on from v$database;

NAME      DB_UNIQUE_NAME  DATABASE_ROLE    LOG_MODE     FORCE_LOGGING  FLASHBACK_ON
--------- --------------- ---------------- ------------ -------------- ------------------
NCDBA     NCDBA           PRIMARY          ARCHIVELOG   YES            NO

SQL>

Summary

This concludes part 2. In this blog post I shared my notes about configuring the network for RMAN duplicate … from active database. I also configured the primary database in preparation for the initial duplicate command.

Note that I’m not setting any initialisation parameters related to Data Guard: over time I have become a great fan of the Data Guard Broker and plan on using that instead of a manual configuration.