Monthly Archives: April 2015

Little things worth knowing: Is there a penalty in establishing a connection to Oracle using the MAA connection string? Part 2

In the very lengthy previous post about the MAA connect string I wanted to explain the use of the MAA connection string as promoted by Oracle. I deliberately kept the first part simple: both primary and standby cluster were up, and although the database was operating in the primary role on what I called standby cluster (again it’s probably not a good idea to include the intended role in the infrastructure names) there was no penalty establishing a connection.

As pointed out by readers of the blog entry that is of course only one part of the story (you may have guessed by the TNS alias MAA_TEST1 … there are more to come). When you define the connection string this way, you cater for the situation where something goes wrong. Let’s try a few more scenarios. Here is the current Data Guard situation:

DGMGRL> show configuration

Configuration - martin

  Protection Mode: MaxPerformance
  Members:
  CDB   - Primary database
    Error: ORA-16810: multiple errors or warnings detected for the database

    STDBY - Physical standby database 
      Error: ORA-12543: TNS:destination host unreachable

Fast-Start Failover: DISABLED

Configuration Status:
ERROR   (status updated 71 seconds ago)
DGMGRL> show database 'CDB'

Database - CDB

  Role:               PRIMARY
  Intended State:     TRANSPORT-ON
  Instance(s):
    CDB1
      Error: ORA-16737: the redo transport service for standby database "STDBY" has an error

    CDB2
      Error: ORA-16737: the redo transport service for standby database "STDBY" has an error

Database Status:
ERROR

As you can see CDB is operating in primary role, and redo transport is broken. This simple reason for broken redo transport is the fact that the standby cluster in its entirety is down. For the sake of completeness I have repeated the MAA connection string here:

MAA_TEST1 =
  (DESCRIPTION_LIST=
     (LOAD_BALANCE=off)(FAILOVER=on)
       (DESCRIPTION=
         (CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)
           (ADDRESS_LIST= (LOAD_BALANCE=on) (ADDRESS=(PROTOCOL=TCP)(HOST=rac12pri-scan)(PORT=1521)))
           (CONNECT_DATA=(SERVICE_NAME=rootsrv)))
       (DESCRIPTION= (CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)
           (ADDRESS_LIST= (LOAD_BALANCE=on) (ADDRESS=(PROTOCOL=TCP)(HOST= rac12sby-scan)(PORT=1521)))
           (CONNECT_DATA=(SERVICE_NAME=rootsrv)))
  )

Scenario 1: shut down of the standby cluster, database in primary role on primary cluster

This does not seem to be much of a problem. Using the same test as in part 1 I get no noteworthy penalty at all:

[oracle@lab tns]$ time sqlplus system/secret@maa_test1 <<EOF
select sys_context('userenv','instance_name') from dual;
exit
EOF

SQL*Plus: Release 12.1.0.2.0 Production on Thu Apr 23 10:11:53 2015

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Last Successful login time: Thu Apr 23 2015 09:59:48 +01:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL> 
SYS_CONTEXT('USERENV','INSTANCE_NAME')
--------------------------------------------------------------------------------
CDB2

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real	0m0.246s
user	0m0.020s
sys	0m0.025s
[oracle@lab tns]$ time sqlplus system/secret@maa_test1 <<EOF
select sys_context('userenv','instance_name') from dual;
exit
EOF

SQL*Plus: Release 12.1.0.2.0 Production on Thu Apr 23 10:12:12 2015

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Last Successful login time: Thu Apr 23 2015 10:11:52 +01:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL> 
SYS_CONTEXT('USERENV','INSTANCE_NAME')
--------------------------------------------------------------------------------
CDB1

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real	0m0.171s
user	0m0.018s
sys	0m0.018s
[oracle@lab tns]$ 

Trying from a 12.1.0.1.0 client on a different machine:

[oracle@oracledev ~]$ time sqlplus system/secret@maa_test1 <<EOF
> select sys_context('userenv','instance_name') from dual;
> exit
> EOF

SQL*Plus: Release 12.1.0.1.0 Production on Thu Apr 23 10:13:50 2015

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Last Successful login time: Thu Apr 23 2015 10:12:11 +01:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL> 
SYS_CONTEXT('USERENV','INSTANCE_NAME')
--------------------------------------------------------------------------------
CDB1

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real	0m0.427s
user	0m0.012s
sys	0m0.027s
[oracle@oracledev ~]$ time sqlplus system/secret@maa_test1 <<EOF
select sys_context('userenv','instance_name') from dual;
exit
EOF

SQL*Plus: Release 12.1.0.1.0 Production on Thu Apr 23 10:13:52 2015

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Last Successful login time: Thu Apr 23 2015 10:13:50 +01:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL> 
SYS_CONTEXT('USERENV','INSTANCE_NAME')
--------------------------------------------------------------------------------
CDB2

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real	0m0.158s
user	0m0.012s
sys	0m0.016s

That seems ok, but can be easily explained by the (LOAD_BALANCE=off)(FAILOVER=on) in the DESCRIPTION_LIST. But what if we do it the other way around? Let’s start by reversing the roles:

DGMGRL for Linux: Version 12.1.0.2.0 - 64bit Production

Copyright (c) 2000, 2013, Oracle. All rights reserved.

Welcome to DGMGRL, type "help" for information.
Connected as SYSDBA.
DGMGRL> switchover to 'STDBY'
Performing switchover NOW, please wait...
New primary database "STDBY" is opening...
Oracle Clusterware is restarting database "CDB" ...
Switchover succeeded, new primary is "STDBY"

DGMGRL> show configuration

Configuration - martin

  Protection Mode: MaxPerformance
  Members:
  STDBY - Primary database
    CDB   - Physical standby database 

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS   (status updated 48 seconds ago)


Scenario 2: black-out of the primary cluster, database in primary role on standby cluster

The next is the worst case: a date centre has gone down, and none of the hosts in the other cluster (rac12pri) is reachable. To simulate this I simply shut down rac12pri1 and rac12pri2, that’s as dead as it gets. The SCAN will still resolve in DNS, but there is neither a SCAN VIP nor a SCAN LISTENER anywhere to answer on the primary cluster.

As you saw in the output leading to this paragraph the primary database is started on rac12sby, nodes rac12sby1 and rac12sby2. In the current format, there is – for the first time – a connection penalty:

preventde
[oracle@rac12sby1 ~]$ time sqlplus system/secret@maa_test1 <<EOF
> select sys_context('userenv','instance_name') from dual;
> exit;
> EOF

SQL*Plus: Release 12.1.0.2.0 Production on Tue Apr 21 07:17:36 2015

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Last Successful login time: Mon Apr 20 2015 10:39:29 +01:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL>
SYS_CONTEXT('USERENV','INSTANCE_NAME')
--------------------------------------------------------------------------------
STDBY2

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real    0m35.770s
user    0m0.024s
sys     0m0.043s

[oracle@rac12sby1 ~]$ time sqlplus system/secret@maa_test1 <<EOF
select sys_context('userenv','instance_name') from dual;
exit;
EOF

SQL*Plus: Release 12.1.0.2.0 Production on Tue Apr 21 07:18:39 2015

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Last Successful login time: Tue Apr 21 2015 07:18:09 +01:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL>
SYS_CONTEXT('USERENV','INSTANCE_NAME')
--------------------------------------------------------------------------------
STDBY2

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real    0m36.219s
user    0m0.021s
sys     0m0.022s

During testing with 12c these ranged between 30 seconds and 38. Here is proof that I truly cannot connect to the primary cluster:

[oracle@lab tns]$ ping rac12pri-scan
PING rac12pri-scan.example.com (192.168.100.111) 56(84) bytes of data.
From lab.example.com (192.168.100.1) icmp_seq=1 Destination Host Unreachable
From lab.example.com (192.168.100.1) icmp_seq=2 Destination Host Unreachable
From lab.example.com (192.168.100.1) icmp_seq=3 Destination Host Unreachable
From lab.example.com (192.168.100.1) icmp_seq=4 Destination Host Unreachable
^C
--- rac12pri-scan.example.com ping statistics ---
5 packets transmitted, 0 received, +4 errors, 100% packet loss, time 4001ms

[oracle@rac12sby1 ~]$ time telnet rac12pri-scan 1521
Trying 192.168.100.112...
telnet: connect to address 192.168.100.112: No route to host
Trying 192.168.100.113...
telnet: connect to address 192.168.100.113: No route to host
Trying 192.168.100.111...
telnet: connect to address 192.168.100.111: No route to host

real	0m9.022s
user	0m0.001s
sys	0m0.003s

I found it interesting that the telnet command took almost no time at all to complete, yet trying to connect using SQL_Net it seems to take forever. Why could that be? Trace!

Getting to the bottom of this

Tracing can help sometimes, so I tried that. I try and keep my working TNS configuration separate from the production entries in $ORACLE_HOME. Using ~/tns I defined sqlnet.ora to read:

  • diag_adr_enabled = off
  • trace_level_client = support
  • trace_directory_client = /home/oracle/tns
  • log_directory_client = /home/oracle/tns
  • trace_unique_client = true

And connected again. I noticed time increases whenever nsc2addr was invoked:

[oracle@lab tns]$ grep "nsc2addr.*DESC" cli_6630.trc | nl
     1  (1947452928) [23-APR-2015 11:07:45:376] nsc2addr: (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.112)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
     2  (1947452928) [23-APR-2015 11:07:48:383] nsc2addr: (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.111)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
     3  (1947452928) [23-APR-2015 11:07:51:390] nsc2addr: (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.113)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
     4  (1947452928) [23-APR-2015 11:07:54:396] nsc2addr: (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.113)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
     5  (1947452928) [23-APR-2015 11:07:57:402] nsc2addr: (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.111)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
     6  (1947452928) [23-APR-2015 11:08:00:408] nsc2addr: (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.112)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
     7  (1947452928) [23-APR-2015 11:08:03:414] nsc2addr: (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.113)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
     8  (1947452928) [23-APR-2015 11:08:06:421] nsc2addr: (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.111)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
     9  (1947452928) [23-APR-2015 11:08:09:427] nsc2addr: (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.112)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
    10  (1947452928) [23-APR-2015 11:08:12:433] nsc2addr: (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.111)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
    11  (1947452928) [23-APR-2015 11:08:15:439] nsc2addr: (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.112)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
    12  (1947452928) [23-APR-2015 11:08:18:445] nsc2addr: (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.113)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
    13  (1947452928) [23-APR-2015 11:08:21:452] nsc2addr: (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.120)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))

Sorry for the lengthy listing but that way it shows best! The primary SCAN is set to 111,112 and 113 with the standby SCAN on 119,120, and 121.

What you can see clearly by looking at the timestamps is that the transport_connect_timout works perfectly well-a new address is tried every 3 seconds. What surprises me is that each SCAN VIP is tried 4 times before the second SCAN is tried (lines 1-12). I thought I had that limited to 3 retries … Line 13 is the first reference to the second SCAN, and as soon as that is referenced a connection is established.

That can’t be it, can it?

Hmm maybe Oracle went a bit too far by adding 3 retries to every single SCAN VIP. The whole process can be sped up by trying every SCAN VIP only once before failing over to the second SCAN. This can be done with the following connection string:

MAA_TEST2 =
  (DESCRIPTION_LIST=
     (LOAD_BALANCE=off)(FAILOVER=on)
     (CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)
       (DESCRIPTION=
           (ADDRESS_LIST= (LOAD_BALANCE=on) (ADDRESS=(PROTOCOL=TCP)(HOST=rac12pri-scan)(PORT=1521)))
           (CONNECT_DATA=(SERVICE_NAME=rootsrv)))
       (DESCRIPTION= 
           (ADDRESS_LIST= (LOAD_BALANCE=on) (ADDRESS=(PROTOCOL=TCP)(HOST= rac12sby-scan)(PORT=1521)))
           (CONNECT_DATA=(SERVICE_NAME=rootsrv)))
  )

Trying this works a lot better from a timing perspective:

[oracle@lab tns]$ time sqlplus system/secret@maa_test2<<EOF
exit
EOF


SQL*Plus: Release 12.1.0.2.0 Production on Thu Apr 23 11:35:17 2015

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Last Successful login time: Thu Apr 23 2015 11:08:21 +01:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real	0m9.829s
user	0m0.041s
sys	0m0.022s
[oracle@lab tns]$ 

Visible in the SQL*Net trace:

[oracle@lab tns]$ grep "nsc2addr.*DESC" cli_7305.trc | nl 
     1	(3474875904) [23-APR-2015 11:35:17:935] nsc2addr: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.113)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
     2	(3474875904) [23-APR-2015 11:35:20:940] nsc2addr: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.112)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
     3	(3474875904) [23-APR-2015 11:35:23:948] nsc2addr: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.111)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
     4	(3474875904) [23-APR-2015 11:35:26:956] nsc2addr: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.119)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))

That was the solution to the “slow” connectivity that the person who asked me the question in Paris shared, but I haven’t been able to find proof as to why it works better: this is it. Now the question remains why the MAA string in the Application Continuity white paper is more complex? I think what’s failing here is the CONNECT_TIMEOUT. According to tnsnames.ora reference in 12c the purpose of the parameter is “To specify the timeout duration in seconds for a client to establish an Oracle Net connection to an Oracle database”. You could read this so that after CONNECT_TIMEOUT is reached the failover to the next SCAN occurs. However it appears – by looking at the trace – that the connection timeout is similar to transport timeout and the lower of both is used.

There is a high probability that the syntax is in the document there for a reason, although it escapes me right now. I guess the penalty when creating new connections with connection pools doesn’t really matter that much, because you create the pool (where you “pay” the penalty) only once, and subsequently pull connections from there, which should be very fast.

Little things worth knowing: Is there a penalty in establishing a connection to Oracle using the MAA connection string?

Sorry for the long title!

I had a question during my session about “advanced RAC programming features” during the last Paris Oracle Meetup about the MAA connection string. I showed an example taken from the Appication Continuity White Paper (http://www.oracle.com/technetwork/database/options/clustering/application-continuity-wp-12c-1966213.pdf). Someone from the audience asked me if I had experienced any problems with it, such as very slow connection timeouts. I haven’t, but wanted to double-check anyway. This is a simplified test using a sqlplus connection since it is easier to time than a call to a connection pool creation. If you know of a way to reliably do so in Java/UCP let me know and I’ll test it.

My system is 12.1.0.2 on Oracle Linux 7.1 with UEK (3.8.13-55.1.6.el7uek.x86_64) and the following patches on the RDBMS and Grid Infrastructure side:

[oracle@rac12sby1 ~]$ opatch lspatches -oh /u01/app/12.1.0.2/grid
19872484;WLM Patch Set Update: 12.1.0.2.2 (19872484)
19769480;Database Patch Set Update : 12.1.0.2.2 (19769480)
19769479;OCW Patch Set Update : 12.1.0.2.2 (19769479)
19769473;ACFS Patch Set Update : 12.1.0.2.2 (19769473)

[oracle@rac12sby1 ~]$ opatch lspatches -oh /u01/app/oracle/product/12.1.0.2/dbhome_1
19877336;Database PSU 12.1.0.2.2, Oracle JavaVM Component (Jan2015)
19769480;Database Patch Set Update : 12.1.0.2.2 (19769480)
19769479;OCW Patch Set Update : 12.1.0.2.2 (19769479)

This is the January System Patch by the way, 20132450. At first I wanted to blog about the patch application but it was so uneventful I decided against it.

I have two clusters, rac12pri and rac12sby. Both consist of 2 nodes each. Database CDB is located on rac12pri, STDBY on rac12sby. Normally I am against naming databases by function as you might end up using STDBY in primary role after a data centre migration, and people tend to find the use of STDBY as primary database odd. In this case I hope it helps understanding the concept better, and it’s a lab environment anyway …

Creating the broker configuration

There is nothing special about creating the Broker Configuration, just remember to define the broker configuration files on shared storage and enabling automatic standby file management. I also recommend standby redo logs on both the primary and standby databases. Once you have the configuration in place, check the database(s) in verbose mode to get the broker connection string. You can copy/paste the connection string to sqlplus to ensure that every instance can be started thanks to a static listener registration (Data Guard will restart databases during switchover and failover operations, which is bound to fail unless the databases are statically registered with the listener). Here is what I mean:

DGMGRL> show instance verbose 'CDB1';

Instance 'CDB1' of database 'CDB'

  Host Name: rac12pri1
  PFILE:     
  Properties:
    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.108)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=CDB_DGMGRL)(INSTANCE_NAME=CDB1)(SERVER=DEDICATED)))'
    StandbyArchiveLocation          = 'USE_DB_RECOVERY_FILE_DEST'
    AlternateLocation               = ''
    LogArchiveTrace                 = '0'
    LogArchiveFormat                = '%t_%s_%r.dbf'
    TopWaitEvents                   = '(monitor)'

Instance Status:
SUCCESS

Now I can take the static connection identifier and use it (be sure to specify the “as sysdba” at the end):

[oracle@rac12pri1 ~]$ sqlplus 

SQL*Plus: Release 12.1.0.2.0 Production on Sat Apr 18 10:50:09 2015

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Enter user-name: sys@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.108)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=CDB_DGMGRL)(INSTANCE_NAME=CDB1)(SERVER=DEDICATED))) as sysdba
Enter password: 

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL>

When this worked for all instances in the configuration you are a big step further in your Data Guard setup!

Service setup

I have created a new service, named rootsrv on both databases.

[oracle@rac12sby1 ~]$ srvctl add service -d STDBY -s rootsrv -role primary -policy automatic -preferred STDBY1,STDBY2

[oracle@rac12pri1 ~]$ srvctl add service -d CDB -s rootsrv -role primary -policy automatic -preferred CDB1,CDB2

To better follow along here is the current status of the configuration (In real life you might want to use a different protection mode):

DGMGRL> show configuration

Configuration - martin

  Protection Mode: MaxPerformance
  Members:
  STDBY - Primary database
    CDB   - Physical standby database 

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS   (status updated 28 seconds ago)

As per its definition, the service is started on the primary but not on the standby. In fact, I can’t start it on the standby:

[oracle@rac12sby1 ~]$ srvctl status service -d STDBY -s rootsrv
Service rootsrv is running on instance(s) STDBY1,STDBY2

[oracle@rac12pri1 ~]$ srvctl status service -d CDB -s rootsrv
Service rootsrv is not running.
[oracle@rac12pri1 ~]$ srvctl start service -d CDB -s rootsrv
PRCD-1084 : Failed to start service rootsrv
PRCR-1079 : Failed to start resource ora.cdb.rootsrv.svc
CRS-2800: Cannot start resource 'ora.cdb.db' as it is already in the INTERMEDIATE state on server 'rac12pri1'
CRS-2632: There are no more servers to try to place resource 'ora.cdb.rootsrv.svc' on that would satisfy its placement policy
CRS-2800: Cannot start resource 'ora.cdb.db' as it is already in the INTERMEDIATE state on server 'rac12pri2'
[oracle@rac12pri1 ~]$

So no more need for database triggers to start and stop services (this was another question I had during my talk) depending on the database’s role.

The MAA connection string

The MAA connection string as taken from the white paper and slightly adapted is this one:

MAA_TEST1 =
  (DESCRIPTION_LIST=
     (LOAD_BALANCE=off)(FAILOVER=on)
       (DESCRIPTION=
         (CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)
           (ADDRESS_LIST= (LOAD_BALANCE=on) (ADDRESS=(PROTOCOL=TCP)(HOST=rac12pri-scan)(PORT=1521)))
           (CONNECT_DATA=(SERVICE_NAME=rootsrv)))
       (DESCRIPTION= (CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)
           (ADDRESS_LIST= (LOAD_BALANCE=on) (ADDRESS=(PROTOCOL=TCP)(HOST= rac12sby-scan)(PORT=1521)))
           (CONNECT_DATA=(SERVICE_NAME=rootsrv)))
  )

Please refer to the white paper for an explanation of the various parameters.

Notice that both the primary and standby SCAN are referenced in there, both connecting to “rootsrv” which is currently active on STDBY. To get some proper timing about the connection delay I use the following little snippet:


$ time sqlplus system/secret@maa_test1 <<EOF
> select sys_context('userenv','instance_name') from dual;
> exit
> EOF

Testing on the “primary cluster” first, the local database is operating in the standby role:

[oracle@rac12pri1 ~]$ time sqlplus system/secret@maa_test1 <<EOF
select sys_context('userenv','instance_name') from dual;
exit
EOF

SQL*Plus: Release 12.1.0.2.0 Production on Sat Apr 18 10:41:25 2015

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Last Successful login time: Sat Apr 18 2015 10:33:53 +01:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL>
SYS_CONTEXT('USERENV','INSTANCE_NAME')
--------------------------------------------------------------------------------
STDBY2

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real    0m0.432s
user    0m0.026s
sys     0m0.028s

[oracle@rac12pri1 ~]$ time sqlplus system/secret@maa_test1 <<EOF
select sys_context('userenv','instance_name') from dual;
exit
EOF

SQL*Plus: Release 12.1.0.2.0 Production on Sat Apr 18 10:31:25 2015

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Last Successful login time: Sat Apr 18 2015 10:31:24 +01:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL>
SYS_CONTEXT('USERENV','INSTANCE_NAME')
--------------------------------------------------------------------------------
STDBY2

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real    0m0.415s
user    0m0.023s
sys     0m0.025s

That was quick-not even a second. And as you can see the connection is set up against database STDBY on the other cluster.

Next on the standby cluster:

[oracle@rac12sby1 ~]$ time sqlplus system/secret@maa_test1 <<EOF
select sys_context('userenv','instance_name') from dual;
exit
EOF

SQL*Plus: Release 12.1.0.2.0 Production on Sat Apr 18 10:33:26 2015

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Last Successful login time: Sat Apr 18 2015 10:33:22 +01:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL>
SYS_CONTEXT('USERENV','INSTANCE_NAME')
--------------------------------------------------------------------------------
STDBY2

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real    0m0.613s
user    0m0.025s
sys     0m0.019s

[oracle@rac12sby1 ~]$ time sqlplus system/secret@maa_test1 <<EOF
select sys_context('userenv','instance_name') from dual;
exit
EOF

SQL*Plus: Release 12.1.0.2.0 Production on Sat Apr 18 10:33:52 2015

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Last Successful login time: Sat Apr 18 2015 10:33:50 +01:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL>
SYS_CONTEXT('USERENV','INSTANCE_NAME')
--------------------------------------------------------------------------------
STDBY2

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real    0m0.199s
user    0m0.024s
sys     0m0.021s

So this was quick too. And finally on a different machine altogether (my lab server)

[oracle@lab tns]$ time sqlplus system/secret@maa_test1 <<EOF
select sys_context('userenv', 'instance_name') from dual;
exit
EOF

SQL*Plus: Release 12.1.0.2.0 Production on Sat Apr 18 08:13:50 2015

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Last Successful login time: Sat Apr 18 2015 08:13:47 +01:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL> 
SYS_CONTEXT('USERENV','INSTANCE_NAME')
--------------------------------------------------------------------------------
STDBY2

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real	0m0.387s
user	0m0.020s
sys	0m0.018s

Does it change after a role reversal? I think I’ll cover that in a different post …

Appendix: setting up Data Guard for RAC

This is something I always forget so it’s here for reference. I use an identical tnsnames.ora file across all nodes, here it is:

CDB =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac12pri-scan)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = CDB)
    )
  ) 

STDBY =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac12sby-scan)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = STDBY)
    )
  ) 

# used for DG and RMAN duplicate
STDBY_NON_SCAN =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac12sby1-vip)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac12sby2-vip)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = STDBY)
    )
  )

CDB_NON_SCAN =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac12pri1-vip)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac12pri2-vip)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = CDB)
    )
  )

The non-scan TNS entries are a quick workaround to the problem that I registered the SIDs statically with the node listeners only, not the SCAN listeners. If there is a more elegant way of statically registering a database, please let me know! I couldn’t find any documentation on how to do this. I guess adding a SID_LIST_LISTENER_SCANx would do, too. Anyway, here is an example for the local node listener configuration (not the complete file contents). The ORACLE_SID and host names must be adapted for each node in the cluster.

LISTENER =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = IPC)(KEY = LISTENER))
    )
  )

# for broker
SID_LIST_LISTENER=
  (SID_LIST=
    (SID_DESC=
      (GLOBAL_DBNAME=STDBY)
      (ORACLE_HOME=/u01/app/oracle/product/12.1.0.2/dbhome_1)
      (SID_NAME=STDBY1))
    (SID_DESC=
      (GLOBAL_DBNAME=STDBY_DGMGRL)
      (ORACLE_HOME=/u01/app/oracle/product/12.1.0.2/dbhome_1)
      (SID_NAME=STDBY1))
  )

After changing the listener configuration you need to reload the local node listener.

Little things worth knowing: what does opatchauto actually do?

I recently applied system patch 20132450 to my 12.1.0.2.0 installation on a 2 node RAC system on Oracle Linux 7.1. While ensuring that OPatch is the latest version available I came across an interesting command line option in opatchauto. It is called “-generateSteps”.

[oracle@rac12sby1 ~]$ $ORACLE_HOME/OPatch/opatchauto apply -help
OPatch Automation Tool
Copyright (c) 2015, Oracle Corporation.  All rights reserved.


DESCRIPTION
    Apply a System Patch to Oracle Home. User specified the patch
    location or the current directory will be taken as the patch location.
    opatchauto must run from the GI Home as root user.

SYNTAX
<GI_HOME>/OPatch/opatchauto apply
                            [-analyze]
                            [-database <database names> ]
                            [-generateSteps]
                            [-invPtrLoc <Path to oraInst.loc> ]
                            [-jre <LOC> ] 
                            [-norestart ]
                            [-nonrolling ]
                            [-ocmrf <OCM response file location> ]
                            [-oh <OH_LIST> ]
                            [ <Patch Location> ]

OPTIONS


       -analyze
              This option runs all the required prerequisite checks to confirm
              the patchability of the system without actually patching or 
              affecting the system in any way.

       -database
              Used to specify the RDBMS home(s) to be patched. Option value 
              is oracle database name separated by comma.

       -generateSteps
              Generate the manual steps of apply session. These steps are what
              'opatchauto apply' does actually.

       -invPtrLoc
              Used to locate the oraInst.loc file when the installation used 
              the -invPtrLoc. This should be the path to the oraInst.loc file.

       -jre
              This option uses JRE (java) from the
              specified location instead of the default location
              under Oracle Home.

       -norestart
              This option tells opatchauto not to restart the Grid Infrastructure
              stack and database home resources after patching.

       -nonrolling
              This option makes the patching session run in 'nonrolling' mode.
              It is required that the stack on the local node is running while
              it must be stopped on all the remaining nodes before the patching
              session starts.

       -ocmrf 
              This option specifies the absolute path to the OCM response file. 
              It is required if the target Oracle Home doesn't have OCM 
              installed and configured.

       -oh
              This option specifies Oracle Home(s) to be patched. This can be a
              RAC home, a Grid Home or comma separated list of multiple homes.


PARAMETERS
      Patch Location
                    Path to the location for the patch. If the patch 
                    location is not specified, then the current
                    directory is taken as the patch location.


Example:
      To patch GI home and all RAC homes: 
      '<GI_HOME>/OPatch/opatchauto apply'

      To patch multiple homes:
      '<GI_HOME>/OPatch/opatchauto apply -oh <GI_HOME>,<RAC_HOME1>,<RAC_HOME2>'

      To patch databases running from RAC homes only:
      '<RAC_HOME>/OPatch/opatchauto apply -database db1,db2...dbn'

      To patch software-only installation:
      '<RAC_HOME>/OPatch/opatchauto apply -oh <RAC_HOME>' OR
      '<GI_HOME>/OPatch/opatchauto apply -oh <GI_HOME>'


opatchauto succeeded.
[oracle@rac12sby1 ~]$ 

So this could be quite interesting so I wanted to see what it does with the system patch. I don’t have a database installed in my cluster yet, it’s going to be the standby site for another cluster (rac12pri). Unfortunately opatchauto still skips RDBMS homes without a database in it in my tests so I end up using the “-oh” flag in this case. Here’s the result, formatted for better readability. In reality all calls to opatchauto are in a single line:

[root@rac12sby1 ~]# opatchauto apply /u01/patches/20132450 \
-oh /u01/app/12.1.0.2/grid -generateSteps -ocmrf /u01/patches/ocm.rsp
OPatch Automation Tool
Copyright (c) 2015, Oracle Corporation.  All rights reserved.

OPatchauto version : 12.1.0.1.5
OUI version        : 12.1.0.2.0
Running from       : /u01/app/12.1.0.2/grid

Invoking opatchauto utility "generateapplysteps"

To apply the patch, Please do the following manual actions:

Step 1 As the "oracle" user on the host "rac12sby1", please run the following commands:
  Action 1.1 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch version -oh /u01/app/12.1.0.2/grid -invPtrLoc \
/u01/app/12.1.0.2/grid/oraInst.loc -v2c 12.1.0.1.5

Step 2 As the "oracle" user on the host "rac12sby1", please run the following commands:
  Action 2.1 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch lsinventory -invPtrLoc \
/u01/app/12.1.0.2/grid/oraInst.loc -oh /u01/app/12.1.0.2/grid

Step 3 As the "oracle" user on the host "rac12sby1", please run the following commands:
  Action 3.1 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch prereq CheckComponents \
-ph /u01/patches/20132450/19769473 -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc \
-oh /u01/app/12.1.0.2/grid

  Action 3.2 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch prereq CheckComponents \
-ph /u01/patches/20132450/19769479 -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc \
-oh /u01/app/12.1.0.2/grid

  Action 3.3 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch prereq CheckComponents \
-ph /u01/patches/20132450/19769480 -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc \
-oh /u01/app/12.1.0.2/grid

  Action 3.4 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch prereq CheckComponents \
-ph /u01/patches/20132450/19872484 -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc \
-oh /u01/app/12.1.0.2/grid

Step 4 As the "oracle" user on the host "rac12sby1", please run the following commands:
  Action 4.1 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch prereq CheckConflictAgainstOH \
-ph /u01/patches/20132450/19769473 -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc \
-oh /u01/app/12.1.0.2/grid

  Action 4.2 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch prereq CheckConflictAgainstOH \
-ph /u01/patches/20132450/19769479 -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc \
-oh /u01/app/12.1.0.2/grid

  Action 4.3 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch prereq CheckConflictAgainstOH \
-ph /u01/patches/20132450/19769480 -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc \
-oh /u01/app/12.1.0.2/grid

  Action 4.4 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch prereq CheckConflictAgainstOH \
-ph /u01/patches/20132450/19872484 -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc \
-oh /u01/app/12.1.0.2/grid

Step 5 As the "root" user on the host "rac12sby1", please run the following commands:
  Action 5.1 [root@rac12sby1]#
  /u01/app/12.1.0.2/grid/perl/bin/perl -I/u01/app/12.1.0.2/grid/perl/lib \
-I/u01/app/12.1.0.2/grid/crs/install /u01/app/12.1.0.2/grid/crs/install/rootcrs.pl -prepatch

Step 6 As the "oracle" user on the host "rac12sby1", please run the following commands:
  Action 6.1 [oracle@rac12sby1]$
  echo /u01/patches/20132450/19769473 > /tmp/OraGI12Home1_patchList

  Action 6.2 [oracle@rac12sby1]$
  echo /u01/patches/20132450/19769479 >> /tmp/OraGI12Home1_patchList

  Action 6.3 [oracle@rac12sby1]$
  echo /u01/patches/20132450/19769480 >> /tmp/OraGI12Home1_patchList

  Action 6.4 [oracle@rac12sby1]$
  echo /u01/patches/20132450/19872484 >> /tmp/OraGI12Home1_patchList

  Action 6.5 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch napply -phBaseFile /tmp/OraGI12Home1_patchList \
-local  -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc -oh /u01/app/12.1.0.2/grid -silent

Step 7 As the "root" user on the host "rac12sby1", please run the following commands:
  Action 7.1 [root@rac12sby1]#
  /u01/app/12.1.0.2/grid/rdbms/install/rootadd_rdbms.sh

Step 8 As the "root" user on the host "rac12sby1", please run the following commands:
  Action 8.1 [root@rac12sby1]#
  /u01/app/12.1.0.2/grid/perl/bin/perl -I/u01/app/12.1.0.2/grid/perl/lib \
-I/u01/app/12.1.0.2/grid/crs/install /u01/app/12.1.0.2/grid/crs/install/rootcrs.pl -postpatch


Step 9 As the "oracle" user on the host "rac12sby1", please run the following commands:
  Action 9.1 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch lsinventory \
-invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc -oh /u01/app/12.1.0.2/grid | grep 19769473

  Action 9.2 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch lsinventory \
-invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc -oh /u01/app/12.1.0.2/grid | grep 19769479

  Action 9.3 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch lsinventory \
-invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc -oh /u01/app/12.1.0.2/grid | grep 19769480

  Action 9.4 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch lsinventory \
-invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc -oh /u01/app/12.1.0.2/grid | grep 19872484

You could of course use oplan which is bundled in $ORACLE_HOME/OPatch/oplan/oplan to generate a lot more detailed profile for the patch application. For a first glance at the activity opatchauto -generateSteps seems quite useful.

The steps all rang a bell from my 11.2.0.1 days when this was the procedure to patch Grid Infrastructure (thank god that’s over). What I didn’t recognise was rootadd_rdbms.sh. Looking at the file I can see it is changing ownership and permissions for oradism (important!) and some other executables in $ORACLE_HOME and fixes potential problems with “fs.file-max*” in /etc/sysctl.conf.

Little things worth knowing: direct path inserts and referential integrity

This is another post to remind myself that Oracle evolves, and what I thought I knew might no longer be relevant. So double-checking instead of assuming should become a habit!

Today’s example: direct path inserts.┬áI seemed to remember from Oracle 9i that a direct path insert┬áignores referential integrity. This is still confirmed in the 9i Release 2 Concepts Guide, chapter 19 “Direct Path Insert”. Quoting from there:

During direct-path INSERT operations, Oracle appends the inserted data after existing data in the table. Data is written directly into datafiles, bypassing the buffer cache. Free space in the existing data is not reused, and referential integrity constraints are ignored

That sounds a bit harsh in today’s times so it’s worth a test. On Oracle 12.1.0.2 I created a parent/child relationship, admittedly rather crude:

SQL> create table parent (id, vc) as
  2   select distinct data_object_id, subobject_name from dba_objects
  3   where data_object_id is not null;

Table created.

SQL> alter table parent add constraint pk_parent primary key (id);

Table altered.

SQL> create table child (id number, p_id number not null, vc varchar2(100),
  2  constraint pk_child primary key (id),
  3  constraint fk_parent_child foreign key (p_id) references parent (id));

Table created.

Now when I try to insert data using a direct path insert it fails:

SQL> insert /*+ append */ into child select s_child.nextval, -1, 'this will not work' from parent;
insert /*+ append */ into child select s_child.nextval, -1, 'this will not work' from parent
*
ERROR at line 1:
ORA-02291: integrity constraint (MARTIN.FK_PARENT_CHILD) violated - parent key not found

Which is brilliant and what I expected since it prevents me from writing a lot of garbage into my table. If you have developed with Oracle then you probably know about deferrable constraints. An existing constraint can’t be changed to a status of “initially deferrable”, which is why I have to drop it and then try the insert again:

SQL> alter table child drop constraint FK_PARENT_CHILD;

Table altered.

Elapsed: 00:00:00.04
SQL> alter table child add constraint FK_PARENT_CHILD foreign key (p_id) references parent (id)
  2  deferrable initially deferred;

Table altered.

Elapsed: 00:00:00.02
SQL> insert /*+ append */ into child select s_child.nextval, -1, 'this will not work' from parent;

7640 rows created.

Elapsed: 00:00:00.30
SQL> commit;
commit
*
ERROR at line 1:
ORA-02091: transaction rolled back
ORA-02291: integrity constraint (MARTIN.FK_PARENT_CHILD) violated - parent key not found

So this contraint fires as well! Good news for me. If it’s all ok then the insert will of course succeed.

Trouble is that Oracle “silently” ignores the direct path load! I haven’t initially put this into the post but thanks to Oren for adding it to it in the comments section (make sure to have a look at it).

Back to the “hot” constraint definition and inserting into the table yields the expected result.

SQL> alter table child drop constraint FK_PARENT_CHILD;

Table altered.

SQL> alter table child add constraint FK_PARENT_CHILD foreign key (p_id) references parent (id);

Table altered.

SQL> insert /*+ append */ into child select s_child.nextval, 2, 'this should work however' from parent;

7640 rows created.

SQL> commit;

Commit complete.

Summary

Again-the insert wasn’t really a “direct path insert”, see comments. It really pays off to verify and update “old knowledge” from time to time! I still prefer valid data over garbage, and when it comes to ILM I can “move” my table to get the required compression result.

References

Have fun!