Monthly Archives: April 2014

RAC 12c enhancements: adding an additional SCAN-part 2

In the first part of this article you could read how to add an additional network resource, additional VIPs and SCAN to an 12.1.0.1.2 cluster. In this part I hope to show you the next steps such as adding the SCAN listeners and other resources.

New SCAN listener

With the second SCAN added it is time to add the next set of SCAN listeners. This is really simple, and here is the code to add them:

[oracle@ron12cprinode1 ~]# srvctl add scan_listener -netnum 2 -listener dgscanlsnr

After starting the SCAN listeners on network 2, I can see they are indeed working correctly:

[oracle@ron12cprinode1 ~]$ srvctl start scan_listener -k 2

[oracle@ron12cprinode1 ~]$ srvctl config scan_listener -k 2
SCAN Listener DGSCANLSNR_SCAN1_NET2 exists. Port: TCP:1521
Registration invited nodes:
Registration invited subnets:
SCAN Listener DGSCANLSNR_SCAN2_NET2 exists. Port: TCP:1521
Registration invited nodes:
Registration invited subnets:
SCAN Listener DGSCANLSNR_SCAN3_NET2 exists. Port: TCP:1521
Registration invited nodes:
Registration invited subnets:
[oracle@ron12cprinode1 ~]$

This step needs to be repeated on the second cluster as well. As it’s more or less the same I didn’t repeat it here.

Now it’s getting slightly busier in the infrastructure! In between the various tasks I verified the status of Clusterware. At this point I had additional VIPs, a new SCAN, and new SCAN listeners.

crsctl stat res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       ron12cprinode1           STABLE
               ONLINE  ONLINE       ron12cprinode2           STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       ron12cprinode1           STABLE
               ONLINE  ONLINE       ron12cprinode2           STABLE
ora.OCR.dg
               ONLINE  ONLINE       ron12cprinode1           STABLE
               ONLINE  ONLINE       ron12cprinode2           STABLE
ora.RECO.dg
               ONLINE  ONLINE       ron12cprinode1           STABLE
               ONLINE  ONLINE       ron12cprinode2           STABLE
ora.asm
               ONLINE  ONLINE       ron12cprinode1           Started,STABLE
               ONLINE  ONLINE       ron12cprinode2           Started,STABLE
ora.net1.network
               ONLINE  ONLINE       ron12cprinode1           STABLE
               ONLINE  ONLINE       ron12cprinode2           STABLE
ora.net2.network
               ONLINE  ONLINE       ron12cprinode1           STABLE
               ONLINE  ONLINE       ron12cprinode2           STABLE
ora.ons
               ONLINE  ONLINE       ron12cprinode1           STABLE
               ONLINE  ONLINE       ron12cprinode2           STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.DGSCANLSNR_SCAN1_NET2.lsnr
      1        ONLINE  ONLINE       ron12cprinode2           STABLE
ora.DGSCANLSNR_SCAN2_NET2.lsnr
      1        ONLINE  ONLINE       ron12cprinode2           STABLE
ora.DGSCANLSNR_SCAN3_NET2.lsnr
      1        ONLINE  ONLINE       ron12cprinode2           STABLE
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       ron12cprinode1           STABLE
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       ron12cprinode1           STABLE
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       ron12cprinode1           STABLE
ora.cvu
      1        ONLINE  ONLINE       ron12cprinode1           STABLE
ora.oc4j
      1        OFFLINE OFFLINE                               STABLE
ora.ron.db
      1        ONLINE  ONLINE       ron12cprinode1           Open,STABLE
ora.ron.ron12c.svc
      1        ONLINE  ONLINE       ron12cprinode1           STABLE
ora.ron12cprinode1.vip
      1        ONLINE  ONLINE       ron12cprinode1           STABLE
ora.ron12cprinode1_2.vip
      1        ONLINE  ONLINE       ron12cprinode1           STABLE
ora.ron12cprinode2.vip
      1        ONLINE  ONLINE       ron12cprinode2           STABLE
ora.ron12cprinode2_2.vip
      1        ONLINE  ONLINE       ron12cprinode2           STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       ron12cprinode1           STABLE
ora.scan1_net2.vip
      1        ONLINE  ONLINE       ron12cprinode2           STABLE
ora.scan2.vip
      1        ONLINE  ONLINE       ron12cprinode1           STABLE
ora.scan2_net2.vip
      1        ONLINE  ONLINE       ron12cprinode2           STABLE
ora.scan3.vip
      1        ONLINE  ONLINE       ron12cprinode1           STABLE
ora.scan3_net2.vip
      1        ONLINE  ONLINE       ron12cprinode2           STABLE
--------------------------------------------------------------------------------

Additional Node Listeners

Now I need another set of listeners on the 2nd network. You can use netca for this, but it’s far easier to add it on the command line.

I was going ahead and use port 1522 for this new set of listeners to remove some of the ambiguity in having multiple listeners. There is not a technical need to do so, since the listeners listen on different networks (IP addresses) they could all use port 1521.

Here is the code for reference:

[oracle@ron12cprinode1 ~]$ srvctl add listener -listener LISTENER_DG -netnum 2
[oracle@ron12cprinode1 ~]$ srvctl config listener
Name: LISTENER
Network: 1, Owner: oracle
Home: <CRS home>
End points: TCP:1521
Name: LISTENER_DG
Network: 2, Owner: oracle
Home: <CRS home>
End points: TCP:1522

Who needs to start a GUI if this can be done with 1 line ;) Don’t forget to create the node listeners on the standby cluster as well.

Listener (cross) registration

One of the perks of 12c Clusterware is that you don’t need to set listener_networks manually anymore. If you haven’t heard about this particular parameter here is some background.

You have to define listener_networks to allow cross-registration of cluster-listener (SCAN!) and the node listener. Following the above example I need to ensure that the SCAN on network 1-the one created during the installation-registers with the listeners on the same network. I also need to ensure that my Data Guard SCAN listener registers with the DG node listeners. In versions prior to 12c this was done by manually setting the initialisation parameter listener_networks.

In 12c the documentation claims this happens automatically, and the ever-sceptic me didn’t want to believe this at first. But after a restart of the primary database I could see the following entries in the alert.log, these are set by Clusterware, not me:

NOTE: dependency between database ron and diskgroup resource ora.DATA.dg is established
ALTER SYSTEM SET local_listener=' (ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.51)(PORT=1521))' SCOPE=MEMORY SID='pri_1';
ALTER SYSTEM SET remote_listener=' ron12cpri-scan.example.com:1521' SCOPE=MEMORY SID='pri_1';
ALTER SYSTEM SET listener_networks='(( NAME=net2)(LOCAL_LISTENER=(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.102.51)(PORT=1522)))))','((NAME=net2)(REMOTE_LISTENER=ron12cpri-dgscan.dg.example.com:1521))' SCOPE=MEMORY SID='pri_1';
2014-04-28 02:44:46.242000 -04:00
ALTER DATABASE   MOUNT

192.168.100.51 is the IP address of ron12cprinode1-vip, and the local_listener defaults to the listener on that interface. The remote listener is also set to the recommended value automatically. The interesting bit lies in the next command, and I want to make sure it all works. 192.168.102.51 is the first VIP on the second (DG) network and you saw that ron12cpri-dgscan.dg.example.com has just been created. For those of you who remember setting listener_networks in 11.2 you will find this automation a big relief.

Did it work?

Cross registration is best checked with the listeners. There are quiet a few listeners to check (do a ps -ef | grep tns for fun on one of your cluster nodes):

  • local node listener on network 1
  • (any) SCAN listener on network 1
  • local node listener on network 2
  • (any) SCAN listener on network 2

A quick check to the listeners on the primary node reveals the following:

[oracle@ron12cprinode1 ~]$ lsnrctl status

LSNRCTL for Linux: Version 12.1.0.1.0 - Production on 28-APR-2014 02:50:25

Copyright (c) 1991, 2013, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 12.1.0.1.0 - Production
Start Date                23-APR-2014 10:55:44
Uptime                    4 days 15 hr. 54 min. 41 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/12.1.0.1/grid/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/ron12cprinode1/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.100.50)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.100.51)(PORT=1521)))
Services Summary...
Service "+ASM" has 1 instance(s).
  Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "pdb1" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "priXDB" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "ron" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "ron12c" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "ron_DGB" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
The command completed successfully
[oracle@ron12cprinode1 ~]$

As was expected, the node listener “knows” about my database instance. What about the initial SCAN listeners?

[oracle@ron12cprinode1 ~]$ lsnrctl status LISTENER_SCAN1

LSNRCTL for Linux: Version 12.1.0.1.0 - Production on 28-APR-2014 02:54:31

Copyright (c) 1991, 2013, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN1)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER_SCAN1
Version                   TNSLSNR for Linux: Version 12.1.0.1.0 - Production
Start Date                23-APR-2014 10:55:41
Uptime                    4 days 15 hr. 58 min. 50 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/12.1.0.1/grid/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/ron12cprinode1/listener_scan1/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_SCAN1)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.100.54)(PORT=1521)))
Services Summary...
Service "pdb1" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "priXDB" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "ron" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "ron12c" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "ron_DGB" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
The command completed successfully
[oracle@ron12cprinode1 ~]$

Looks ok too. The new Data Guard SCAN listener is next:

[oracle@ron12cprinode2 ~]$ lsnrctl status DGSCANLSNR_SCAN1_NET2

LSNRCTL for Linux: Version 12.1.0.1.0 - Production on 28-APR-2014 02:54:54

Copyright (c) 1991, 2013, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=DGSCANLSNR_SCAN1_NET2)))
STATUS of the LISTENER
------------------------
Alias                     DGSCANLSNR_SCAN1_NET2
Version                   TNSLSNR for Linux: Version 12.1.0.1.0 - Production
Start Date                28-APR-2014 02:16:46
Uptime                    0 days 0 hr. 38 min. 8 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/12.1.0.1/grid/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/ron12cprinode2/dgscanlsnr_scan1_net2/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=DGSCANLSNR_SCAN1_NET2)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.102.54)(PORT=1521)))
Services Summary...
Service "pdb1" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "ron" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "ron12c" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "ron_DGB" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
The command completed successfully
[oracle@ron12cprinode2 ~]$

And finally, what about the new LISTENER_DG, the Data Guard node listener:

[oracle@ron12cprinode1 admin]$ lsnrctl status LISTENER_DG

LSNRCTL for Linux: Version 12.1.0.1.0 - Production on 28-APR-2014 03:14:05

Copyright (c) 1991, 2013, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_DG)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER_DG
Version                   TNSLSNR for Linux: Version 12.1.0.1.0 - Production
Start Date                23-APR-2014 10:55:41
Uptime                    4 days 16 hr. 18 min. 25 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/12.1.0.1/grid/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/ron12cprinode1/listener_dg/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_DG)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.102.51)(PORT=1522)))
Services Summary...
Service "RON" has 3 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "pdb1" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "ron12c" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "ron_DGB" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
The command completed successfully
[oracle@ron12cprinode1 admin]$

So it appears the listeners have been able to register with the LREG-processes of my database. In the next part of the series I’ll add the static database registration to listener.ora and duplicate my database for use as a physical standby.

RAC 12c enhancements: adding an additional SCAN-part 1

Based on customer request Oracle has added the functionality to add a second SCAN, completely independent of the SCAN defined/created during the cluster creation. Why would you want to use this feature? A few reasons that spring to mind are:

  • Consolidation: customers insist on using a different network
  • Separate network for Data Guard traffic

To demonstrate the concept I am going to show you in this blog post how I

  1. Add a new network resource
  2. Create new VIPs
  3. Add a new SCAN
  4. Add a new SCAN listener

It actually sounds more complex than it is, but I have a feeling I need to split this article in multiple parts as it’s far too long.

The lab setup

When you install RAC 11.2 and 12.1 you are prompted to specify a Single Client Access Name, or SCAN. This SCAN is usually defined in the corporate DNS server and resolves to 3 IP addresses. This allows for an easy way to implement client-side load balancing. The SCAN is explained in more detail in Pro Oracle Database 11g RAC on Linux for 11.2 and on OTN for 11.2 and 12.1. To spice the whole configuration up a little bit I decided to use RAC One Node on the clusters I am using for this demonstration.

I created 2 12.1.0.1.2 clusters for this Data Guard test. Hosts ron12cprinode1 and ron12cprinode2 form the primary cluster, ron12csbynode1 and ron12csbynode2 will form the standby cluster. The RAC One Node database is named RON:

[oracle@ron12cprinode1 ~]$ srvctl config database -db ron
Database unique name: ron
Database name: ron
Oracle home: /u01/app/oracle/product/12.1.0.1/dbhome_1
Oracle user: oracle
Spfile: +DATA/ron/spfilepri.ora
Password file: +DATA/ron/orapwpri
Domain:
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: ron
Database instances:
Disk Groups: RECO
Mount point paths:
Services: ron12c
Type: RACOneNode
Online relocation timeout: 30
Instance name prefix: pri
Candidate servers: ron12cprinode1,ron12cprinode2
Database is administrator managed
[oracle@ron12cprinode1 ~]$

To make things even more interesting I defined my ORACLE_SID prefix on the primary to be “pri” and “sby” on the standby.

[oracle@ron12cprinode1 ~]$ ps -ef | grep smon
oracle    2553     1  0 Feb06 ?        00:00:09 asm_smon_+ASM1
oracle   15660 15578  0 05:05 pts/3    00:00:00 grep smon
oracle   28241     1  0 Feb07 ?        00:00:18 ora_smon_pri_1
[oracle@ron12cprinode1 ~]$

A quick check with gpnptool reveals the network usage before the addition of the second SCAN:

<gpnp:Network-Profile>
 <gpnp:HostNetwork id="gen" HostName="*">
  <gpnp:Network id="net1" IP="192.168.100.0" Adapter="eth0" Use="public"/>
  <gpnp:Network id="net2" IP="192.168.101.0" Adapter="eth1" Use="cluster_interconnect"/>
 </gpnp:HostNetwork>
</gpnp:Network-Profile>

There is the default network, (“netnum 1”) that is created on the network defined as “public” during the installation. I have another spare network port (eth2) reserved for the new network and Data Guard traffic. Currently network 1 is the only one available.

[root@ron12cprinode1 ~]# srvctl config network
Network 1 exists
Subnet IPv4: 192.168.100.0/255.255.255.0/eth0, static
Subnet IPv6:

As you can see RAC 12c now supports IPv6. I have another network available that I want to make available for Data Guard traffic. For this purpose I added all nodes into DNS. I am a bit old-fashioned when it comes to DNS, I am still using bind most of the time. Here is an excerpt of my reverse name resolution file:

; hosts - primary cluster
50	PTR	ron12cprinode1.dg.example.com.
51	PTR	ron12cprinode1-vip.example.com.
52	PTR	ron12cprinode2.dg.example.com.
53	PTR	ron12cprinode2-vip.dg.example.com.
; Data Guard SCAN - primary cluster
54	PTR	ron12cpri-scan.dg.example.com.
55	PTR	ron12cpri-scan.dg.example.com.
56	PTR	ron12cpri-scan.dg.example.com.

; hosts - standby cluster
57	PTR	ron12csbynode1.dg.example.com.
58	PTR	ron12csbynode1-vip.dg.example.com.
59	PTR	ron12csbynode2.dg.example.com.
60	PTR	ron12csbynode2-vip.dg.example.com.
; Data Guard SCAN - standby cluster
61	PTR	ron12csby-scan.dg.example.com.
62	PTR	ron12csby-scan.dg.example.com.
63	PTR	ron12csby-scan.dg.example.com.

The domain is *.dg.example.com, the primary database client traffic will be routed through *.example.com.

Adding the new network

The first step to be performed is to make Clusterware aware of the second network. I am doing this on both sides of the cluster. Notice that the primary nodes are called *pri* whereas the standby cluster is called *sby*

[root@ron12cprinode1 ~]# srvctl add network -netnum 2 -subnet 192.168.102.0/255.255.255.0/eth2 -nettype static -verbose
Successfully added Network.

[root@ron12csbynode1 ~]# srvctl add network -netnum 2 -subnet 192.168.102.0/255.255.255.0/eth2 -nettype static -verbose
Successfully added Network.

So this worked, now I have 2 networks:

[root@ron12cprinode1 ~]# srvctl config network
Network 1 exists
Subnet IPv4: 192.168.100.0/255.255.255.0/eth0, static
Subnet IPv6:
Network 2 exists
Subnet IPv4: 192.168.102.0/255.255.255.0/eth2, static
Subnet IPv6:
[root@ron12cprinode1 ~]#

In the next step I have to add VIPs for the new nodes on the *.dg.example.com subnet. The VIPs must be added on all cluster nodes, 4 in my case.

[oracle@ron12cprinode2 ~]$ srvctl add vip -h

Adds a VIP to the Oracle Clusterware.

Usage: srvctl add vip -node <node_name> -netnum <network_number> -address {<name>|<ip>}/<netmask>[/if1[|if2...]] [-skip] [-verbose]
    -node     <node_name>          Node name
    -address                       <vip_name|ip>/<netmask>[/if1[|if2...]] VIP address specification for node applications
    -netnum   <net_num>            Network number (default number is 1)
    -skip                          Skip reachability check of VIP address
    -verbose                       Verbose output
    -help                          Print usage
[oracle@ron12cprinode2 ~]$

So I did this on each node in my cluster

[root@ron12cprinode1 ~]# srvctl add vip -node ron12cprinode1 -netnum 2 -address 192.168.102.51/255.255.255.0/eth2 -verbose
Network exists: 2/192.168.102.0/255.255.255.0/eth2, type static
Successfully added VIP.
[root@ron12cprinode2 ~]# srvctl add vip -node ron12cprinode2 -netnum 2 -address 192.168.102.53/255.255.255.0/eth2 -verbose
Network exists: 2/192.168.102.0/255.255.255.0/eth2, type static
Successfully added VIP.

[root@ron12csbynode1 ~]# srvctl add vip -node ron12csbynode1 -netnum 2 -address 192.168.102.58/255.255.255.0/eth2 -verbose
Network exists: 2/192.168.102.0/255.255.255.0/eth2, type static
Successfully added VIP.
[root@ron12csbynode2 ~]# srvctl add vip -node ron12csbynode2 -netnum 2 -address 192.168.102.60/255.255.255.0/eth2 -verbose
Network exists: 2/192.168.102.0/255.255.255.0/eth2, type static
Successfully added VIP.

And I need to start the VIPs. They have some funny names as you can see in crsctl status resource (the names can’t be defined, see output of srvctl add scan -h above)

[root@ron12cprinode1 ~]# srvctl status vip -vip ron12cprinode1_2
VIP 192.168.102.51 is enabled
VIP 192.168.102.51 is not running
[root@ron12cprinode1 ~]# srvctl start vip -vip ron12cprinode1_2
[root@ron12cprinode1 ~]# srvctl start vip -vip ron12cprinode2_2
[root@ron12cprinode1 ~]# srvctl status vip -vip ron12cprinode1_2
VIP 192.168.102.51 is enabled
VIP 192.168.102.51 is running on node: ron12cprinode1
[root@ron12cprinode1 ~]#

Add the second SCAN

At this time you can add the second SCAN. The command syntax is shown here:

[oracle@ron12cprinode1 ~]$ srvctl add scan -h

Adds a SCAN VIP to the Oracle Clusterware.

Usage: srvctl add scan -scanname <scan_name> [-netnum <network_number>]
    -scanname <scan_name>          Domain name qualified SCAN name
    -netnum  <net_num>             Network number (default number is 1)
    -subnet                        <subnet>/<netmask>[/if1[|if2...]] NET address specification for network
    -help                          Print usage

Implemented on my first cluster node the command is easier to comprehend.

[root@ron12cprinode1 ~]# srvctl add scan -scanname ron12cpri-dgscan.dg.example.com -netnum 2

[root@ron12cprinode1 ~]# srvctl status scan
SCAN VIP scan1 is enabled
SCAN VIP scan1 is running on node ron12cprinode2
SCAN VIP scan2 is enabled
SCAN VIP scan2 is running on node ron12cprinode1
SCAN VIP scan3 is enabled
SCAN VIP scan3 is running on node ron12cprinode1

You need to create the SCAN on both clusters. On my primary cluster the SCAN has been created with the following configuration:

[root@ron12cprinode1 ~]# srvctl config scan -netnum 2
SCAN name: ron12cpri-dgscan.dg.example.com, Network: 2
Subnet IPv4: 192.168.102.0/255.255.255.0/eth2
Subnet IPv6:
SCAN 0 IPv4 VIP: 192.168.102.54
SCAN name: ron12cpri-dgscan.dg.example.com, Network: 2
Subnet IPv4: 192.168.102.0/255.255.255.0/eth2
Subnet IPv6:
SCAN 1 IPv4 VIP: 192.168.102.55
SCAN name: ron12cpri-dgscan.dg.example.com, Network: 2
Subnet IPv4: 192.168.102.0/255.255.255.0/eth2
Subnet IPv6:
SCAN 2 IPv4 VIP: 192.168.102.56
[root@ron12cprinode1 ~]#

You can see the new VIPs in the output of ifconfig, just as you would with the primary SCAN:

eth2      Link encap:Ethernet  HWaddr 52:54:00:FE:E2:D5
          inet addr:192.168.102.50  Bcast:192.168.102.255  Mask:255.255.255.0
          inet6 addr: fe80::5054:ff:fefe:e2d5/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:523 errors:0 dropped:0 overruns:0 frame:0
          TX packets:339 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:118147 (115.3 KiB)  TX bytes:72869 (71.1 KiB)

eth2:1    Link encap:Ethernet  HWaddr 52:54:00:FE:E2:D5
          inet addr:192.168.102.55  Bcast:192.168.102.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

So there is nothing too surprising in the output, it’s exactly the same as before with the public SCAN created during the installation.

End of part 1

This already seems like a lot of text to me so I think it’s time to pause here. The next parts will demonstrate the addition of the SCAN listeners, the new node listeners on the *.dg.example.com network and finally the duplication of the primary RAC One Node database for use as a standby database.

How important is a Disaster Recovery site for you?

I regularly read threads on the oracle-l mailing list, and occasionally feel very tempted to reply to one. Just recently I saw one that I liked a lot. It is specifically about using an Oracle Database Appliance (ODA) as a Disaster Recovery (DR) solution for an Exadata system. The Exadata configuration was not specified, I assume it was a smaller (eighth rack/quarter rack) configuration.

There were lots of arguments pro and against that Exadata->ODA architecture, and that leads to a broader question: how important is DR for your organisation? This blog post is about my personal experience, and probably strongly influenced by where I live in work (Europe), yours might be different.

About the original discussion

A number of replies stated that using an ODA as a DR solution is technically possible and acceptable if you don’t really plan to use it (and you have emails where you pointed out the shortcomings of that architecture).

The use of non-Exadata as a DR solution for Exadata is flawed in many points, and here is why:

Exadata, especially since 11.2.3.3.0, will make very elegant choices when it comes to caching data in the Smart Flash Cache. You will see very decent response times. You will also notice that smart scans will benefit from intelligent caching. This works so well that I had to rewrite some of my demos.

Smart scans and Smart Flash Cache do not exist outside the Exadata platform. Outside of Exadata you get no smart IO either, and no direct access to data compressed with Hybrid Columnar Compression.

The counter argument was that you might not need these anyway because these features are not used in your Exadata production system. But even if you don’t use HCC and smart scans in Exadata (I might ask you though-why?) you will feel the lack of the Smart Flash Cache, your ~1ms response times for many single block IOs might drop to ~6-8 ms most likely.

Others said if you have HCC compressed data then it has to be decompressed first before use (time/CPU intensive) and you might not have the space. If you have a 10x compression on a table that 100G table becomes 1TB. In the context of the ODA you might not even be able to decompress as disk space is limited compared to Exadata (even without HCC)

The broader discussion

The original poster had a very specific question about using an ODA (or other non-Exadata system) as a DR solution for an Exadata primary.

Abstracting from the original question I immediately thought about using non-identical hardware for DR. A common case I found in my career follows this pattern: the production system can’t cope with the workload anymore, so you get new kit. But since getting new kit is expensive, the budget owners (not DBAs!) decide to reuse the old production servers for DR.

I have seen this numerous times and always made sure I have an email trail where I said quite openly that this is not a Good Idea ™ and was overruled. I am a very cautious person.

So what would happen if you had to invoke DR? I haven’t seen this happen often (management often doesn’t hesitates to take that decision), but let’s assume you invoke DR on your old production servers. And guess what: the new hardware was so powerful that far less care was taken to ensure that code performs well-after all, the new hardware can deal with it. Except that the old hardware couldn’t in the first place, and that was the reason it was phased out. Oooops. DR that is inoperable is not really a solution.

How important is DR for you

This all boils down to the question “how important is DR for you”? I gathered that some users feel DR is just a tick in the box, and it isn’t really ever considered to be invoked.

In my opinion (and I formed that many years ago) DR is the only way to ensure business continuity for all but the smallest databases. When working with large data sets in the TB range it becomes quite unmanageable to fully restore that data and still meet the Recovery Time Objective (RTO). Modern technology gives us the opportunity to have an RTO of nearly 0 and critical applications surely need to meet this. In which case you can rule out a restore straight away.

But even if you had an hour to restore, it might be difficult to meet that target, depending on your backup strategy. Disk backups are great there but if the disks with your backups are gone, then you need to fall back to another restore method. Thirty minutes later it can become clear that at the current rate the 1 hour SLA can’t be met.

So you need to invoke DR, and the DR solution-again in my opinion-should perform just like production.

Final note

I should once more point out that the opinions listed here are mine. During every customer engagement I advocated the use of Data Guard or equivalent replication technology. I don’t believe a database restore is a viable option due to the time constraints around database operations, and sleep better knowing I have a working DR solution for my important systems.

Duplicate from the standby instead from the primary in 12c

This post is related to 12c and an active database duplication for a standby I did in my lab environment. I’d say although I first encountered it on 12c there is a chance you run into a similar situation with earlier releases too.

I would normally use ASM for all my databases to make my life easier but this time I had to be mindful of the available memory on the laptop-which at 8 GB-is not plenty. So I went with file system setup instead. After the initial preparations I was ready to launch the one-liner on the standby database:

RMAN> duplicate target database for standby from active database;

This worked away happily for a few moments only to come to an abrupt halt with the below error message. I have started the duplication process on the standby.

...
executing command: SET NEWNAME

executing command: SET NEWNAME

Starting backup at 01-APR-14
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile copy
input datafile file number=00012 name=/u01/oradata/CDB1/valid/sysaux01.dbf
RMAN-03009: failure of backup command on ORA_DISK_1 channel at 04/01/2014 20:34:42
ORA-17628: Oracle error 19505 returned by remote Oracle server
ORA-19505: failed to identify file ""
channel ORA_DISK_1 disabled, job failed on it will be run on another channel
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 04/01/2014 20:34:42
RMAN-05501: aborting duplication of target database
RMAN-03015: error occurred in stored script Memory Script

RMAN-03009: failure of backup command on ORA_DISK_1 channel at 04/01/2014 20:34:42
ORA-17628: Oracle error 19505 returned by remote Oracle server
ORA-19505: failed to identify file ""

Huh? File 12 is part of a PDB named VALID. It was definitely available on the file system on the source, and I even had a backup on disk for it… I first thought it had to do with the PDBs all closed on the source datbase but that was not the case. After opening the PDBs read-write I still had the same error. Strange. I thought about debugging and tracing RMAN but then decided to first try the duplication from the primary. Which was a good idea, because the primary actually gave me a more meaningful error message:

Starting restore at 19.03.2014 10:17:08
using channel ORA_AUX_DISK_1

channel ORA_AUX_DISK_1: starting datafile backup set restore
channel ORA_AUX_DISK_1: using network backup set from service cdb1
channel ORA_AUX_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_AUX_DISK_1: restoring datafile 00004 to /u01/oradata/STDBY1/undotbs01.dbf
channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:00:25
channel ORA_AUX_DISK_1: starting datafile backup set restore
channel ORA_AUX_DISK_1: using network backup set from service cdb1
channel ORA_AUX_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_AUX_DISK_1: restoring datafile 00005 to /u01/oradata/STDBY1/pdbseed/system01.dbf
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 03/19/2014 10:17:26
RMAN-05501: aborting duplication of target database
RMAN-03015: error occurred in stored script Memory Script
ORA-19849: error while reading backup piece from service cdb1
ORA-19504: failed to create file "/u01/oradata/STDBY1/pdbseed/system01.dbf"
ORA-27040: file create error, unable to create file
Linux-x86_64 Error: 2: No such file or directory
Additional information: 1

So I forgot to create the needed directories on the standby. For consistency I created a subdirectory per PDB on the primary, which I forgot on the standby host. You can see this here:

RMAN> report schema;

Report of database schema for database with db_unique_name CDB1

List of Permanent Datafiles
===========================
File Size(MB) Tablespace           RB segs Datafile Name
---- -------- -------------------- ------- ------------------------
1    790      SYSTEM               ***     /u01/oradata/CDB1/system01.dbf
3    750      SYSAUX               ***     /u01/oradata/CDB1/sysaux01.dbf
4    90       UNDOTBS1             ***     /u01/oradata/CDB1/undotbs01.dbf
5    250      PDB$SEED:SYSTEM      ***     /u01/oradata/CDB1/pdbseed/system01.dbf
6    5        USERS                ***     /u01/oradata/CDB1/users01.dbf
7    590      PDB$SEED:SYSAUX      ***     /u01/oradata/CDB1/pdbseed/sysaux01.dbf
11   270      VALID:SYSTEM         ***     /u01/oradata/CDB1/valid/system01.dbf
12   610      VALID:SYSAUX         ***     /u01/oradata/CDB1/valid/sysaux01.dbf
13   260      RCAT:SYSTEM          ***     /u01/oradata/CDB1/rcat/system01.dbf
14   610      RCAT:SYSAUX          ***     /u01/oradata/CDB1/rcat/sysaux01.dbf
15   100      RCAT:RMAN_CATALOG    ***     /u01/oradata/CDB1/rcat/rman_catalog01.dbf
16   50       VALID:USERS          ***     /u01/oradata/CDB1/valid/users01.dbf
17   250      DGTEST:SYSTEM        ***     /u01/oradata/CDB1/dgtest/system01.dbf
18   590      DGTEST:SYSAUX        ***     /u01/oradata/CDB1/dgtest/sysaux01.dbf

List of Temporary Files
=======================
File Size(MB) Tablespace           Maxsize(MB) Tempfile Name
---- -------- -------------------- ----------- --------------------
1    60       TEMP                 32767       /u01/oradata/CDB1/temp01.dbf
2    20       PDB$SEED:TEMP        32767       /u01/oradata/CDB1/pdbseed/pdbseed_temp01.dbf
4    20       VALID:TEMP           32767       /u01/oradata/CDB1/valid/pdbseed_temp01.dbf
5    20       RCAT:TEMP            32767       /u01/oradata/CDB1/rcat/pdbseed_temp01.dbf
6    20       DGTEST:TEMP          32767       /u01/oradata/CDB1/dgtest/dgtest_temp01.dbf

All right, after creating the directories I retried the duplication and it worked:

...
datafile 4 switched to datafile copy
input datafile copy RECID=5 STAMP=843771196 file name=/u01/oradata/STDBY1/undotbs01.dbf

datafile 5 switched to datafile copy
input datafile copy RECID=6 STAMP=843771198 file name=/u01/oradata/STDBY1/pdbseed/system01.dbf
datafile 6 switched to datafile copy
input datafile copy RECID=7 STAMP=843771198 file name=/u01/oradata/STDBY1/users01.dbf
datafile 7 switched to datafile copy
input datafile copy RECID=8 STAMP=843771199 file name=/u01/oradata/STDBY1/pdbseed/sysaux01.dbf
datafile 11 switched to datafile copy
input datafile copy RECID=9 STAMP=843771199 file name=/u01/oradata/STDBY1/valid/system01.dbf
datafile 12 switched to datafile copy
input datafile copy RECID=10 STAMP=843771199 file name=/u01/oradata/STDBY1/valid/sysaux01.dbf
datafile 13 switched to datafile copy
input datafile copy RECID=11 STAMP=843771199 file name=/u01/oradata/STDBY1/rcat/system01.dbf
datafile 14 switched to datafile copy
input datafile copy RECID=12 STAMP=843771200 file name=/u01/oradata/STDBY1/rcat/sysaux01.dbf
datafile 15 switched to datafile copy
input datafile copy RECID=13 STAMP=843771200 file name=/u01/oradata/STDBY1/rcat/rman_catalog01.dbf
datafile 16 switched to datafile copy
input datafile copy RECID=14 STAMP=843771200 file name=/u01/oradata/STDBY1/valid/users01.dbf
datafile 17 switched to datafile copy
input datafile copy RECID=15 STAMP=843771200 file name=/u01/oradata/STDBY1/dgtest/system01.dbf
datafile 18 switched to datafile copy
input datafile copy RECID=16 STAMP=843771200 file name=/u01/oradata/STDBY1/dgtest/sysaux01.dbf
Finished Duplicate Db at 19.03.2014 10:28:42

Summary

Although not a big problem and probably known by everyone besides me the error message on the standby shouldn’t be of the scary type ‘can’t identify datafile “”‘ followed by the process aborting. Instead something more meaningful would have been nice.