Category Archives: Oracle

Oracle (Database and Middleware) related posts

Little things worth knowing: parallel Data Pump export in table mode

I haven’t used Data Pump in a little while, but recently needed to do a bit of work involving this very useful utility to export a single table. I know that it is possible to export data in parallel using expdp, but I can never recall the syntax off the top of my head when I need it. This post describes a potential approach to exporting a table in parallel. In the next post I will demonstrate an interesting case where using parallelism didn’t help me speed up the export. All of this was tested on 12.2 and 18.4.0; the examples I am sharing originate from my 18.4.0 single instance database (without ASM) running on Linux.

The setup

My lab environment is a bit limited when it comes to storage, so I’ll have to make do with small-ish tables. The basic principles should still apply to larger segments though. Please note that my tables aren’t partitioned, to keep the individual segment size as large as possible.

My data is once more based on Swingbench’s order entry schema. The table I’ll use in my first example is the well-known ORDER_ITEMS table:

$ sql soe

SQLcl: Release 17.3.0 Production on Wed Dec 12 18:12:00 2018

Copyright (c) 1982, 2018, Oracle. All rights reserved.

Password? (*******?) *********
Last Successful login time: Wed Dec 12 2018 18:12:01 +01:00

Connected to:
Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
Version 18.4.0.0.0

SQL> info soe.order_items
TABLE: ORDER_ITEMS
LAST ANALYZED:2018-12-11 22:19:40.0
ROWS :68442528
SAMPLE SIZE :68442528
INMEMORY :DISABLED
COMMENTS :

Columns
NAME DATA TYPE NULL DEFAULT COMMENTS
*LINE_ITEM_ID NUMBER(3,0) Yes
*ORDER_ID NUMBER(12,0) Yes
PRODUCT_ID NUMBER(6,0) Yes
UNIT_PRICE NUMBER(8,2) Yes
QUANTITY NUMBER(8,0) Yes
DISPATCH_DATE DATE Yes
RETURN_DATE DATE Yes
GIFT_WRAP VARCHAR2(20 BYTE) Yes
CONDITION VARCHAR2(20 BYTE) Yes
SUPPLIER_ID NUMBER(6,0) Yes
ESTIMATED_DELIVERY DATE Yes

Indexes
INDEX_NAME UNIQUENESS STATUS FUNCIDX_STATUS COLUMNS
SOE.ITEM_ORDER_IX NONUNIQUE VALID ORDER_ID
SOE.ORDER_ITEMS_PK UNIQUE VALID ORDER_ID, LINE_ITEM_ID
SOE.ITEM_PRODUCT_IX NONUNIQUE VALID PRODUCT_ID

Running an export without any parallelism

I’ll use this example as the baseline: exporting the table without parallelism. The example is quite basic, and you will probably have to adjust it to suit your needs. I am assuming basic familiarity with the Data Pump toolset; if not, head over to the documentation and read up on the concepts. The non-parallel expdp call is shown here:

$ expdp martin directory=exptest logfile=exp_order_items_noparallel.log \
> dumpfile=exp_order_items_noparallel.dmp tables=soe.order_items

[...]

Export: Release 18.0.0.0.0 - Production on Wed Dec 12 19:00:38 2018
Version 18.4.0.0.0

Copyright (c) 1982, 2018, Oracle and/or its affiliates. All rights reserved.

Connected to: Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
Starting "MARTIN"."SYS_EXPORT_TABLE_01": martin
directory=exptest logfile=exp_order_items_noparallel.log
dumpfile=exp_order_items_noparallel.dmp tables=soe.order_items
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Processing object type TABLE_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
Processing object type TABLE_EXPORT/TABLE/STATISTICS/MARKER
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/INDEX/INDEX
Processing object type TABLE_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
. . exported "SOE"."ORDER_ITEMS" 3.601 GB 68442528 rows
Master table "MARTIN"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
***************************************************************************
Dump file set for MARTIN.SYS_EXPORT_TABLE_01 is:
/u01/data_pump/ORCL/exp_order_items_noparallel.dmp
Job "MARTIN"."SYS_EXPORT_TABLE_01" successfully completed at Wed Dec 12 19:01:22 2018 elapsed 0 00:00:43

So it takes about 43 seconds for expdp to create the file.

Adding parallelism

The next attempt I made was to introduce parallelism. For expdp to actually use parallelism in the first place, you need to have multiple dump files to write to. This is accomplished by using the %U placeholder in the file name, as shown here:

$ expdp martin directory=exptest logfile=exp_order_items.log \
> dumpfile=exp_order_items_%U.dmp parallel=4 tables=soe.order_items

Export: Release 18.0.0.0.0 - Production on Wed Dec 12 19:56:42 2018
Version 18.4.0.0.0

Copyright (c) 1982, 2018, Oracle and/or its affiliates. All rights reserved.

Connected to: Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
Starting "MARTIN"."SYS_EXPORT_TABLE_01": martin directory=exptest
logfile=exp_order_items.log dumpfile=exp_order_items_%U.dmp parallel=4
tables=soe.order_items
Processing object type TABLE_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Processing object type TABLE_EXPORT/TABLE/INDEX/INDEX
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
Processing object type TABLE_EXPORT/TABLE/STATISTICS/MARKER
. . exported "SOE"."ORDER_ITEMS" 3.601 GB 68442528 rows
Master table "MARTIN"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded

Dump file set for MARTIN.SYS_EXPORT_TABLE_01 is:
/u01/data_pump/ORCL/exp_order_items_01.dmp
/u01/data_pump/ORCL/exp_order_items_02.dmp
/u01/data_pump/ORCL/exp_order_items_03.dmp
/u01/data_pump/ORCL/exp_order_items_04.dmp
/u01/data_pump/ORCL/exp_order_items_05.dmp
Job "MARTIN"."SYS_EXPORT_TABLE_01" successfully completed at Wed Dec 12 19:57:02 2018 elapsed 0 00:00:20

So this isn’t too bad: using parallel 4 I managed to cut the time it takes to export the table in half. If you read the Data Pump white paper you can see that the worker processes don’t just unload data; they also create metadata for the object currently being exported. I suppose this is the reason why there are 5 dump files. You can see they aren’t all the same size:

$ ls -lh /u01/data_pump/ORCL/exp_order_items_0[12345].dmp
-rw-r----- 1 oracle oinstall 1.2G Dec 12 19:57 /u01/data_pump/ORCL/exp_order_items_01.dmp
-rw-r----- 1 oracle oinstall 16K Dec 12 19:56 /u01/data_pump/ORCL/exp_order_items_02.dmp
-rw-r----- 1 oracle oinstall 192K Dec 12 19:57 /u01/data_pump/ORCL/exp_order_items_03.dmp
-rw-r----- 1 oracle oinstall 1.3G Dec 12 19:57 /u01/data_pump/ORCL/exp_order_items_04.dmp
-rw-r----- 1 oracle oinstall 1.3G Dec 12 19:57 /u01/data_pump/ORCL/exp_order_items_05.dmp

What I really like about the Data Pump API is the ability to attach to a job and see what it’s doing. While the export was running, I had a quick look at it:

$ expdp martin attach=SYS_EXPORT_TABLE_01

Export: Release 18.0.0.0.0 - Production on Wed Dec 12 19:56:54 2018
Version 18.4.0.0.0

Copyright (c) 1982, 2018, Oracle and/or its affiliates. All rights reserved.
[...]
Connected to: Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production

Job: SYS_EXPORT_TABLE_01
Owner: MARTIN
Operation: EXPORT
Creator Privs: TRUE
GUID: 7CD567F0F8D057DDE0530164A8C0392F
Start Time: Wednesday, 12 December, 2018 19:56:43
Mode: TABLE
Instance: ORCL
Max Parallelism: 4
Timezone: +02:00
Timezone version: 31
Endianness: LITTLE
NLS character set: AL32UTF8
NLS NCHAR character set: AL16UTF16
EXPORT Job Parameters:
Parameter Name Parameter Value:
CLIENT_COMMAND martin directory=exptest
logfile=exp_order_items.log dumpfile=exp_order_items_%U.dmp
parallel=4 tables=soe.order_items
TRACE 0
State: EXECUTING
Bytes Processed: 0
Current Parallelism: 4
Job Error Count: 0
Job heartbeat: 5
Dump File: /u01/data_pump/ORCL/exp_order_items_%u.dmp
Dump File: /u01/data_pump/ORCL/exp_order_items_01.dmp
bytes written: 8,192
Dump File: /u01/data_pump/ORCL/exp_order_items_02.dmp
bytes written: 16,384
Dump File: /u01/data_pump/ORCL/exp_order_items_03.dmp
bytes written: 16,384
Dump File: /u01/data_pump/ORCL/exp_order_items_04.dmp
bytes written: 4,096
Dump File: /u01/data_pump/ORCL/exp_order_items_05.dmp
bytes written: 4,096

Worker 1 Status:
Instance ID: 1
Instance name: ORCL
Host name: server1
Object start time: Wednesday, 12 December, 2018 19:56:46
Object status at: Wednesday, 12 December, 2018 19:56:47
Process Name: DW00
State: WORK WAITING

Worker 2 Status:
Instance ID: 1
Instance name: ORCL
Host name: server1
Object start time: Wednesday, 00 Sat, 0000 0:00:00
Object status at: Wednesday, 12 December, 2018 15
Process Name: DW01
State: WORK WAITING

Worker 3 Status:
Instance ID: 1
Instance name: ORCL
Host name: server1
Object start time: Wednesday, 12 December, 2018 19:56:47
Object status at: Wednesday, 12 December, 2018 19:56:48
Process Name: DW02
State: EXECUTING
Object Schema: SOE
Object Name: ORDER_ITEMS
Object Type: TABLE_EXPORT/TABLE/TABLE_DATA
Completed Objects: 1
Total Objects: 1
Completed Rows: 33,417,535
Worker Parallelism: 3

Worker 4 Status:
Instance ID: 1
Instance name: ORCL
Host name: server1
Access method: direct_path
Object start time: Wednesday, 12 December, 2018 19:56:45
Object status at: Wednesday, 12 December, 2018 19:56:49
Process Name: DW03
State: WORK WAITING

The output is quite interesting. It confirms that I’m running with parallel 4, and it also shows that only one file is really being written to, by worker 3, which does so in parallel (note its worker parallelism of 3). If you recall the ls command from earlier, there were 3 files of roughly 1.3 GB each. They must have been written by the third worker process.
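Attaching with expdp isn’t the only way to check on a running job. As a sketch, the same information is available from the dictionary in a separate SQL session (column formatting left out for brevity):

```sql
-- Data Pump jobs currently known to the database, with state and degree
SELECT owner_name, job_name, operation, job_mode, state, degree
  FROM dba_datapump_jobs;

-- Sessions attached to those jobs, including the worker processes
SELECT s.job_name, s.session_type, v.sid, v.serial#
  FROM dba_datapump_sessions s
  JOIN v$session v ON v.saddr = s.saddr;
```

The second query is handy if you want to trace or monitor an individual worker process.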

Summary

I have been able to export a table in parallel using Data Pump, and by doing so I cut the time it took to export the table in half. This is a great feature which I am certainly going to use regularly. 


Enhanced “validate” commands in Oracle’s Data Guard Broker 18c

If you are using an Oracle Database Enterprise Edition chances are that there is at least one environment in your estate making use of Data Guard. And if you are using Data Guard, why not use the broker? I have been using Data Guard broker for a long time now, and it has definitely improved a lot over the first releases, back in the day. I like it so much these days that I feel hard done by if I can’t make use of it. This is of course a matter of personal preference, and I might be exaggerating a little :)

One of the nice additions to the broker in Oracle 12.1 was the ability to validate a database before a role change. This is documented in the Data Guard broker documentation. I certainly don’t rely solely on the output of the command; I have my own checks that go over and above what a validate can do.

During research I ran the command on my 12.2 system and noticed much-enhanced output. This is super exciting and worth blogging about. Well, to me at least; I hope you find this post useful. I love discovering little changes like this: they aren’t always advertised on slide 1 of the “What’s new in release X” presentations, but they are nevertheless great boosts to productivity.

This post was nearly complete, but then rested in my drafts folder for just a little too long and 18c has been released. Time for an update!

The environment

Before moving on, here’s the stack in case you find this via a search engine:

  • Oracle Linux 7.4 powering 2 VMs: server1 and server2
  • Oracle 18.3.0, single instance, no Oracle Restart
  • Data Guard Broker configuration managing 2 databases: NCDBA and NCDBB

The broker is quite happy with my setup, at least for now.

DGMGRL> show configuration

Configuration - test

  Protection Mode: MaxAvailability
  Members:
  NCDBA - Primary database
    NCDBB - Physical standby database 

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS   (status updated 51 seconds ago)

DGMGRL> 

This is my setup, YMMV as always. Refer to your standard documents or other relevant documentation for more details about your configuration.

New things to validate in 18c

With 12.2 it was possible to validate a datafile in addition to validating the database. Oracle 18c enhances the validate command quite a bit more:

DGMGRL> help validate

Performs an exhaustive set of validations for a member

Syntax:

  VALIDATE DATABASE [VERBOSE] <database name>;

  VALIDATE DATABASE [VERBOSE] <database name> DATAFILE <datafile number>
    OUTPUT=<file name>;

  VALIDATE DATABASE [VERBOSE] <database name> SPFILE;

  VALIDATE FAR_SYNC [VERBOSE] <far sync instance name>
    [WHEN PRIMARY IS <database name>];

  VALIDATE NETWORK CONFIGURATION FOR { ALL | <member name> };

  VALIDATE STATIC CONNECT IDENTIFIER FOR { ALL | <database name> };

DGMGRL> 

In this post I am going to focus on the verbose output generated by validate database, if I can find the time I’ll write about the other new options as well.

Validate database in Oracle 18c

Let’s have a look at the output of validate database verbose … Looking at the configuration status, NCDBB is currently running in recovery mode with NCDBA acting as the primary database. It makes sense to start with the validation of NCDBB first.

The output of the command is rather comprehensive, as you will see. I have decided to annotate it so you don’t have to scroll up and down as much.

DGMGRL> validate database verbose 'NCDBB'

  Database Role:     Physical standby database
  Primary Database:  NCDBA

  Ready for Switchover:  Yes
  Ready for Failover:    Yes (Primary Running)

  Flashback Database Status:
    NCDBA :  Off
    NCDBB :  Off

  Capacity Information:
    Database  Instances        Threads
    NCDBA      1               1
    NCDBB      1               1

  Managed by Clusterware:
    NCDBA :  NO
    NCDBB:  NO
    Validating static connect identifier for database NCDBA...
    The static connect identifier allows for a connection to database "NCDBA".

The first part of the output is related to the database’s role and status. Oracle reckons my standby database is ready for a role change (which I’ll double-check using my own tools and scripts). You can see that flashback database is not enabled (for reasons that don’t matter for this post).

Since both members are single instance databases it makes sense for them to have a single redo thread.

Another important piece of information can be found in the “managed by Clusterware” section. In releases prior to 12.1.0.2 you always had to statically register your databases with the listener for use with the broker. This has changed in 12.1.0.2: another one of these productivity boosters :) In modern releases you don’t need to statically register your databases with the listener provided Clusterware manages them. See MOS DocID 1387859.1 for all the details.
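In setups like mine, where Clusterware does not manage the databases, the static registration is still required. A minimal listener.ora sketch for the standby host might look like this; the ORACLE_HOME path is an assumption and not taken from this environment:

```
SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = NCDBB_DGMGRL)
      (ORACLE_HOME = /u01/app/oracle/product/18.0.0/dbhome_1)
      (SID_NAME = NCDBB)
    )
  )
```

The GLOBAL_DBNAME of db_unique_name_DGMGRL matches what the broker expects by default when it connects during a role change.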

You read in the introduction that I’m using single instance Oracle databases without any Grid Infrastructure at all, so this shiny new feature does not apply. There’s something else that might help though: Oracle validates the static connect identifier for you. It doesn’t simply ping the TNS alias; the broker actually establishes a connection to the database. I noticed this in the listener.log: a program named dgmgrl connects to the database using the static connection identifier (as per “show database memberName StaticConnectIdentifier”). Output is formatted for readability:

2018-08-14 10:54:16.377000 +01:00
14-AUG-2018 10:54:16 * (CONNECT_DATA=(SERVICE_NAME=NCDBB_DGMGRL)(INSTANCE_NAME=NCDBB)(SERVER=DEDICATED)
(STATIC_SERVICE=TRUE)(CID=(PROGRAM=dgmgrl)(HOST=server1)(USER=oracle))) * (ADDRESS=(PROTOCOL=tcp)
(HOST=192.168.100.21)(PORT=27049)) * establish * NCDBB_DGMGRL * 0

Let’s continue with the output of the validate database command:

  Temporary Tablespace File Information:
    NCDBA TEMP Files:   1
    NCDBB TEMP Files:   1

  Data file Online Move in Progress:
    NCDBA:  No
    NCDBB:  No

This little section compares the number of temp files and warns you of any online data file move operations.

  Standby Apply-Related Information:
    Apply State:      Running
    Apply Lag:        0 seconds (computed 0 seconds ago)
    Apply Delay:      0 minutes

  Transport-Related Information:
    Transport On:      Yes
    Gap Status:        No Gap
    Transport Lag:     0 seconds (computed 0 seconds ago)
    Transport Status:  Success


  Log Files Cleared:
    NCDBA Standby Redo Log Files:  Cleared
    NCDBB Online Redo Log Files:   Not Cleared
    NCDBB Standby Redo Log Files:  Available

  Current Log File Groups Configuration:
    Thread #  Online Redo Log Groups  Standby Redo Log Groups Status
              (NCDBA)                 (NCDBB)
    1         2                       3                       Sufficient SRLs

  Future Log File Groups Configuration:
    Thread #  Online Redo Log Groups  Standby Redo Log Groups Status
              (NCDBB)                 (NCDBA)
    1         2                       3                       Sufficient SRLs

  Current Configuration Log File Sizes:
    Thread #   Smallest Online Redo      Smallest Standby Redo
               Log File Size             Log File Size
               (NCDBA)                    (NCDBB)
    1          200 MBytes                200 MBytes

  Future Configuration Log File Sizes:
    Thread #   Smallest Online Redo      Smallest Standby Redo
               Log File Size             Log File Size
               (NCDBB)                   (NCDBA)
    1          200 MBytes                200 MBytes

This section is quite interesting as well as it allows you to have a gander at the transport lag and apply lag respectively. I can also see that I have standby redo logs both for my primary as well as the standby database. Sometimes the broker doesn’t get the number of standby redo logs right in “future log file groups configuration”. I found this to rectify itself after a switchover and switch-back.
Another reassuring fact is presented in the current and future configuration log file sizes: I followed the documentation carefully and created all my (online and standby) redo logs the exact same size.
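If you want to verify the log file sizes yourself rather than rely on the broker output, a quick sketch against the relevant v$ views (run on each member) could look like this:

```sql
-- Online (ORL) and standby (SRL) redo logs should all be the same size
SELECT 'ORL' AS logtype, thread#, group#, bytes/1024/1024 AS size_mb
  FROM v$log
 UNION ALL
SELECT 'SRL', thread#, group#, bytes/1024/1024
  FROM v$standby_log
 ORDER BY logtype, thread#, group#;
```

Any mismatch in size_mb between the two sets is worth fixing before a role change.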

  Apply-Related Property Settings:
    Property                        NCDBA Value              NCDBB Value
    DelayMins                       0                        0
    ApplyParallel                   AUTO                     AUTO
    ApplyInstances                  0                        0

  Transport-Related Property Settings:
    Property                        NCDBA Value              NCDBB Value
    LogXptMode                      sync                     sync
    Dependency                                        
    DelayMins                       0                        0
    Binding                         optional                 optional
    MaxFailure                      0                        0
    MaxConnections                  1                        1
    ReopenSecs                      300                      300
    NetTimeout                      30                       30
    RedoCompression                 DISABLE                  DISABLE
    LogShipping                     ON                       ON

The section just above compares apply and transport related settings between the primary and standby databases. These are stored in Data Guard broker properties eventually mapping back to database initialisation parameters.

  Automatic Diagnostic Repository Errors:
    Error                       NCDBA    NCDBB
    No logging operation        NO       NO
    Control file corruptions    NO       NO
    SRL Group Unavailable       NO       NO
    System data file missing    NO       NO
    System data file corrupted  NO       NO
    System data file offline    NO       NO
    User data file missing      NO       NO
    User data file corrupted    NO       NO
    User data file offline      NO       NO
    Block Corruptions found     NO       NO

DGMGRL> 

And finally, it appears as if the Automatic Diagnostic Repository (ADR) didn’t record any issues. This doesn’t mean there aren’t any; something may still be looming in your configuration that simply hasn’t been detected yet.

Summary

The validate database command is really quite useful in my opinion, performing basic checks before role changes. It doesn’t take the responsibility off you to ensure the role change will be a success, but I found it to be a useful first indicator.

Using colplot to visualise performance data

Back in 2011 I wrote a blog post about colplot, but at that time I focused on running the plot engine backed by a web server. However some people might not want to take this approach, and thinking about security it might not be the best idea in the world anyway. A port that isn’t opened can’t be scanned for vulnerabilities…

So what is colplot anyway? And why this follow-up to a 7 year old post?

Some background

Back in the day I learned about collectl: a small, light-weight performance monitoring tool that’s easily portable to many platforms. Collectl is very versatile and has capabilities to record a lot of detail, especially on Linux. Collectl comes with a companion, colplot, which I introduced in said post in 2011.

A typical use case – at least for me – is to start a collectl recording, do some work, stop the recording, and plot the results. This can be handy if you don’t get permission to install TFA (think “OSWatcher”) at short notice, and still need performance data!

In this blog post I’d like to share how you could go about doing this. I am using Debian as my workstation O/S and Oracle Linux to capture data.

The workload

I wanted a really simple workload, so for lack of ingenuity I came up with the idea of running fio to drive storage and a tiny bit of CPU.

I am running a very basic random read I/O test and record it with collectl. It is important to understand that it’s your responsibility to define what you want to capture with regards to the plots you want to generate. I’ll explain this in a bit, but for now it’ll have to suffice that I need detail data for CPU and disk.
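As a rough sketch, a fio job description for such a test might look like the following; the job name, file location, size and runtime are illustrative assumptions, not the exact parameters I used:

```
; random-read.fio - basic random read test (hypothetical parameters)
[randread]
rw=randread
bs=8k
size=2g
runtime=300
time_based
ioengine=libaio
direct=1
directory=/u01/fio
```

You would kick it off with “fio random-read.fio” while collectl records in the background.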

Before kicking off the load generator I start collectl to record data in (P)lot mode, saving data to my data subdirectory:

[oracle@server1 ~]$ collectl -sCD -f /home/oracle/data -P

(Another, probably cleverer option is to record the data without the -P switch, then replay the recording and write the output to a file in plot format. That way you have a lot more control over the process and, as an added advantage, keep the raw data.)
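For reference, that raw-recording variant could look roughly like this; the raw file name is an example, collectl derives it from the host name and date:

```
# record raw data; without -P collectl writes e.g. server1-20181030.raw.gz
$ collectl -sCD -f /home/oracle/data

# later: replay the raw file and emit plot-format files for colplot
$ collectl -sCD -P -p /home/oracle/data/server1-20181030.raw.gz -f /home/oracle/plotdata
```

The -p switch puts collectl into playback mode, so the same subsystem switches apply as during recording.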

Once the workload has completed, I can transfer the data captured to my workstation for analysis with colplot. For each detail switch you select (C and D in the example) you’ll get a (compressed) file:

$ ls -l data
total 84
-rw-r--r-- 1 oracle oinstall 21790 Oct 30 08:26 server1-20181030.cpu
-rw-r--r-- 1 oracle oinstall 59357 Oct 30 08:26 server1-20181030.dsk

Plotting details

I imagine most people use the web frontend to colplot, but that doesn’t mean there aren’t other ways of generating visualisations of your performance data. But first of all, what plots can you create? That depends on the version of colplot in use; for me it’s this:

$ colplot --version
colplot: V5.1.0, gnuplot: V:5.0[png,x11]

Copyright 2004-2014 Hewlett-Packard Development Company, L.P.
colplot may be copied only under the terms of either the Artistic License
or the GNU General Public License, which may be found in the source kit

Getting back to the list of plots supported in my version, it’s super impressive!

$ colplot -showplots
cpu         cpu     s  Cpu Utilization. Other
cpu20       cpu     s  Cpu Utilization, 0-20%
cpu80       cpu     s  Cpu Utilization, 80-100%
cpudold     cpu     s  Old format ony reports individual fields
cpumid      cpu     s  Cpu Utilization, 40-60%
cpumore     cpu     s  Additional types of use
cpuold      cpu     s  Old format ony reports individual fields
loadavg     cpu     s  Load Averages for 1,5,15 min
disk        disk    s  Disk Summary
diskio      disk    s  Disk I/O
disksize    disk    s  Bandwidth and transfer sizes
elan        elan    s  Quadrics ELAN Bandwidth
elanio      elan    s  Quadrics ELAN Packet Rates
ib          ib      s  Infiniband Bandwidth
ibio        ib      s  Infiniband Packet Rates
ibsize      ib      s  Bandwidth and transfer sizes
nvidia      import  s  nvidia GPU stats
inode       inode   s  Inode Summary
luscltio    lustre  s  Lustre Client Summary, I/O only
cltmeta     lustre  s  Lustre Client Meta Summary
cltreada    lustre  s  Lustre Client Read-Ahead Summary
lusmds      lustre  s  lustre Lustre MDS Summary
lusoss      lustre  s  Lustre OSS Data Rates
ossio       lustre  s  Lustre OSS I/Os
faults      mem     s  Page Faults
mem         mem     s  Memory
memanon     mem     s  Anon Memory
membuf      mem     s  Buffered Memory
memcache    mem     s  Cached Memory
memdirty    mem     s  Dirty Memory
memmap      mem     s  Mapped Memory
memslab     mem     s  Slab Memory
paging      mem     s  Paging
swap        mem     s  Swap Utilization
misc1       misc    s  Miscellaneous ctrs from '--import misc'
misc2       misc    s  CPU Frequency from '--import misc'
net         net     s  Network Summary
netpkt      net     s  Network packets
netsize     net     s  Bandwidth and transfer sizes
nfsV2c      nfs     s  older NFS V2 Client Summary
nfsV2s      nfs     s  older NFS V2 Server Summary
nfsV3c      nfs     s  older NFS V3 Client Summary
nfsV3s      nfs     s  older NFS V3 Server Summary
nfsV4c      nfs     s  older NFS V4 Client Summary
nfsV4s      nfs     s  older NFS V4 Server Summary
nfsmeta     nfs     s   NFS Metadata and Commits
nfsrpc      nfs     s  NFS RPC Summary
nfssum      nfs     s   NFS Aggregate Summary Data
ctxint      proc    s  Context and Interruputs
proc        proc    s  Processes
sock        sock    s  Socket Usage
accaudt     swift   s  Account Auditor
accreap     swift   s  Account Reaper
accrepl     swift   s  Account Replicator
accsrvr     swift   s  Account Server
conaudt     swift   s  Container Auditor
conrepl     swift   s  Container Replicator
consrvr     swift   s  Container Server
consync     swift   s  Container Sync
conupdt     swift   s  Container Updater
objaudt     swift   s  Object Auditor
objexpr     swift   s  Object Expirer
objrepl     swift   s  Object Replicator
objsrv2     swift   s  Object Server2
objsrvr     swift   s  Object Server
objupdt     swift   s  Object Updater
prxyacc     swift   s  Proxy Account
prxycon     swift   s  Proxy Container
prxyobj     swift   s  Proxy Object
tcp         tcp     s  TCP errors count summary
tcpold      tcp     s  old TCP acks & packet failures
cpudet      cpu     d  Cpu Details, Other
cpuint      cpu     d  Interrupts by Cpu
cpumored    cpu     d  Additional types of use
diskdet     disk    d  Disk Details
diskdsize   disk    d  Disk I/O Size Details
diskque     disk    d  Disk request queue depth
disktimes   disk    d  Disk wait/service times
diskutil    disk    d  Disk utilization
fans        env     d  Fan Speeds
power       env     d  Power Sensor
temps       env     d  Temperature Sensors
ibdet       inter   d  IB interconnect detail data
ibdsize     inter   d  IB packet size detail
elandio     inter   d  Elan interconnect IOs (get/put/comp)
elandmb     inter   d  Elan interconnect MBs (get/put/comp)
cltdet      lustre  d  Lustre Client FS I/O Detail
cltdetL     lustre  d  Lustre Client OST I/O Detail
ossdet      lustre  d  Lustre OSS Detail
netdet      net     d  Network Details
netdsize    net     d  Network Packet Size Details
nfsV2cd     nfs     d  NFS Version 2 Client Detail
nfsV2sd     nfs     d  NFS Version 2 Server Detail
nfsV3cd     nfs     d  NFS Version 3 Client Detail
nfsV3sd     nfs     d  NFS Version 3 Server Detail
nfsV4cd     nfs     d  NFS Version 4 Client Detail
nfsV4sd     nfs     d  NFS Version 4 Server Detail
cltbrw      macro      Lustre Client BRW stats
cltbrwD     macro      Lustre Client BRW detail stats
detall      macro      All detail plots except nfs and lustre
detlus      macro      Lustre detail plots (there can be a LOT of these!)
detnfs      macro      NFS detail plots, colplot only
inter       macro      Interconnect summary plots
interdet    macro      Interconnect detail plots
lusblkDR    macro      Lustre Block I/O read detail stats (there can be LOTS of these!)
lusblkDW    macro      Lustre Block I/O write detail stats (there can be LOTS of these!)
lusblkR     macro      Lustre Block I/O read summary stats
lusblkW     macro      Lustre Block I/O write summary stats
misc        macro      All misc counters from '--import misc'
ossbrw      macro      Lustre OSS BRW stats
ossbrwD     macro      Lustre OSS BRW detail stats
sumall      macro      All summary plots, except nfs client/server stats
sumlus      macro      Lustre summary plots for clients, mds and oss
summin      macro      Minimal set of summary plots (cpu, disk, mem and disk
sumnfs      macro      NFS summary plots, colplot only

I don’t think I promised too much!

One thing to keep in mind though: you cannot plot charts for which you don’t have data. For example, when I wanted to plot CPU summary data, colplot told me that it couldn’t:

$ colplot -dir /home/vagrant/data -plots cpu
No plottable files match your selection criteria.
Are your dir, selection dates and/or file protections right?

I am certain I have set the permissions correctly, but I also know that I didn’t capture CPU summary information (that would be lower-case “c”, as opposed to upper-case “C” for the detailed recording). I suggest you run a few tests until you are comfortable with collectl’s command line switches, to avoid disappointment when trying to plot later :)
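In other words, if you want to be able to plot both the summary charts (such as cpu) and the detail charts (such as cpudet) later, record both subsystem variants up front. A sketch:

```
# lower-case letters record summaries (c = cpu, d = disk),
# upper-case letters record detail (C = cpu detail, D = disk detail)
$ collectl -scdCD -f /home/oracle/data -P
```

Recording the extra summary subsystems costs very little disk space compared to the detail data.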

With the collected performance data transferred to ~/data, I can now plot some CPU and disk details:

$ colplot -dir /home/vagrant/data -plots cpudet,diskdet \
> -filedir /home/vagrant/data --time 08:15:00-08:25:00
Your Plot(s) have been written to /home/vagrant/data/5376-colplot.pdf

The resulting file created by this particular command is a PDF. I like this format simply because it’s easy to store it for later reference. I also wanted to limit the plots to a specific time, otherwise my little 5m test would have been hardly noticeable.

This is what it looks like. Please don’t try to read anything into the charts; they are included for illustration purposes only, taken from a lab VM without any resemblance to a real-world system.

colplot example

It’s also possible to plot interactively by omitting the -filedir switch in colplot. Output is generated in your X-session and you can export it in different formats.

There is of course more to colplot than I could show in a single post, but I hope I have managed to give you a first impression.

Happy benchmarking!

Creating a RAC 12.1 Data Guard Physical Standby environment (3b)

Huh, what is this I hear you ask? Part 3b? Oracle 12.1? Well, there’s a bit of a story to this post. Back in December 2016 I started to write a series of blog posts (part 1 | part 2 | part 3 | part 4) about how I created a standby database on RAC 12.1. For some reason I forgot to post this part. Up until now the step where I am creating the broker configuration was missing. Thanks to a friend I discovered my mistake. I decided to post this article to complete the series. There’s a pretty big logical gap that needed filling :)

I also deliberately decided against making changes to my notes, so this is written in the same style as the articles preceding this one. Also bear in mind that this is Oracle 12.1!

NOTE: As always, this is just a demonstration using VMs in my lab, based on my notes. Your system is most likely different, so in real-life you might take a different approach. The techniques I am using here were suitable for me, and my own small-scale testing. I tried to make sure they are valid, but you may want to allocate more resources in your environment. Test, test, test on your own environment on test kit first!

Here is the original blog post

In a last-minute decision I decided to split part 3 in half: first the database duplication, followed by the Data Guard configuration. It really got a bit too long (even by my standards), and would have put readers off the article. I appreciate your patience though…

What’s the situation?

At the end of part 3 I had a physical standby database ready for use, including its registration in the OCR. It was looking promising, but there was more work to do.

At the moment, the standby database is out of sync with the primary, a situation that needs to be rectified by creating a Data Guard configuration. I prefer to do this on the command line. If you are so inclined you can of course push the mouse around and do it with OEM.

Step 1: Data Guard Broker pre-requisites

Before creating the configuration you should change the location of the broker configuration files to shared storage. This needs to be done on each side – primary and every standby – before you can start the broker.

SQL> alter system set dg_broker_config_file1='+DATA/NCDBA/dr1NCDBA.dat' scope=both sid='*';

System altered.

SQL> alter system set dg_broker_config_file2='+DATA/NCDBA/dr2NCDBA.dat' scope=both sid='*';

System altered.

SQL> show parameter dg_broker

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
dg_broker_config_file1               string      +DATA/NCDBA/dr1NCDBA.dat
dg_broker_config_file2               string      +DATA/NCDBA/dr2NCDBA.dat
dg_broker_start                      boolean     FALSE

SQL> alter system set dg_broker_start=true scope=both sid='*';

System altered.

SQL> 

I had to repeat the step on NCDBB, of course with adapted paths and file names.

Step 2: Creating a Data Guard Broker configuration

Once that’s done, you create the configuration using dgmgrl, the Data Guard broker’s command line interface. I named my configuration RACTEST and added both databases. Here is an example of my session:

[oracle@rac12pri1 ~]$ dgmgrl
DGMGRL for Linux: Version 12.1.0.2.0 - 64bit Production

Copyright (c) 2000, 2013, Oracle. All rights reserved.

Welcome to DGMGRL, type "help" for information.
DGMGRL> connect sys@ncdba
Password:
Connected as SYSDBA.
DGMGRL> create configuration ractest as primary database is 'NCDBA' connect identifier is 'NCDBA';
Configuration "ractest" created with primary database "NCDBA"
DGMGRL> add database 'NCDBB' as connect identifier is 'NCDBB';
Database "NCDBB" added
DGMGRL> show configuration

Configuration - ractest

  Protection Mode: MaxPerformance
  Members:
  NCDBA - Primary database
    NCDBB - Physical standby database 

Fast-Start Failover: DISABLED

Configuration Status:
DISABLED

As a first step I connect as SYSDBA before creating the configuration. You can usually type help <command> to remind yourself of the exact syntax. After the configuration is created you can start adding (standby) databases. In my case there is only one (physical standby) database, but there are environments with more than one standby, although that is less common. The connect identifiers you specify must be valid Oracle Net aliases found in the default tnsnames.ora (the one in the RDBMS home).
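For reference, a connect identifier such as NCDBA simply resolves via an entry in that tnsnames.ora. The SCAN host names below are hypothetical and only meant to illustrate the shape of such an entry; use the values for your own clusters:

```
NCDBA =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac12pri-scan)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = NCDBA)
    )
  )

NCDBB =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac12sec-scan)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = NCDBB)
    )
  )
```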

At this point it is important to set standby_file_management to auto. Otherwise managed recovery can abort, leaving you with extra data file management work that could easily have been avoided. The broker syntax to do so involves changing database properties (type show database verbose 'NCDBA' to display all of them). Many Data Guard broker properties are not easily mapped back to their corresponding initialisation parameters. In the case of standby_file_management it is not hard, and the documentation helps with the rest.

Eventually I enable the configuration and check the status. I have found that it can take a couple of minutes at times to get the SUCCESS message.

DGMGRL> edit database 'NCDBA' set property standbyfilemanagement=auto;
Property "standbyfilemanagement" updated
DGMGRL> edit database 'NCDBB' set property standbyfilemanagement=auto;
Property "standbyfilemanagement" updated
DGMGRL> enable configuration
Enabled.
DGMGRL> show configuration

Configuration - ractest

  Protection Mode: MaxPerformance
  Members:
  NCDBA - Primary database
    NCDBB - Physical standby database 

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS   (status updated 21 seconds ago)
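As a quick sanity check you can confirm in SQL*Plus on either database that the broker property change made it through to the underlying initialisation parameter. The output below is what I would expect to see, not a capture from my lab:

```
SQL> show parameter standby_file_management

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------
standby_file_management              string      AUTO
```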

There are other things you must consider in accordance with your company’s (data protection/availability/business continuity/…) standards and other documentation – again my example is rather basic, but deliberately so. It’s my lab playground after all.

For the sake of completeness, here is the current configuration of NCDBA and NCDBB:

DGMGRL> show database 'NCDBA'

Database - NCDBA

  Role:               PRIMARY
  Intended State:     TRANSPORT-ON
  Instance(s):
    NCDBA1
    NCDBA2

Database Status:
SUCCESS

DGMGRL> show database 'NCDBB';

Database - NCDBB

  Role:               PHYSICAL STANDBY
  Intended State:     APPLY-ON
  Transport Lag:      0 seconds (computed 1 second ago)
  Apply Lag:          0 seconds (computed 1 second ago)
  Average Apply Rate: 223.00 KByte/s
  Real Time Query:    OFF
  Instance(s):
    NCDBB1
    NCDBB2 (apply instance)

Database Status:
SUCCESS

DGMGRL> exit

Please note that the broker detected that my databases are cluster databases and lists both instances for each of them. It also shows that NCDBB2 is the apply instance. I was positively surprised by the broker’s ability to detect both instances automatically; I seem to remember cases in earlier releases where said instance detection occasionally required a little nudging to get it right.

This example is deliberately kept as minimalist as possible: it works. Most likely it won’t satisfy your requirements. Review your Data Guard (and any other relevant) standards and change the configuration accordingly.

Summary

… back to present time :) This post completes the series, all my notes are now transcribed to the blog. I hope you find this useful.

back to part 3 | on to part 4

RAC One node databases are relocated by opatchauto in 12.2 part 2

In a previous post I shared how I found out that RAC One Node databases are relocated on-line during patching and I promised a few more tests and sharing of implications. If you aren’t familiar with RAC One Node I recommend having a look at the official documentation: The Real Application Clusters Administration and Deployment Guide features it prominently in the introduction. One of the things I like to keep in mind when working with RAC One Node is this, quoting from section 1.3 in said admin and deployment guide:

Using the Single Client Access Name (SCAN) to connect to the database, clients can locate the service independently of the node on which it is running. Relocating an Oracle RAC One Node instance is therefore mostly transparent to the client, depending on the client connection. Oracle recommends to use either Application Continuity and Oracle Fast Application Notification or Transparent Application Failover to minimize the impact of a relocation on the client.

A little later, you can read this important additional piece of information in the “notes” section:

To prepare for all failure possibilities, you must add at least one Dynamic Database Service (Oracle Clusterware-managed database service) to an Oracle RAC One Node database.

If you created your RAC One database with the database creation assistant (dbca), you are already complying with that rule. In my case, my lab database is named RON with the mandatory service RON_SVC. I opted for an instance prefix of “DCB”.

What I’d like to try out for this blog post is what happens to an active workload on a RAC One database during patching.

I have successfully patched multi-node RAC systems, but that required the application to support this procedure. One of my favourite talks is named “Advanced RAC programming techniques” where I demonstrate the resilience of an application based on RAC to node failures by using Universal Connection Pool (UCP), JDBC and Fast Connection Failover (FCF). UCP is a Fast Application Notification (FAN) aware connection pool allowing my application to react to cluster events such as node up/node down. The idea is to mask instance failure from users.

If all of this sounds super-cryptic, I would like to suggest chapters 5 and 6 of the RAC administration and deployment guide for you. There you can read more about Workload Management with Dynamic Database Services and Application Continuity in RAC.

RAC One Node is different from multi-node RAC in that it is only ever active on one node during normal operations. Online relocation, as described in the previous post, works by temporarily adding a second (destination) instance to the cluster and moving all sessions across before terminating any remaining ones after a time-out (30 minutes by default). Once the time-out is reached (or all sessions have made it across to the destination), the source instance is shut down and you are back to one active instance.

The online relocation does not care too much about whether the application supports the process. If your application is written for RAC, it will most likely migrate quickly from the source to the destination instance. If it isn’t, the hard timeout might kick in and evict some of your users. In that case your users need to re-connect to the database. Worse still, you might have to restart the middle-tier system(s).
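For reference, a manual online relocation uses the same mechanism opatchauto drives under the covers; the -w flag sets the time-out in minutes after which remaining sessions on the source instance are terminated. This is a sketch using my lab’s database and node names:

```
[oracle@rac122sec2 ~]$ srvctl relocate database -d DCB -n rac122sec1 -w 30 -v
Configuration updated to two instances
Instance DCB_2 started
Services relocated
Waiting for up to 30 minutes for instance DCB_1 to stop ...
Instance DCB_1 stopped
Configuration updated to one instance
```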

Sadly I haven’t seen too many applications capable of handling RAC events properly. One application that does is Swingbench, so I’ll stick with it. I configured it according to Dominic Giles’s blog post.

This post assumes that you are properly licensed to use all these features.

The environment

My setup hasn’t changed from previous post so I spare you the boredom of repeating it here. Jump over to the other post for details.

Preparations

For this blog post I need to ensure that my RAC One Node database resides on the node I am about to patch. I have again followed the patch readme very carefully, and I have made sure that I have (proven, working, tested) backups of the entire stack and all the rest of it…

My database is indeed actively running on the node I am about to patch:

[oracle@rac122sec2 ~]$ srvctl status database -db DCB -detail -verbose
Instance DCB_1 is running on node rac122sec2 with online services RON_SVC. Instance status: Open.
Instance DCB_1 is connected to ASM instance +ASM2
Online relocation: INACTIVE

Before I can benefit from Application Continuity, I need to make a few changes to my application service, RON_SVC. There are quite a few sources to choose from, I went with the JDBC Developer’s Guide. Here’s the modification:

[oracle@rac122sec2 ~]$ srvctl modify service -db DCB -service RON_SVC -commit_outcome true \
> -failoverretry 30 -failoverdelay 10 -failovertype transaction \
> -replay_init_time 1800 -retention 86400 -notification true
[oracle@rac122sec2 ~]$

Following the instructions on Dominic Giles’s blog, I also need to grant SOE the right to execute DBMS_APP_CONT.
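The grant itself is a one-liner, executed as a suitably privileged user. DBMS_APP_CONT is the package Application Continuity uses to determine the outcome of in-flight transactions:

```
SQL> grant execute on DBMS_APP_CONT to soe;

Grant succeeded.
```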

Finally, I need to make changes to my Swingbench configuration file. The relevant part is shown here:

<SwingBenchConfiguration xmlns="http://www.dominicgiles.com/swingbench/config">
    <Name>"Order Entry (PLSQL) V2"</Name>
    <Comment>Version 2 of the SOE Benchmark running in the database using PL/SQL</Comment>
    <Connection>
        <UserName>soe</UserName>
        <Password></Password>
        <ConnectString>(DESCRIPTION=(TRANSPORT_CONNECT_TIMEOUT=5)
        (RETRY_COUNT=3)(FAILOVER=ON)
        (ADDRESS=(PROTOCOL=TCP)(HOST=rac122sec-scan)(PORT=1521))
        (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=RON_SVC)))</ConnectString>
        <DriverType>Oracle jdbc Driver</DriverType>
        <ConnectionPooling>
            <PooledInitialLimit>5</PooledInitialLimit>
            <PooledMinLimit>5</PooledMinLimit>
            <PooledMaxLimit>10</PooledMaxLimit>
            <PooledInactivityTimeOut>50</PooledInactivityTimeOut>
            <PooledConnectionWaitTimeout>45</PooledConnectionWaitTimeout>
            <PooledPropertyCheckInterval>10</PooledPropertyCheckInterval>
            <PooledAbandonedConnectionTimeout>120</PooledAbandonedConnectionTimeout>
        </ConnectionPooling>
        <Properties>
            <Property Key="StatementCaching">50</Property>
            <Property Key="FastFailover">true</Property>
            <Property Key="FetchSize">20</Property>
            <Property Key="OnsConfiguration">nodes=rac122sec1:6200,rac122sec2:6200</Property>
            <Property Key="AppContinuityDriver">true</Property>
        </Properties>
    </Connection>
    <Load>
        <NumberOfUsers>5</NumberOfUsers>
        <MinDelay>0</MinDelay>
        <MaxDelay>0</MaxDelay>
        <InterMinDelay>50</InterMinDelay>
        <InterMaxDelay>500</InterMaxDelay>
        <QueryTimeout>120</QueryTimeout>
        <MaxTransactions>-1</MaxTransactions>
        <RunTime>0:0</RunTime>
        <LogonGroupCount>1</LogonGroupCount>
        <LogonDelay>1000</LogonDelay>
        <LogOutPostTransaction>false</LogOutPostTransaction>
        <WaitTillAllLogon>false</WaitTillAllLogon>
        <StatsCollectionStart>0:0</StatsCollectionStart>
        <StatsCollectionEnd>0:0</StatsCollectionEnd>
        <ConnectionRefresh>0</ConnectionRefresh>
        <TransactionList>
...

The connection string is actually on a single line; I have formatted it for readability in the above example. The main change from the standard configuration file is the use of connection pooling and setting the properties required for Application Continuity.
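The remote ONS port referenced in the OnsConfiguration property can be cross-checked against the cluster: srvctl config nodeapps lists the ONS ports among other things. The output below is abbreviated and written from memory rather than captured from my lab, so treat it as a sketch:

```
[oracle@rac122sec1 ~]$ srvctl config nodeapps
...
ONS exists: Local port 6100, remote port 6200, EM port 2016, Uses SSL true
...
```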

Let’s patch!

Once all the preparations are completed, it’s time to see how RAC One Node deals with an active workload undergoing an online relocation during patching. First of all I need to start the workload. I’d normally use charbench for this, but this time around opted for the GUI. It shows performance graphs over a 3 minute rolling window.

A few minutes after starting the benchmark I commenced patching. Soon thereafter, opatchauto told me that it was relocating the database.

OPatchauto session is initiated at Fri Jul 27 14:52:23 2018

...

Relocating RACOne home before patching on home /u01/app/oracle/product/12.2.0.1/dbhome_1

According to the session log, this happened at 14:54. And by the way, always keep the session log, it’s invaluable!

2018-07-27 14:54:07,914 INFO  [163] com.oracle.glcm.patch.auto.db.product.executor.PatchingStepExecutor - 
Executing command as oracle: 
 /bin/sh -c 'ORACLE_HOME=/u01/app/oracle/product/12.2.0.1/dbhome_1 
 /u01/app/oracle/product/12.2.0.1/dbhome_1/bin/srvctl relocate database -d DCB -n rac122sec1 -w 30 -v'
2018-07-27 14:54:07,914 INFO  [163] com.oracle.glcm.patch.auto.db.product.executor.PatchingStepExecutor - 
  COMMAND Looks like this: 
    /bin/sh -c 'ORACLE_HOME=/u01/app/oracle/product/12.2.0.1/dbhome_1 
    /u01/app/oracle/product/12.2.0.1/dbhome_1/bin/srvctl relocate database -d DCB -n rac122sec1 -w 30 -v'
2018-07-27 14:55:31,736 INFO  [163] com.oracle.glcm.patch.auto.db.product.executor.PatchingStepExecutor - 
  status: 0
2018-07-27 14:55:31,737 INFO  [163] com.oracle.glcm.patch.auto.db.product.executor.PatchingStepExecutor - 
  Successfully executed the above command.

2018-07-27 14:55:31,737 INFO  [163] com.oracle.glcm.patch.auto.db.product.executor.PatchingStepExecutor - 
  Output from the command:
2018-07-27 14:55:31,737 INFO  [163] com.oracle.glcm.patch.auto.db.product.executor.PatchingStepExecutor -
Configuration updated to two instances
Instance DCB_2 started
Services relocated
Waiting for up to 30 minutes for instance DCB_1 to stop ...
Instance DCB_1 stopped
Configuration updated to one instance

2018-07-27 14:55:31,737 INFO  [163] com.oracle.glcm.patch.auto.db.product.executor.PatchingStepExecutor - 
  Command executed successfully.

You can see the same information by querying Clusterware, although there aren’t any timestamps associated with it:

[root@rac122sec1 ~]# srvctl status database -db DCB -detail -verbose
Instance DCB_1 is running on node rac122sec2 with online services RON_SVC. Instance status: Open.
Instance DCB_1 is connected to ASM instance +ASM2
Online relocation: ACTIVE
Source instance: DCB_1 on rac122sec2
Destination instance: DCB_2 on rac122sec1
[root@rac122sec1 ~]# 

Although the online relocation timeout was set to 30 minutes, the use of modern coding techniques and connection pooling allowed for a much faster online relocation. As you can see in the log excerpt, the entire relocation completed at 2018-07-27 14:55:31. Clusterware now tells me that my database runs on node 1:

[root@rac122sec1 ~]# date;  srvctl status database -db DCB -detail -verbose
Fri 27 Jul 14:55:38 2018
Instance DCB_2 is running on node rac122sec1 with online services RON_SVC. Instance status: Open.
Instance DCB_2 is connected to ASM instance +ASM1
Online relocation: INACTIVE

While this is wicked, what are the implications for the application? Have a look at this screenshot, taken a minute after the online relocation completed.

As you can see, nothing unusual was recorded (events tab to the left), and I couldn’t see a drop in the number of sessions connected. I noticed a slight blip in performance, but it recovered very soon thereafter.

Summary

During automatic patching of the Oracle stack, opatchauto will perform an online relocation of a RAC One Node database if it is found running on the node currently undergoing patching. If your application is developed with RAC in mind – such as using a FAN-aware connection pool like UCP – and supports either Fast Connection Failover or Application Continuity, there is a good chance that patching the system does not affect your service.

While my test was successful, it is by no means representative of a production workload – my RAC One Node database has an SGA of 4 GB and there were 5 concurrent users, hardly what I see in the real world. However, that shouldn’t deter you: if you are using RAC One Node, I think it’s definitely worth implementing these techniques in your application.

RAC One node databases are relocated by opatchauto in 12.2 part 1

This is an interesting observation I wanted to share. There doesn’t seem to be much information out there for RAC One Node (RON) users, and I hope this helps anyone thinking about patching their system.

RAC-rolling patching is well documented in patch readme files, blog posts and official white papers. Most RAC DBAs have a solid handle on the procedure. Patching RAC One Node is a different affair.

What happens when patching a RAC One Node system? As the name suggests a RAC One Node database is a cluster database restricted to one active instance in normal operations. It is possible to relocate the database from one node to another online. Oracle does this by temporarily adding a second instance to the cluster database with the intention of letting it take over from the source instance. At the end of the online relocation, the source instance is shut down, and only the destination instance remains up and running.

An online relocation is quite often a manual operation. However, I noticed that such an online relocation can also happen during patching with opatchauto, at least in 12.2.

This post is intended to show you the process as it is, in the next part I’d like to show some implications of that approach.

The environment

In this example my lab environment consists of a 2 node RAC system currently patched to 12.2.0.1.180417. I wanted to apply the July 2018 RU to the system next to get some experience with the patch.

I have one RDBMS home in addition to the mandatory Grid home, same release level for both, no one-off patches (it’s a lab after all). The virtual machines run Oracle Linux 7.4 with kernel UEK4. To keep things simple there’s a single RAC One database, named RON. I assigned it DCB (“data centre B”) as unique name because I don’t like setting db_unique_name to reflect roles such as “PROD” and “STDBY”. It gets confusing when “STDBY” runs in primary role :)

Here’s the current status of my components:

[oracle@rac122sec2 ~]$ srvctl status database -db DCB
Instance DCB_1 is running on node rac122sec1
Online relocation: INACTIVE

[oracle@rac122sec2 ~]$ srvctl status service -db DCB
Service RON_SVC is running on instance(s) DCB_1
[oracle@rac122sec2 ~]$ 

For the curious, here’s the configuration metadata:

[oracle@rac122sec2 ~]$ srvctl config service -db DCB
Service name: RON_SVC
Server pool: 
Cardinality: 1
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Global: false
Commit Outcome: false
Failover type: 
Failover method: 
TAF failover retries: 
TAF failover delay: 
Failover restore: NONE
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: SERVICE_TIME
TAF policy specification: NONE
Edition: 
Pluggable database name: 
Maximum lag time: ANY
SQL Translation Profile: 
Retention: 86400 seconds
Replay Initiation Time: 300 seconds
Drain timeout: 
Stop option: 
Session State Consistency: DYNAMIC
GSM Flags: 0
Service is enabled
Preferred instances: DCB_1
Available instances: 
CSS critical: no

[oracle@rac122sec2 ~]$ srvctl config database -db DCB
Database unique name: DCB
Database name: RON
Oracle home: /u01/app/oracle/product/12.2.0.1/dbhome_1
Oracle user: oracle
Spfile: +DATA/DCB/spfileRON.ora
Password file: +DATA/DCB/orapwRON
Domain: 
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: 
Disk Groups: DATA,RECO
Mount point paths: 
Services: RON_SVC
Type: RACOneNode
Online relocation timeout: 30
Instance name prefix: DCB
Candidate servers: rac122sec1,rac122sec2
OSDBA group: dba
OSOPER group: oper
Database instances: DCB_1
CSS critical: no
CPU count: 0
Memory target: 0
Maximum memory: 0
Default network number for database services: 
Database is administrator managed

The most important takeaway is that my RON instance DCB_1 is running on node rac122sec1.

Now let’s patch

After having followed the instructions in the patch readme closely, and after double/triple/quadruple checking that I have (working, tried and tested!) backups of the entire stack, I am ready to patch. This time around I’m following the instructions for the automatic application of the Grid Infrastructure RU, i.e. using opatchauto. Here is some relevant output from the patching session:

...
OPatchauto session is initiated at Thu Jul 26 14:12:12 2018

System initialization log file is /u01/app/12.2.0.1/grid/cfgtoollogs/opatchautodb/systemconfig2018-07-26_02-12-14PM.log.

Session log file is /u01/app/12.2.0.1/grid/cfgtoollogs/opatchauto/opatchauto2018-07-26_02-13-15PM.log
The id for this session is Q4JA

Executing OPatch prereq operations to verify patch applicability on home /u01/app/12.2.0.1/grid

Executing OPatch prereq operations to verify patch applicability on home /u01/app/oracle/product/12.2.0.1/dbhome_1
Patch applicability verified successfully on home /u01/app/oracle/product/12.2.0.1/dbhome_1

Patch applicability verified successfully on home /u01/app/12.2.0.1/grid


Verifying SQL patch applicability on home /u01/app/oracle/product/12.2.0.1/dbhome_1
SQL patch applicability verified successfully on home /u01/app/oracle/product/12.2.0.1/dbhome_1


Preparing to bring down database service on home /u01/app/oracle/product/12.2.0.1/dbhome_1

WARNING: The service RON_SVC configured on dcb will not be switched as it is not configured to run on any other node(s).
No step execution required.........
 

Relocating RACOne home before patching on home /u01/app/oracle/product/12.2.0.1/dbhome_1
/u01/app/oracle/product/12.2.0.1/dbhome_1 is not a RACOne database. No step execution required........
 

Bringing down CRS service on home /u01/app/12.2.0.1/grid
...

Wait a minute, what’s that? Have a look at the line beginning with “Relocating RACOne home before patching…”. Relocating the database wasn’t necessary in this case (remember that the database was active on rac122sec1, the other node), but opatchauto can definitely relocate your RAC One Node database.

When it does, you will see something like this in the output generated by opatchauto:

...
Preparing to bring down database service on home /u01/app/oracle/product/12.2.0.1/dbhome_1

WARNING: The service RON_SVC configured on dcb will not be switched as it is not configured to run on any other node(s).
Successfully prepared home /u01/app/oracle/product/12.2.0.1/dbhome_1 to bring down database service


Relocating RACOne home before patching on home /u01/app/oracle/product/12.2.0.1/dbhome_1
Relocated RACOne home before patching on home /u01/app/oracle/product/12.2.0.1/dbhome_1
...

The last 2 lines are those of interest. opatchauto detected that a RAC One database was running on the active node, and relocated it. Under the covers it uses a “srvctl relocate database …” command, as shown in the session log file.

Interestingly however, and contrary to what I expected, opatchauto moves the RAC One Node database back to where it came from as a post-patch step. Towards the end of the patching session I saw this:

...
Starting CRS service on home /u01/app/12.2.0.1/grid
Postpatch operation log file location: /u01/app/oracle/crsdata/rac122sec2/crsconfig/crspatch_rac122sec2_2018-07-26_03-01-06PM.log
CRS service started successfully on home /u01/app/12.2.0.1/grid


Relocating back RACOne to home /u01/app/oracle/product/12.2.0.1/dbhome_1
Relocated back RACOne home successfully to home /u01/app/oracle/product/12.2.0.1/dbhome_1


Preparing home /u01/app/oracle/product/12.2.0.1/dbhome_1 after database service restarted
No step execution required.........
...

The relevant bit is in the middle (“relocating …”). After relocating the database to rac122sec1 opatchauto moved it back to rac122sec2.

Summary

Unlike rolling patching on multi-node RAC, where all instances using the patched RDBMS home are shut down and applications rely on connection pools and Fast Application Notification to maintain service availability, a RAC One Node database might be relocated to a different node in the cluster. This process has implications for application developers, some of which I hope to share in the next post.

Little things worth knowing: Creating a RAC One Node database on the command line

This post is going to be super short, and mostly just a note to myself as I constantly forget how to create a RAC One database on the command line. This post is for 12.2.0.1 but should be similar on 12.1 (although I didn’t test!).

Provided you are licensed appropriately, this is probably the most basic way how you create an admin-managed RAC One database on Linux for use in a lab environment:

dbca -silent -createDatabase -gdbName RON -templateName gold_image01.dbc \
 -createAsContainerDatabase false -databaseConfigType RACONENODE \
 -RACOneNodeServiceName RON_SVC -totalMemory 1024 \
 -nodelist rac122node1,rac122node2 -storageType ASM \
 -datafileDestination '+DATA' -recoveryAreaDestination '+RECO' \
 -recoveryAreaSize 10240

This works for me, but most likely not for you :) And it’s certainly not suitable for a production deployment. Make sure to adapt the command as needed; I tend to create gold images for use with dbca, and this is one example.

The command itself should be fairly self-explanatory. If you are unsure about the meaning of the various options, have a look at the output of “dbca -help -createDatabase” and the official documentation/My Oracle Support. I learned the hard way that forgetting “-nodelist” results in the creation of a single instance database instead of an error message.
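Once dbca completes, it is worth double-checking that you actually ended up with a RAC One Node database; the Type line in the configuration gives it away. Output abbreviated, and the exact values depend on your system:

```
[oracle@rac122node1 ~]$ srvctl config database -db RON
...
Type: RACOneNode
Online relocation timeout: 30
Candidate servers: rac122node1,rac122node2
...
```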

I didn’t find too many examples on the net; I hope someone finds this useful.