Runtime Load Balancing Advisory in RAC 12c

This is a follow-up on yesterday's post about services in the new 12c database architecture. After having worked out everything I needed to know about TAF and RAC 12c in CDBs, I wanted to check how FCF works with PDBs today. While investigating I found that the Runtime Load Balancing Advisory does not seem to work as expected in some cases. But I'm getting ahead of myself. First of all, here is my test case:

  • Oracle Linux 6.4 x86-64
  • Grid Infrastructure 12.1.0.1.2, i.e. January 2014 PSU applied
  • RDBMS 12.1.0.1.2, likewise patched with the January PSU
  • A CDB with just 1 PDB for this purpose, named DEMOPDB
  • Service FCFSRV is used

Service setup

The service is specifically created to connect against the PDB:

[oracle@rac12node1 ~]$ srvctl add service -d rac12c -s FCFSRV -preferred RAC12C1,RAC12C2 -pdb DEMOPDB \
> -clbgoal short -rlbgoal SERVICE_TIME -notification true
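
The resulting definition can be verified with srvctl, for example (standard srvctl syntax; output omitted, but it shows the PDB, the goals and the notification setting among other attributes):

[oracle@rac12node1 ~]$ srvctl config service -d rac12c -s FCFSRV
[oracle@rac12node1 ~]$ srvctl status service -d rac12c -s FCFSRV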

The service was then started and was running on both instances. The PDB was also opened read-write on all instances. I am using the same dequeue.pl script to dequeue FAN events that I already demonstrated in the RAC book. In normal operations both instances should be more or less evenly loaded. Changes to the load on a RAC node should be communicated internally using Fast Application Notification (FAN). In theory this allows connection pools to shift sessions from one instance to another and dynamically reflect changes in the cluster load. During my limited testing Oracle's Universal Connection Pool (UCP) does this quite nicely, and I have written some articles about it. Search this site for UCP if you would like to read up on it first.

As you just read, the mechanism employed is based on FAN events and is explained in the Real Application Clusters Administration and Deployment Guide. The RTLB Advisory is perfectly suited for connection pools and, in my opinion, the right way forward for new development projects.

The service definition shown above sets the connection load balancing (CLB) goal to short connections, typical for an application based on a connection pool. In this environment you look up the pool using JNDI, for example, grab a session, do something, and hand the session back to the pool. The runtime load balancing goal is set to optimise for service time, i.e. to get a connection from the instance offering the best response time (as opposed to throughput, the other option).
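
The goals can also be cross-checked from inside the database with a query along these lines (the column selection is mine, assuming v$active_services exposes GOAL and CLB_GOAL as in earlier releases):

select name, goal, clb_goal
from   v$active_services
where  name = 'FCFSRV';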

Testing the RTLB Advisory with a PDB

With the services started and the PDB open I was ready to burn some CPU. This is quite simply done with a SQL script that I execute lots of times concurrently. First the SQL script (credit to Julian Dyke):

[oracle@rac12node2 burncpu]$ cat burn.sql
DECLARE
  n NUMBER := 0;
BEGIN
  FOR f IN 1..10000000
  LOOP
    n := MOD (n,999999) + SQRT (f);
  END LOOP;
--  DBMS_OUTPUT.PUT_LINE ('Res = '||TO_CHAR (n,'999999.99'));
END;
/

exit

This script is called by my (certainly not perfect!) shell script:

[oracle@rac12node2 burncpu]$ cat burn.sh
#!/bin/bash

RUNS=${1:-10}
SERVICE=${2:-FCFSRV}

echo starting $RUNS runs to burn some CPU using service $SERVICE

START_TIME=$SECONDS

for i in $(seq 1 $RUNS) ; do
  sqlplus -s martin/secret@rac12scan.gns.example.com/${SERVICE} @burn &
done

wait

ELAPSED_TIME=$(($SECONDS - $START_TIME))

echo CPU burnt in $ELAPSED_TIME seconds
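
An invocation could then look like this (15 concurrent sessions is an arbitrary choice on my part):

[oracle@rac12node2 burncpu]$ chmod +x burn.sh
[oracle@rac12node2 burncpu]$ ./burn.sh 15 FCFSRV
starting 15 runs to burn some CPU using service FCFSRV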

Time to connect!

SQL> select name,cdb from v$database;

NAME      CDB
--------- ---
RAC12C    YES

SQL> @/home/oracle/lspdbs

   INST_ID NAME                           OPEN_MODE
---------- ------------------------------ ----------
         1 DEMOPDB                        READ WRITE
         2 DEMOPDB                        READ WRITE
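
The lspdbs script itself is not shown in this post; it is essentially a query against gv$pdbs along these lines (my reconstruction, not the original script):

select inst_id, name, open_mode
from   gv$pdbs
order by inst_id;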

Now I am going to generate a lot of CPU load on the system, which should have an effect. In a different session I dequeue the FAN events. You may have noticed that I requested “-notification true” in the service definition; this is needed for this example, and for .NET clients if memory serves me right. Here is the result of creating some load:

top - 14:27:42 up 4 days, 23:04,  3 users,  load average: 7.30, 2.52, 2.22
Tasks: 317 total,  16 running, 301 sleeping,   0 stopped,   0 zombie
Cpu(s): 98.2%us,  1.5%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.2%hi,  0.0%si,  0.2%st
Mem:   8059892k total,  6277936k used,  1781956k free,    34372k buffers
Swap:   524280k total,     6144k used,   518136k free,  4143280k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 7994 oracle    20   0 3379m  31m  28m R 13.3  0.4   0:04.35 oracle_7994_rac
 7996 oracle    20   0 3379m  34m  31m R 13.3  0.4   0:04.35 oracle_7996_rac
 7998 oracle    20   0 3379m  39m  35m R 13.3  0.5   0:04.36 oracle_7998_rac
 8006 oracle    20   0 3378m  28m  26m R 13.3  0.4   0:04.34 oracle_8006_rac
 8008 oracle    20   0 3378m  28m  26m R 13.3  0.4   0:04.33 oracle_8008_rac
 8010 oracle    20   0 3379m  29m  26m R 13.3  0.4   0:04.34 oracle_8010_rac
 8018 oracle    20   0 3378m  28m  26m R 13.3  0.4   0:04.34 oracle_8018_rac
 8004 oracle    20   0 3378m  29m  26m R 12.6  0.4   0:04.02 oracle_8004_rac
 8074 oracle    20   0 3378m  28m  25m R 12.6  0.4   0:03.64 oracle_8074_rac
 8012 oracle    20   0 3379m  32m  29m R 12.3  0.4   0:04.03 oracle_8012_rac
 8016 oracle    20   0 3378m  28m  26m R 12.3  0.4   0:04.03 oracle_8016_rac
 8000 oracle    20   0 3378m  28m  26m R 11.9  0.4   0:04.03 oracle_8000_rac
 8002 oracle    20   0 3379m  29m  26m R 11.9  0.4   0:04.03 oracle_8002_rac
 8014 oracle    20   0 3378m  29m  26m R 11.9  0.4   0:04.03 oracle_8014_rac
 8072 oracle    20   0 3378m  28m  25m R 11.9  0.4   0:03.63 oracle_8072_rac
 4365 root      RT   0  789m 101m  71m S  1.3  1.3 103:57.23 osysmond.bin

This isn’t too bad for a virtualised system with a dual-core “processor”. The other instance is more or less idle.

top - 14:27:59 up 4 days, 23:05,  4 users,  load average: 0.23, 0.45, 1.08
Mem:   8059892k total,  5994680k used,  2065212k free,   177168k buffers
Swap:   524280k total,     4024k used,   520256k free,  4347016k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 9626 oracle    20   0 3381m 165m 159m S  9.3  2.1   0:04.64 ora_ppa7_rac12c
 3813 root      RT   0  789m 101m  71m S  1.7  1.3  84:00.64 osysmond.bin
 1938 oracle    20   0  546m  37m  14m S  1.0  0.5 101:26.07 gipcd.bin
 3710 oracle    -2   0 1367m  16m  14m S  1.0  0.2 153:56.63 asm_vktm_+asm2

Now the big question is: does this have any effect? Going by my script and v$servicemetric it doesn’t. Here are the FAN events for the duration, reformatted for readability:

VERSION=1.0 database=RAC12C service=FCFSRV {
  {instance=RAC12C2 percent=50 flag=UNKNOWN aff=FALSE}
  {instance=RAC12C1 percent=50 flag=UNKNOWN aff=FALSE}
} timestamp=2014-02-18 14:30:36
VERSION=1.0 database=RAC12C service=FCFSRV {
  {instance=RAC12C2 percent=50 flag=UNKNOWN aff=FALSE}
  {instance=RAC12C1 percent=50 flag=UNKNOWN aff=FALSE}
} timestamp=2014-02-18 14:31:06
VERSION=1.0 database=RAC12C service=FCFSRV {
  {instance=RAC12C2 percent=50 flag=UNKNOWN aff=FALSE}
  {instance=RAC12C1 percent=50 flag=UNKNOWN aff=FALSE}
} timestamp=2014-02-18 14:31:36
VERSION=1.0 database=RAC12C service=FCFSRV {
  {instance=RAC12C2 percent=50 flag=UNKNOWN aff=FALSE}
  {instance=RAC12C1 percent=50 flag=UNKNOWN aff=FALSE}
} timestamp=2014-02-18 14:32:06

As you can see, the split between the instances stays at 50:50 the whole time, despite the big difference in load on the nodes. The FAN events are documented in the same chapter of the RAC Admin Guide. The percent figure describes “The percentage of work requests to send to this database instance.” This observation leads me to suspect that something is not quite right with the figures above.
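
For completeness, the figures from v$servicemetric I referred to earlier can be pulled with a query along these lines (the column selection is mine; GOODNESS and DELTA are the values feeding the advisory):

select inst_id, service_name, goodness, delta, cpupercall, dbtimepercall
from   gv$servicemetric
where  service_name = 'FCFSRV'
order by inst_id;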

UPDATE

After retesting with a different system I came to the conclusion that the above is an edge case. As you saw in the shell script, I am connecting to a service, which in turn is available on both instances. Yet all my sessions ended up on just one node, which shouldn't be the case. I double-checked my DNS round-robin resolution for the SCAN but couldn't find anything wrong with it. Unfortunately I no longer had access to the original system to repeat my testing, so I created a new 12.1.0.1.2 cluster:

12:46:44 SQL> select xmltransform(dbms_qopatch.get_opatch_list(), dbms_qopatch.GET_OPATCH_XSLT()) from dual;

XMLTRANSFORM(DBMS_QOPATCH.GET_OPATCH_LIST(),DBMS_QOPATCH.GET_OPATCH_XSLT())
--------------------------------------------------------------------------------

Patch Details:

Patch(sqlpatch) 17552800:   applied on 2014-03-17T11:01:42-04:00
Unique Patch ID: 16974186
  Patch Description: Database Patch Set Update : 12.1.0.1.2 (17552800)
  Created on     : 3 Jan 2014, 03:21:20 hrs PST8PDT
  Files Touched:


Patch    17077442:   applied on 2014-03-17T11:01:25-04:00
Unique Patch ID: 16881794
  Patch Description: Oracle Clusterware Patch Set Update 12.1.0.1.1
  Created on     : 12 Oct 2013, 06:33:53 hrs US/Central
  Files Touched:

I created two test cases. In case 1 I connect to just one instance to confirm the observation made above. In the second case I try to connect evenly to both instances. The PDB in this setup is ACDEMOPDB and the service name is PDB_FCF:

srvctl add service -d RAC12C -s PDB_FCF -preferred RAC12C1,RAC12C2 -failovertype transaction \
> -pdb ACDEMOPDB -clbgoal short -rlbgoal service_time -notification true -commit_outcome true

I needed to change my connection information for this to work; see below for each example.

Case 1: I am connecting to just one instance:

FANTEST_SI =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac12scan.example.com)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = PDB_FCF)
      (INSTANCE_NAME = RAC12C2)
    )
  )

In my test case I executed the burn.sql script in this way:

for i in $(seq 1 10); do
  $ORACLE_HOME/bin/sqlplus martin/secret@fantest_si @burn &
done

This is the same behaviour as in the above test where only one instance was hammered: the load is not reflected in the FAN load balancing events.

Case 2: I am connecting to all instances served by the service

Same setup, but this time I connect to all instances:

FANTEST_ALL =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac12scan.example.com)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = PDB_FCF)
    )
  )

The invocation of the load test is the same except for the service name:

for i in $(seq 1 10); do
  $ORACLE_HOME/bin/sqlplus martin/secret@fantest_all @burn &
done

And finally I got the much-expected FAN events! For some reason 6 sessions ended up on node 2 and only 4 on node 1:

SQL> select count(inst_id),inst_id from gv$session
  2  where con_id = 3 and username = 'MARTIN' group by inst_id;

COUNT(INST_ID)    INST_ID
-------------- ----------
             4          1
             6          2

Examples of the FAN events recorded are:

VERSION=1.0 database=RAC12C service=PDB_FCF {
  {instance=RAC12C2 percent=45 flag=GOOD aff=FALSE}
  {instance=RAC12C1 percent=55 flag=GOOD aff=FALSE}
} timestamp=2014-03-22 12:28:53
VERSION=1.0 database=RAC12C service=PDB_FCF {
  {instance=RAC12C2 percent=41 flag=GOOD aff=FALSE}
  {instance=RAC12C1 percent=59 flag=GOOD aff=FALSE}
} timestamp=2014-03-22 12:29:53

So the lesson learnt (and apologies for the confusion!) is that FAN load balancing events seem to require you to connect to the service without specifying an instance. They also seem to behave slightly differently if you connect to the service but end up on just one instance for whatever reason.

Repeating the test on a Non-CDB

Now just to prove this does actually work I created a non-CDB, NCDB, and repeated the test there.
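
The service definition on NCDB is not shown above; it followed the same pattern as for the CDB, just without the PDB clause, roughly like this (a sketch, not the exact command I used):

srvctl add service -d NCDB -s FCFSRV -preferred NCDB1,NCDB2 \
> -clbgoal short -rlbgoal SERVICE_TIME -notification true

To cut a long story short, here are the results: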

VERSION=1.0 database=NCDB service=FCFSRV {
  {instance=NCDB2 percent=50 flag=GOOD aff=TRUE}
  {instance=NCDB1 percent=50 flag=UNKNOWN aff=FALSE}
} timestamp=2014-02-18 13:55:09

VERSION=1.0 database=NCDB service=FCFSRV {
  {instance=NCDB2 percent=55 flag=GOOD aff=TRUE}
  {instance=NCDB1 percent=45 flag=GOOD aff=FALSE}
} timestamp=2014-02-18 13:55:39

VERSION=1.0 database=NCDB service=FCFSRV {
  {instance=NCDB2 percent=71 flag=GOOD aff=TRUE}
  {instance=NCDB1 percent=29 flag=GOOD aff=FALSE}
} timestamp=2014-02-18 13:56:09
VERSION=1.0 database=NCDB service=FCFSRV {
  {instance=NCDB2 percent=78 flag=GOOD aff=TRUE}
  {instance=NCDB1 percent=22 flag=GOOD aff=FALSE}
} timestamp=2014-02-18 13:56:39
VERSION=1.0 database=NCDB service=FCFSRV {
  {instance=NCDB2 percent=64 flag=GOOD aff=TRUE}
  {instance=NCDB1 percent=36 flag=UNKNOWN aff=FALSE}
} timestamp=2014-02-18 13:57:09
VERSION=1.0 database=NCDB service=FCFSRV {
  {instance=NCDB2 percent=57 flag=GOOD aff=TRUE}
  {instance=NCDB1 percent=43 flag=UNKNOWN aff=FALSE}
} timestamp=2014-02-18 13:57:39
VERSION=1.0 database=NCDB service=FCFSRV {
  {instance=NCDB2 percent=54 flag=GOOD aff=TRUE}
  {instance=NCDB1 percent=46 flag=UNKNOWN aff=FALSE}
} timestamp=2014-02-18 13:58:09
^Cexiting...

The PDB behaviour therefore seems like another case to raise with Oracle Support. In the above output you can see how the system would have sent sessions to instance 2 instead of instance 1, where I generated all the load.

Responses

  1. Is your v$servicemetric query against the pluggable or container database?

    1. Just repeated the test: no difference between CDB$ROOT and DEMOPDB, which was what I hoped for. The FAN events I dequeued are what your connection pool would receive, so there is definitely a problem there.
