Martins Blog

Trying to explain complex things in simple terms

Runtime Load Balancing Advisory in RAC 12c-addendum

Posted by Martin Bach on February 19, 2014

A reader asked an interesting question yesterday with regards to the previous post on the subject: where did you get your service metrics from when you queried v$servicemetric-PDB or CDB$ROOT?

I queried the PDB, but this morning repeated the test to make sure the results are consistent, and they are. This is definitely something you’d hope for: you should not have different results in the same v$-view depending on the container you execute your query in for a given CON_ID.

During testing I noticed something interesting though. I queried gv$servicemetric but did not limit the result to the service I wanted to test with (FCFSRV). Here is the query against gv$servicemetric while the system was idle.

select inst_id,begin_time,end_time,service_name,cpupercall,callspersec,goodness,delta,con_id
from gv$servicemetric
where con_id = 3 and service_name  in ('FCFSRV','demopdb')
order by service_name, inst_id;

   INST_ID BEGIN_TIME          END_TIME            SERVICE_NAME         CPUPERCALL CALLSPERSEC   GOODNESS      DELTA     CON_ID
---------- ------------------- ------------------- -------------------- ---------- ----------- ---------- ---------- ----------
         1 19.02.2014 10:07:16 19.02.2014 10:07:21 FCFSRV                        0           0        100        100          3
         1 19.02.2014 10:05:57 19.02.2014 10:06:56 FCFSRV                        0           0        100        100          3
         2 19.02.2014 10:05:55 19.02.2014 10:06:54 FCFSRV                        0           0        100        100          3
         2 19.02.2014 10:07:19 19.02.2014 10:07:24 FCFSRV                        0           0        100        100          3
         1 19.02.2014 10:07:16 19.02.2014 10:07:21 demopdb                       0           0          0          1          3
         1 19.02.2014 10:05:57 19.02.2014 10:06:56 demopdb                       0           0          0          1          3
         2 19.02.2014 10:07:19 19.02.2014 10:07:24 demopdb                       0           0          0          1          3
         2 19.02.2014 10:05:55 19.02.2014 10:06:54 demopdb                       0           0          0          1          3

8 rows selected.

DEMOPDB is the default service created when the PDB going by the same name is opened. Now I’m going to ramp up some load on one of the instances again. After a short while node 2 is nearly dying:

top - 10:34:28 up 5 days, 19:11,  3 users,  load average: 17.49, 5.88, 3.50
Tasks: 362 total,  27 running, 335 sleeping,   0 stopped,   0 zombie
Cpu(s): 98.2%us,  1.3%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.3%hi,  0.2%si,  0.0%st
Mem:   8059892k total,  7900680k used,   159212k free,   170980k buffers
Swap:   524280k total,     4024k used,   520256k free,  5607316k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
12807 oracle    20   0 3378m  28m  26m R  7.6  0.4   0:04.03 oracle_12807_ra
12809 oracle    20   0 3378m  28m  26m R  7.6  0.4   0:04.20 oracle_12809_ra
12813 oracle    20   0 3378m  28m  26m R  7.6  0.4   0:04.03 oracle_12813_ra
12829 oracle    20   0 3378m  28m  26m R  7.6  0.4   0:03.92 oracle_12829_ra
12840 oracle    20   0 3378m  28m  26m R  7.6  0.4   0:03.25 oracle_12840_ra
12842 oracle    20   0 3378m  28m  26m R  7.6  0.4   0:03.35 oracle_12842_ra
12844 oracle    20   0 3378m  28m  26m R  7.6  0.4   0:03.24 oracle_12844_ra
12865 oracle    20   0 3378m  28m  26m R  7.6  0.4   0:02.38 oracle_12865_ra
12805 oracle    20   0 3378m  28m  26m R  7.3  0.4   0:04.27 oracle_12805_ra
12811 oracle    20   0 3378m  28m  26m R  7.3  0.4   0:04.11 oracle_12811_ra
12815 oracle    20   0 3378m  28m  26m R  7.3  0.4   0:04.10 oracle_12815_ra
12817 oracle    20   0 3378m  28m  26m R  7.3  0.4   0:04.10 oracle_12817_ra
12819 oracle    20   0 3378m  28m  26m R  7.3  0.4   0:04.02 oracle_12819_ra
12821 oracle    20   0 3378m  28m  26m R  7.3  0.4   0:04.03 oracle_12821_ra
12823 oracle    20   0 3378m  28m  26m R  7.3  0.4   0:04.10 oracle_12823_ra

… while node 1 is idle:

top - 10:34:20 up 5 days, 19:11,  3 users,  load average: 0.47, 0.70, 0.84
Tasks: 367 total,   1 running, 366 sleeping,   0 stopped,   0 zombie
Cpu(s):  2.4%us,  1.9%sy,  0.0%ni, 37.4%id, 57.7%wa,  0.2%hi,  0.2%si,  0.3%st
Mem:   8059892k total,  7818588k used,   241304k free,    23948k buffers
Swap:   524280k total,     8472k used,   515808k free,  5359064k cached

Going back to my query:

SYSTEM@DEMOPDB> r
  1  select inst_id,begin_time,end_time,service_name,cpupercall,callspersec,goodness,delta,con_id
  2  from gv$servicemetric
  3  where con_id = 3 and service_name  in ('FCFSRV','demopdb')
  4* order by service_name, inst_id

   INST_ID BEGIN_TIME          END_TIME            SERVICE_NAME         CPUPERCALL CALLSPERSEC   GOODNESS      DELTA     CON_ID
---------- ------------------- ------------------- -------------------- ---------- ----------- ---------- ---------- ----------
         1 19.02.2014 10:34:37 19.02.2014 10:34:42 FCFSRV                        0           0        100        100          3
         1 19.02.2014 10:32:56 19.02.2014 10:33:56 FCFSRV                        0           0        100        100          3
         2 19.02.2014 10:32:54 19.02.2014 10:33:54 FCFSRV                        0           0        100        100          3
         2 19.02.2014 10:34:34 19.02.2014 10:34:39 FCFSRV                        0           0        100        100          3
         1 19.02.2014 10:34:37 19.02.2014 10:34:42 demopdb                       0           0          0          1          3
         1 19.02.2014 10:32:56 19.02.2014 10:33:56 demopdb                       0           0          0          1          3
         2 19.02.2014 10:34:34 19.02.2014 10:34:39 demopdb                 8629687           0         25          1          3
         2 19.02.2014 10:32:54 19.02.2014 10:33:54 demopdb              141362.896  7.48129676         25          1          3

8 rows selected.

Aha! If you look closely then you will notice that the metrics are updated for the default service on instance 2 only. Interesting. It is still a problem unless you connect to the default service but that’s not really an option for many applications where you want to logically subdivide your cluster.

About these ads

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
Follow

Get every new post delivered to your Inbox.

Join 2,273 other followers

%d bloggers like this: