Category Archives: Performance

New Events for Data Guard and Synchronous Redo Transport in 12c (2)

After establishing the baseline in the first part of this series, it’s time to measure the effect of the network. The second test introduces an interesting feature: using Linux’s own Traffic Shaper/Quality of Service module, I will add a delay of 100ms to the Data Guard network interface card (NIC) to slow things down a little.

WARNING: this is of course a lab/VM-only situation. I can’t exert control over wire quality in physical switches, hence some software magic is needed on my virtual ones. This post is intended purely for educational purposes, not for use at work.

I am continuing to use the 2 node RAC 12.1.0.2.170117 primary database on Oracle Linux 7 with UEK 4 and an identical RAC to host my standby database.

All database traffic still uses eth0 – my public network (192.168.100.0/24). In order not to affect my SSH session with the planned network changes, I created a new out-of-band management interface (eth2) on the Data Guard standby database servers. I love KVM and virtio drivers for the NICs: a few commands in virsh later, every RAC node has a new NIC – without rebooting, of course. With some nmcli magic I can then configure each new interface in a single line!
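
As an illustration, a single nmcli invocation can create and configure the new management connection in one go. This is only a sketch of what I mean by “one line” – the device name and address below are assumptions from my lab, not taken from the actual setup:

```shell
# Hypothetical example: create a connection named "mgmt" on the new eth2
# device with a static IPv4 address (adjust name/address for your lab)
nmcli connection add type ethernet con-name mgmt ifname eth2 \
  ip4 192.168.150.20/24
```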

Here’s what it looks like from NetworkManager’s point of view (the system is Oracle Linux 7.3):

[root@rac12sec1 ~]# nmcli c show
NAME  UUID                                  TYPE            DEVICE 
eth0  3e8915f1-10ce-4da7-99b8-d2a6947f9ff7  802-3-ethernet  eth0   
eth1  0ebe1279-40e1-4faf-a0be-49588d347fcf  802-3-ethernet  eth1   
mgmt  51054729-19ae-49e7-ba18-c6b41475ec1b  802-3-ethernet  eth2   

My eth0 device maps to the public network, i.e. all database traffic including Data Guard. The private interconnect uses eth1, with eth2 as the management interface. I have deliberately kept this simple; in real life I would request a dedicated set of interfaces to route Data Guard traffic (I blogged about that earlier in the context of RAC).

Please read part I of the “Adding an additional SCAN” series to get the full story.

Change protection mode

Following the scenario I outlined in part I, I first have to change the protection mode to Maximum Availability. Again, please see part I of this series for an explanation of the protection levels, my choice of protection level, and why “Maximum Availability” may or may not work for you.

Since I’m using the Data Guard Broker, I need to head over and change the mode in dgmgrl.

DGMGRL> show configuration

Configuration - ractest

  Protection Mode: MaxPerformance
  Members:
  NCDBA - Primary database
    NCDBB - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS   (status updated 40 seconds ago)

As with many other maintenance tasks, you use the edit command to change the configuration.

DGMGRL> EDIT CONFIGURATION SET PROTECTION MODE AS MaxAvailability;
Error: ORA-16627: operation disallowed since no standby databases would 
remain to support protection mode

Failed.

I seem to have gotten a bit ahead of myself – changing to the more serious protection modes requires changing the transport mode from async to sync. Let’s do that, and then change the protection mode:

DGMGRL> edit database 'NCDBA' set property LogXptMode = 'SYNC';
Property "logxptmode" updated
DGMGRL> edit database 'NCDBB' set property LogXptMode = 'SYNC';
Property "logxptmode" updated
DGMGRL> EDIT CONFIGURATION SET PROTECTION MODE AS MaxAvailability;
Succeeded.
DGMGRL> show configuration

Configuration - ractest

  Protection Mode: MaxAvailability
  Members:
  NCDBA - Primary database
    NCDBB - Physical standby database 

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS   (status updated 19 seconds ago)

DGMGRL> 

That looks better now: the first half of the job is done. The second half of this article details how to make the network look a lot worse.

Net Emulator

Using the Linux netem kernel module I can simulate many complex network situations – I found https://wiki.linuxfoundation.org/networking/netem to be a good reference.

I should probably say at this point that no one would use a link with a 100ms delay between sites for (sync) redo transport – that simply doesn’t make sense. I chose 100ms as a factor large enough to be sure that any deviation in the performance figures really is the result of my fiddling with the network latency. Large numbers make spotting outliers easier.

How do you mess with the network? That is simple when you are on Linux! On my standby cluster I use the tc command, part of the iproute package. This is the “normal” state:

[root@rac12sec1 ~]# tc qdisc
qdisc pfifo_fast 0: dev eth0 root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc pfifo_fast 0: dev eth1 root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc pfifo_fast 0: dev eth2 root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
[root@rac12sec1 ~]# 

I am merely using these tools; if you want to understand them, have a look at chapter 9 of the Linux Advanced Routing & Traffic Control HOWTO and the Linux Foundation wiki I mentioned earlier. Then take it from there.

The syntax of tc qdisc is complex at first sight:

[root@rac12sec1 ~]# tc qdisc help
Usage: tc qdisc [ add | del | replace | change | show ] dev STRING
       [ handle QHANDLE ] [ root | ingress | parent CLASSID ]
       [ estimator INTERVAL TIME_CONSTANT ]
       [ stab [ help | STAB_OPTIONS] ]
       [ [ QDISC_KIND ] [ help | OPTIONS ] ]

       tc qdisc show [ dev STRING ] [ingress]
Where:
QDISC_KIND := { [p|b]fifo | tbf | prio | cbq | red | etc. }
OPTIONS := ... try tc qdisc add <desired QDISC_KIND> help
STAB_OPTIONS := ... try tc qdisc add stab help

The above allows us to introduce netem to the mix. On my Oracle Linux 7.3/UEK4 system it offers the following options:

[root@rac12sec1 ~]# tc qdisc add netem help
Usage: ... netem [ limit PACKETS ] 
                 [ delay TIME [ JITTER [CORRELATION]]]
                 [ distribution {uniform|normal|pareto|paretonormal} ]
                 [ corrupt PERCENT [CORRELATION]] 
                 [ duplicate PERCENT [CORRELATION]]
                 [ loss random PERCENT [CORRELATION]]
                 [ loss state P13 [P31 [P32 [P23 P14]]]
                 [ loss gemodel PERCENT [R [1-H [1-K]]]
                 [ ecn ]
                 [ reorder PRECENT [CORRELATION] [ gap DISTANCE ]]
                 [ rate RATE [PACKETOVERHEAD] [CELLSIZE] [CELLOVERHEAD]]

That’s a wealth of options. I can simulate somewhat unreliable WAN links, even with variations in delay! Refer to the Linux Foundation wiki for more details on how to simulate networks more realistically.
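
For example, delay, jitter and packet loss can be combined in a single qdisc. The figures below are made up purely for illustration and are not what I use in this test:

```shell
# Illustrative only: 50ms base delay with 10ms jitter drawn from a
# normal distribution, plus 0.1% random packet loss (made-up values)
tc qdisc add dev eth0 root netem delay 50ms 10ms distribution normal \
  loss random 0.1%
```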

Introducing Delay

I prefer to keep it simple for this post, so I’ll stick with the most basic setting for network delay. I am introducing this to the Data Guard network on both of my standby nodes while connected to the management interface eth2:

[root@rac12sec1 ~]# tc qdisc add dev eth0 root netem delay 100ms

[root@rac12sec1 ~]# tc qdisc
qdisc netem 8001: dev eth0 root refcnt 2 limit 1000 delay 100.0ms
qdisc pfifo_fast 0: dev eth1 root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc pfifo_fast 0: dev eth2 root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1

[root@rac12sec2 ~]# tc qdisc
qdisc netem 8001: dev eth0 root refcnt 2 limit 1000 delay 100.0ms
qdisc pfifo_fast 0: dev eth1 root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc pfifo_fast 0: dev eth2 root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1

[root@rac12sec2 ~]# tc qdisc show dev eth0 
qdisc netem 8001: root refcnt 2 limit 1000 delay 100.0ms

The effect can be seen immediately by pinging one of the standby nodes:

[oracle@rac12pri1 ~]$ ping rac12sec1
PING rac12sec1.example.com (192.168.100.20) 56(84) bytes of data.
64 bytes from rac12sec1.example.com (192.168.100.20): icmp_seq=1 ttl=64 time=100 ms
64 bytes from rac12sec1.example.com (192.168.100.20): icmp_seq=2 ttl=64 time=100 ms
64 bytes from rac12sec1.example.com (192.168.100.20): icmp_seq=3 ttl=64 time=100 ms
64 bytes from rac12sec1.example.com (192.168.100.20): icmp_seq=4 ttl=64 time=100 ms
^C
--- rac12sec1.example.com ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3003ms
rtt min/avg/max/mdev = 100.368/100.415/100.478/0.229 ms
[oracle@rac12pri1 ~]$ 

So 100ms delay it is!
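
For the record: once the testing is complete, the artificial delay can be removed again by deleting the root qdisc, after which the device reverts to the default pfifo_fast:

```shell
# Remove the netem qdisc from eth0, restoring normal behaviour
tc qdisc del dev eth0 root netem
```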

Starting the Benchmark

With the scene set, it’s time to start the benchmark. The command to start it resembles the one I used for part I, except for the name of the results file.

[oracle@loadgen1 bin]$ ./charbench -cs //rac12pri-scan/swingbench_both \
> -uc 20 -r results_20_maxavail_netem.xml -rt 00:10
Author  :	 Dominic Giles
Version :	 2.5.0.971

Results will be written to results_20_maxavail_netem.xml.
Hit Return to Terminate Run...

Time		Users	TPM	TPS

1:08:56 PM      20      5188    94          

What effect does this have on the database? Let’s check the Data Guard broker first:

DGMGRL> show database 'NCDBB'

Database - NCDBB

  Role:               PHYSICAL STANDBY
  Intended State:     APPLY-ON
  Transport Lag:      0 seconds (computed 0 seconds ago)
  Apply Lag:          10 seconds (computed 0 seconds ago)
  Average Apply Rate: 207.00 KByte/s
  Real Time Query:    OFF
  Instance(s):
    NCDBB1
    NCDBB2 (apply instance)

Database Status:
SUCCESS

DGMGRL> show database 'NCDBB'

Database - NCDBB

  Role:               PHYSICAL STANDBY
  Intended State:     APPLY-ON
  Transport Lag:      0 seconds (computed 0 seconds ago)
  Apply Lag:          0 seconds (computed 0 seconds ago)
  Average Apply Rate: 211.00 KByte/s
  Real Time Query:    OFF
  Instance(s):
    NCDBB1
    NCDBB2 (apply instance)

Database Status:
SUCCESS

So it looks like there isn’t any lag with the benchmark underway, although, as the first output shows, it took a little while for apply to catch up. Here is the output from Tanel Poder’s ashtop script again (remember it requires the Diagnostic Pack licence because it accesses v$active_session_history):

SQL> @ashtop program,event,wait_class,session_state session_type='BACKGROUND' "to_date('17.02.2017 08:10','dd.mm.yyyy hh24:mi')" "to_date('17.02.2017 08:15','dd.mm.yyyy hh24:mi')"

    Total
  Seconds     AAS %This   PROGRAM                                          EVENT                                    WAIT_CLASS      SESSION FIRST_SEEN          LAST_SEEN           DIST_SQLEXEC_SEEN
--------- ------- ------- ------------------------------------------------ ---------------------------------------- --------------- ------- ------------------- ------------------- -----------------
      298     1.0   24% | oracle@rac12pri1 (LGWR)                          SYNC Remote Write                        Other           WAITING 2017-02-17 08:10:00 2017-02-17 08:14:59                 1
      298     1.0   24% | oracle@rac12pri1 (NSS2)                          Redo Transport MISC                      Other           WAITING 2017-02-17 08:10:00 2017-02-17 08:14:59                 1
      276      .9   23% | oracle@rac12pri2 (LGWR)                          SYNC Remote Write                        Other           WAITING 2017-02-17 08:10:00 2017-02-17 08:14:59                 1
      274      .9   23% | oracle@rac12pri2 (NSS2)                          Redo Transport MISC                      Other           WAITING 2017-02-17 08:10:00 2017-02-17 08:14:59                 1
        8      .0    1% | oracle@rac12pri2 (LMS1)                                                                                   ON CPU  2017-02-17 08:10:16 2017-02-17 08:14:59                 1
        7      .0    1% | oracle@rac12pri2 (LMS0)                          gcs log flush sync                       Other           WAITING 2017-02-17 08:10:43 2017-02-17 08:14:51                 1
        5      .0    0% | oracle@rac12pri1 (LMS0)                          gcs log flush sync                       Other           WAITING 2017-02-17 08:10:23 2017-02-17 08:13:48                 1
        5      .0    0% | oracle@rac12pri2 (LMS0)                                                                                   ON CPU  2017-02-17 08:10:00 2017-02-17 08:13:24                 1
        5      .0    0% | oracle@rac12pri2 (NSV1)                                                                                   ON CPU  2017-02-17 08:12:43 2017-02-17 08:14:44                 1
        4      .0    0% | oracle@rac12pri1 (DBW0)                                                                                   ON CPU  2017-02-17 08:11:16 2017-02-17 08:13:35                 1
        3      .0    0% | oracle@rac12pri2 (CJQ0)                                                                                   ON CPU  2017-02-17 08:12:39 2017-02-17 08:12:51                 2
        3      .0    0% | oracle@rac12pri2 (RSM0)                                                                                   ON CPU  2017-02-17 08:11:44 2017-02-17 08:14:44                 3
        2      .0    0% | oracle@rac12pri1 (LMON)                          control file sequential read             System I/O      WAITING 2017-02-17 08:11:51 2017-02-17 08:14:01                 1
        2      .0    0% | oracle@rac12pri1 (LMS1)                          gcs log flush sync                       Other           WAITING 2017-02-17 08:11:14 2017-02-17 08:11:44                 1
        2      .0    0% | oracle@rac12pri2 (DBW0)                          db file parallel write                   System I/O      WAITING 2017-02-17 08:12:14 2017-02-17 08:14:24                 1
        2      .0    0% | oracle@rac12pri2 (LGWR)                          log file parallel write                  System I/O      WAITING 2017-02-17 08:12:59 2017-02-17 08:14:42                 1
        1      .0    0% | oracle@rac12pri1 (CKPT)                          control file sequential read             System I/O      WAITING 2017-02-17 08:13:46 2017-02-17 08:13:46                 1
        1      .0    0% | oracle@rac12pri1 (DBW0)                          db file parallel write                   System I/O      WAITING 2017-02-17 08:10:28 2017-02-17 08:10:28                 1
        1      .0    0% | oracle@rac12pri1 (LGWR)                          Redo Transport Close                     Other           WAITING 2017-02-17 08:12:22 2017-02-17 08:12:22                 1
        1      .0    0% | oracle@rac12pri1 (LGWR)                          log file sequential read                 System I/O      WAITING 2017-02-17 08:14:16 2017-02-17 08:14:16                 1

20 rows selected.

As you can see, the top waits are in wait class “Other”: “SYNC Remote Write” and “Redo Transport MISC”. I have pulled an AWR compare-period report that shows an interesting picture. The first snapshot set represents full network performance, while the second shows the effect of the 100ms network delay. Only the relevant information is shown.

WORKLOAD REPOSITORY RAC Compare Period Report for

Database Summary
~~~~~~~~~~~~~~~~
                            Database                   Snapshot Ids     Number of Instances    Number of Hosts    Report Total (minutes)
               ----------------------------------   -----------------   -----------------   -----------------   ---------------------
Snapshot Set            Id Name      RAC Block Si      Begin      End   In Repor    Total   In Repor    Total      DB time Elapsed ti
------------   ----------- --------- --- --------   -------- --------   -------- --------   -------- --------   ---------- ----------
First (1st)      481582958 NCDBA     YES     8192        303      304          2        2          2        2         22.1       10.2
Second (2nd)     481582958 NCDBA     YES     8192        301      302          2        2          2        2        141.9       10.6
Database Instances Included In Report
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                                                                                             Elapsed              Instance Avg Activ
Set       DB Id Inst Instance  Release         Host       Startup         Begin Snap Time End Snap Time   Time (min) DB time (m Up Time (h  Sessions Platform
--- ----------- ---- --------- --------------- ---------- --------------- --------------- --------------- ---------- ---------- ---------- --------- ---------------
1st   481582958    1 NCDBA1    12.1.0.2.0      rac12pri1  17-Feb-17 04:00 17-Feb-17 08:59 17-Feb-17 09:09       10.0        2.0        5.2       0.2 Linux x86 64-bi
                   2 NCDBA2    12.1.0.2.0      rac12pri2  17-Feb-17 04:00 17-Feb-17 08:59 17-Feb-17 09:09       10.0       20.1        5.2       2.0 Linux x86 64-bi

2nd   481582958    1 NCDBA1    12.1.0.2.0      rac12pri1  17-Feb-17 04:00 17-Feb-17 08:06 17-Feb-17 08:17       10.5      134.0        4.3      12.8 Linux x86 64-bi
                   2 NCDBA2    12.1.0.2.0      rac12pri2  17-Feb-17 04:00 17-Feb-17 08:07 17-Feb-17 08:17       10.5        7.9        4.3       0.8 Linux x86 64-bi

...

Workload Comparison
~~~~~~~~~~~~~~~~~~~                      1st Per Sec          2nd Per Sec      %Diff              1st Per Txn          2nd Per Txn      %Diff
                                     ---------------      ---------------     ------          ---------------      ---------------     ------
                       DB time:                  2.2                 13.3      514.3                      0.0                  0.2    1,600.0
                      CPU time:                  1.5                  0.5      -66.0                      0.0                  0.0        0.0
           Background CPU time:                  0.5                  0.2      -56.9                      0.0                  0.0        0.0
             Redo size (bytes):            983,488.1            322,026.1      -67.3                  4,164.1              4,192.4        0.7
         Logical read (blocks):             30,773.7              8,299.3      -73.0                    130.3                108.0      -17.1
                 Block changes:              5,665.8              1,886.0      -66.7                     24.0                 24.6        2.3
        Physical read (blocks):                 10.5                 18.3       74.4                      0.0                  0.2      500.0
       Physical write (blocks):                654.9                196.8      -69.9                      2.8                  2.6       -7.6
              Read IO requests:                 10.1                 12.9       27.3                      0.0                  0.2      325.0
             Write IO requests:                503.8                110.4      -78.1                      2.1                  1.4      -32.4
                  Read IO (MB):                  0.1                  0.1       75.0                      0.0                  0.0        0.0
                 Write IO (MB):                  5.1                  1.5      -69.9                      0.0                  0.0        0.0
                  IM scan rows:                  0.0                  0.0        0.0                      0.0                  0.0        0.0
       Session Logical Read IM:
                    User calls:                515.8                188.2      -63.5                      2.2                  2.5       12.4
                  Parses (SQL):                 24.9                 17.7      -28.8                      0.1                  0.2      109.1
             Hard parses (SQL):                  0.1                  0.2       45.5                      0.0                  0.0        0.0
            SQL Work Area (MB):                  0.1                  0.5      253.8                      0.0                  0.0      253.8
                        Logons:                  1.3                  3.9      191.7                      0.0                  0.1      400.0
                Executes (SQL):              2,587.7                853.0      -67.0                     11.0                 11.1        1.4
                  Transactions:                236.2                 76.8      -67.5

...

Top Timed Events   First DB/Inst: NCDBA/NCDBA1 Snaps: 303-304 (Elapsed time: 612.266 sec  DB time: 1326.09 sec),  Second DB/Inst: NCDBA/NCDBA1 Snaps: 301-302 (Elapsed time: 638.691 sec  DB time: 8514.42 sec)
-> Events with a "-" did not make the Top list in this set of snapshots, but are displayed for comparison purposes
-> Summary for all instances

                                               1st                                                                                                2nd
------------------------------------------------------------------------------------------------   ------------------------------------------------------------------------------------------------
Event                          Wait Class           Waits      Time(s)  Avg Time(ms)    %DB time   Event                          Wait Class           Waits      Time(s)  Avg Time(ms)    %DB time
------------------------------ ------------- ------------ ------------ ------------- -----------   ------------------------------ ------------- ------------ ------------ ------------- -----------
 CPU time                                             N/A        919.5           N/A        69.3    log file sync                 Commit              49,010      7,976.7         162.8        93.7
 log file sync                 Commit             144,284        270.4           1.9        20.4    Redo Transport MISC           Other               11,392      1,192.0         104.6        14.0
 log file parallel write       System I/O         141,581        156.6           1.1        11.8    SYNC Remote Write             Other               11,392      1,179.1         103.5        13.8
 gc current block 2-way        Cluster             98,483         71.9           0.7         5.4    CPU time                                             N/A        328.7           N/A         3.9
 gc cr block 2-way             Cluster             71,469         47.9           0.7         3.6    gcs log flush sync            Other                4,480         56.0          12.5         0.7
 target log write size         Other               22,284         26.5           1.2         2.0    gc current block 2-way        Cluster             62,843         43.3           0.7         0.5
 gc current grant busy         Cluster             22,561         16.8           0.7         1.3    gc current block busy         Cluster                284         39.3         138.4         0.5
 control file sequential read  System I/O           8,497          6.0           0.7         0.5    log file switch (checkpoint i Configuration           75         37.1         495.1         0.4
 db file sequential read       User I/O             5,862          5.4           0.9         0.4    rdbms ipc reply               Other                   32         27.1         845.4         0.3
 log file sequential read      System I/O          14,895          4.8           0.3         0.4    gc cr block 2-way             Cluster             39,015         26.0           0.7         0.3
-log file switch (checkpoint i Configuration          212          3.6          17.2         0.3   -log file parallel write       System I/O          11,267         20.3           1.8         0.2
-gcs log flush sync            Other                  461          0.4           0.9         0.0   -gc current grant busy         Cluster             12,427          8.2           0.7         0.1
-gc current block busy         Cluster                126          0.3           2.7         0.0   -control file sequential read  System I/O           7,326          5.0           0.7         0.1
-                                                     N/A          N/A           N/A         N/A   -db file sequential read       User I/O             6,615          4.7           0.7         0.1
-                                                     N/A          N/A           N/A         N/A   -log file sequential read      System I/O             174          0.4           2.4         0.0
                          --------------------------------------------------------------------------------------------------------------------

This doesn’t look too good, does it? Throughput is down substantially, which is visible in Swingbench itself: in the first example I managed to achieve 255 TPS, while I’m averaging 90-95 in the sync-plus-delay example. The numbers from the AWR excerpt above confirm this, and allow you to work out where the problem is.

Here is a screenshot taken from OEM (from a different benchmark execution), showing the same situation. Make sure you are licensed appropriately to use this view.

[Screenshot: OEM Express showing the workload with latency enabled]

Summary Part II

So yes, there is a very noticeable effect of the network delay, and the AWR report shows the relevant 12c wait events indicating where time is spent.

In my opinion it’s a bit unfortunate to have the new waits in the “Other” wait class, but that’s how it is.

References

http://www.oracle.com/technetwork/database/availability/sync-2437177.pdf

New Events for Data Guard and Synchronous Redo Transport in 12c (1)

I may have said it before, but I consider presenting and teaching a great way to expand one’s knowledge: first of all, it requires me to really understand a subject; secondly, when presenting, you get lots of interesting questions that can turn into blog posts like this one.

Lately I have been asked about the impact of synchronous log shipping on a physical standby database. I was sure there was an effect to be observed, most likely depending on the network latency between the systems, but I didn’t have any evidence I could pull out of the hat to back up my thoughts. So what better than to try it out! I also read that some of the events have changed in 12c, and wanted to make them visible. My environment is based on the 2 node RAC primary/2 node RAC standby configuration I wrote about in my previous posts.

Since their initial setup I upgraded the cluster to 12.1.0.2.170117 for Clusterware and RDBMS.

I am planning to share my findings in two posts: the first one (the one you are reading) will present the baseline, the second one to follow shows how performance can be impacted by poor network performance. I guess the second one will be a bit longer …

Background

Data Guard allows the administrator to define the databases in a Data Guard configuration to be in one of three possible modes. The first one is Maximum Performance, the second Maximum Availability and finally Maximum Protection. All three modes are covered in the official documentation set and MOS. Without saying it very scientifically:

  • Maximum Performance uses asynchronous log shipping and is the default. The performance of the standby database (or rather lack thereof) should not impact the performance of the primary. At a cost: it is not guaranteed that redo is actually written to the standby when the commit command returns.
  • Maximum Protection allows you to enforce a Zero Data Loss (ZDL) strategy but also at a cost: The primary will shut down if it can’t transmit redo information to the standby. That is how you prevent data loss!
  • Maximum Availability is the middle ground and I’ll focus on that protection mode here.

Make sure to read the documentation and understand the implications of each protection mode when assessing the most suitable mode for your environment!
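
If you want to double-check the current mode outside of the broker, a quick query against v$database shows both the configured protection mode and the level currently in effect (they can differ, for instance while the standby is unreachable):

```sql
-- Configured mode vs. the protection level actually in effect
select protection_mode, protection_level from v$database;
```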

Background: New Events

Before going into details I should probably point out that there are some new Data Guard events. Hurray! More events and more meaningful names – great. Just one caveat: have a look at the output from my 12c database:

select name, wait_class from v$event_name
where name in (
'ARCH Remote Write','ASYNC Remote Write','Redo Transport Attach',
'Redo Transport Close','Redo Transport Detach','Redo Transport Open',
'Redo Transport Ping','Redo Transport Slave Shutdown','Redo Transport Slave Startup',
'Redo Writer Remote Sync Complete', 'Redo Writer Remote Sync Notify',
'Remote SYNC Ping','SYNC Remote Write');

NAME                                               WAIT_CLASS
-------------------------------------------------- ----------------------------------------
ASYNC Remote Write                                 Other
SYNC Remote Write                                  Other
ARCH Remote Write                                  Other
Redo Transport Attach                              Other
Redo Transport Detach                              Other
Redo Transport Open                                Other
Redo Transport Close                               Other
Redo Transport Ping                                Other
Remote SYNC Ping                                   Other
Redo Transport Slave Startup                       Other
Redo Transport Slave Shutdown                      Other
Redo Writer Remote Sync Notify                     Other
Redo Writer Remote Sync Complete                   Other

13 rows selected.

Compare this to the pre-12c events (taken from an 11.2.0.3 database since they are mostly gone in 12c):

select name, wait_class from v$event_name
where name in (
'ARCH wait on ATTACH','ARCH wait on SENDREQ','ARCH wait on DETACH',
'LNS wait on ATTACH', 'LNS wait on SENDREQ','LNS wait on DETACH',
'LGWR wait on LNS','LNS wait on LGWR','LGWR-LNS wait on channel');

NAME                                               WAIT_CLASS
-------------------------------------------------- --------------------
ARCH wait on ATTACH                                Network
LNS wait on ATTACH                                 Network
LNS wait on SENDREQ                                Network
LNS wait on DETACH                                 Network
ARCH wait on SENDREQ                               Network
ARCH wait on DETACH                                Network
LNS wait on LGWR                                   Network
LGWR wait on LNS                                   Network
LGWR-LNS wait on channel                           Other

9 rows selected.

For the sake of completeness here are those events that remained in 12c:

select name, wait_class from v$event_name
where name in (
'ARCH wait on ATTACH','ARCH wait on SENDREQ','ARCH wait on DETACH',
'LNS wait on ATTACH', 'LNS wait on SENDREQ','LNS wait on DETACH',
'LGWR wait on LNS','LNS wait on LGWR','LGWR-LNS wait on channel');

NAME                           WAIT_CLASS
------------------------------ ------------------------------
LNS wait on LGWR               Network
LGWR wait on LNS               Network
LGWR-LNS wait on channel       Other

Bugger. The new events are all in the “Other” wait class. My initial attempts at filtering information from v$event_histogram based on events in the “Network” class therefore had little success. I guess friends of the OEM performance pages might find that interesting, too. If you wonder where I pulled these events from: I used a white paper named “Best Practices for Synchronous Redo Transport” as a reference.
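
Filtering on the event name directly works around the wait class issue. Something along these lines (a sketch using one of the event names from the query above) shows the latency distribution for synchronous redo shipping:

```sql
-- Wait time histogram for the new 12c synchronous transport event
select event, wait_time_milli, wait_count
  from v$event_histogram
 where event = 'SYNC Remote Write'
 order by wait_time_milli;
```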

Testing: The Environment

As you read in the introduction, this environment is based on a two node RAC primary -> two node RAC standby configuration:

DGMGRL> show configuration verbose

Configuration - ractest

  Protection Mode: MaxPerformance
  Members:
  NCDBA - Primary database
    NCDBB - Physical standby database

  Properties:
    FastStartFailoverThreshold      = '30'
    OperationTimeout                = '30'
    TraceLevel                      = 'USER'
    FastStartFailoverLagLimit       = '30'
    CommunicationTimeout            = '180'
    ObserverReconnect               = '0'
    FastStartFailoverAutoReinstate  = 'TRUE'
    FastStartFailoverPmyShutdown    = 'TRUE'
    BystandersFollowRoleChange      = 'ALL'
    ObserverOverride                = 'FALSE'
    ExternalDestination1            = ''
    ExternalDestination2            = ''
    PrimaryLostWriteAction          = 'CONTINUE'

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS
DGMGRL> show database 'NCDBA';

Database - NCDBA

  Role:               PRIMARY
  Intended State:     TRANSPORT-ON
  Instance(s):
    NCDBA1
    NCDBA2

Database Status:
SUCCESS

DGMGRL> show database 'NCDBB'

Database - NCDBB

  Role:               PHYSICAL STANDBY
  Intended State:     APPLY-ON
  Transport Lag:      0 seconds (computed 0 seconds ago)
  Apply Lag:          0 seconds (computed 0 seconds ago)
  Average Apply Rate: 177.00 KByte/s
  Real Time Query:    OFF
  Instance(s):
    NCDBB1
    NCDBB2 (apply instance)

Database Status:
SUCCESS

DGMGRL> show database 'NCDBA' LogXptMode
  LogXptMode = 'async'
DGMGRL> show database 'NCDBB' LogXptMode
  LogXptMode = 'async'

The configuration is currently set to “Maximum Performance”; the databases ship redo asynchronously.
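A short digression on why the transport mode matters for this series: with ASYNC transport a committing session essentially waits only for the local online redo log write, while with SYNC it also waits for the standby’s acknowledgement, putting the network delay on the commit path. A back-of-the-envelope sketch of that difference (the function and all numbers are made up for illustration; real redo transport overlaps work and is more subtle):

```python
def commit_time_ms(local_write_ms, network_delay_ms, standby_write_ms, sync):
    """Very rough commit latency model for Data Guard redo transport.

    ASYNC: the session only waits for the local online redo log write.
    SYNC:  the session additionally waits until the redo has crossed the
           network and the standby has acknowledged it.
    """
    if not sync:
        return local_write_ms
    return local_write_ms + network_delay_ms + standby_write_ms

# With a healthy sub-millisecond network the difference is small ...
fast = commit_time_ms(1.0, 0.5, 1.0, sync=True)    # 2.5 ms
# ... but add an artificial 100 ms delay on the transport path and every
# single commit pays for it:
slow = commit_time_ms(1.0, 100.0, 1.0, sync=True)  # 102.0 ms
```

This is exactly why the artificially delayed network in the next part is expected to produce hugely inflated commit-related waits under SYNC, while ASYNC barely notices.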

All systems are patched to the January 2017 Proactive Bundle Patch:

[oracle@rac12pri1 ~]$ $ORACLE_HOME/OPatch/opatch lspatches
24917972;Database PSU 12.1.0.2.170117, Oracle JavaVM Component (JAN2017)
24732082;Database Patch Set Update : 12.1.0.2.170117 (24732082)
24828633;OCW Patch Set Update : 12.1.0.2.170117 (24828633)

OPatch succeeded.

Testing part I: The Baseline

I am using Swingbench’s Order Entry benchmark to generate a little bit of load. For the avoidance of doubt: I am not interested in a comparison between asynchronous and synchronous log shipping; I would like to see the effect caused by the difference in latency, in the form of hugely increased waits. In my first test I am running with my VM’s defaults (the NICs use virtio drivers): no changes to SDU or send/receive buffers.

$ date
Fri Feb 17 13:58:58 UTC 2017
$ ./charbench -cs //rac12pri-scan/swingbench_both \
> -uc 20 -r results_20_maxperf.xml -rt 00:10
Author  :        Dominic Giles
Version :        2.5.0.971

Results will be written to results_20_maxperf.xml.
Hit Return to Terminate Run...

Time            Users   TPM     TPS

1:59:55 PM      20      3852    255

On the apply side this translates to the following figures (the average apply rate is misleading and should be ignored; it captures all apply since MRP0 was started)

DGMGRL> show database 'NCDBB'

Database - NCDBB

  Role:               PHYSICAL STANDBY
  Intended State:     APPLY-ON
  Transport Lag:      0 seconds (computed 0 seconds ago)
  Apply Lag:          0 seconds (computed 0 seconds ago)
  Average Apply Rate: 713.00 KByte/s
  Real Time Query:    OFF
  Instance(s):
    NCDBB1
    NCDBB2 (apply instance)

Database Status:
SUCCESS

There is no transport lag, and no apply lag either. When checking for waits on background processes (using Tanel Poder’s ashtop – which requires a Diagnostic Pack license as it accesses v$active_session_history) I noticed the following:

SQL> @ashtop program,event,wait_class,session_state session_type='BACKGROUND' "to_date('17.02.2017 09:02','dd.mm.yyyy hh24:mi')" "to_date('17.02.2017 09:07','dd.mm.yyyy hh24:mi')"

    Total
  Seconds     AAS %This   PROGRAM                                          EVENT                                    WAIT_CLASS      SESSION FIRST_SEEN          LAST_SEEN           DIST_SQLEXEC_SEEN
--------- ------- ------- ------------------------------------------------ ---------------------------------------- --------------- ------- ------------------- ------------------- -----------------
       53      .2   29% | oracle@rac12pri2 (LGWR)                          log file parallel write                  System I/O      WAITING 2017-02-17 09:03:42 2017-02-17 09:06:53                 1
       23      .1   13% | oracle@rac12pri2 (LG00)                          log file parallel write                  System I/O      WAITING 2017-02-17 09:02:03 2017-02-17 09:03:26                 1
       19      .1   10% | oracle@rac12pri2 (DBW0)                                                                                   ON CPU  2017-02-17 09:02:07 2017-02-17 09:06:39                 1
       12      .0    7% | oracle@rac12pri2 (LGWR)                          target log write size                    Other           WAITING 2017-02-17 09:02:03 2017-02-17 09:03:27                 1
        9      .0    5% | oracle@rac12pri1 (LMS1)                                                                                   ON CPU  2017-02-17 09:04:37 2017-02-17 09:06:58                 1
        8      .0    4% | oracle@rac12pri2 (TT00)                                                                                   ON CPU  2017-02-17 09:02:15 2017-02-17 09:06:56                 1
        6      .0    3% | oracle@rac12pri1 (LG00)                          log file parallel write                  System I/O      WAITING 2017-02-17 09:03:08 2017-02-17 09:06:44                 1
        6      .0    3% | oracle@rac12pri2 (LGWR)                                                                                   ON CPU  2017-02-17 09:04:05 2017-02-17 09:06:41                 1
        5      .0    3% | oracle@rac12pri2 (LG00)                                                                                   ON CPU  2017-02-17 09:02:02 2017-02-17 09:03:27                 1
        5      .0    3% | oracle@rac12pri2 (LMS0)                                                                                   ON CPU  2017-02-17 09:02:07 2017-02-17 09:06:14                 1
        5      .0    3% | oracle@rac12pri2 (LMS1)                                                                                   ON CPU  2017-02-17 09:02:24 2017-02-17 09:06:28                 1
        4      .0    2% | oracle@rac12pri1 (LMS0)                                                                                   ON CPU  2017-02-17 09:04:21 2017-02-17 09:06:05                 1
        3      .0    2% | oracle@rac12pri1 (LGWR)                          LGWR all worker groups                   Other           WAITING 2017-02-17 09:03:08 2017-02-17 09:06:13                 1
        2      .0    1% | oracle@rac12pri1 (DBW0)                          db file parallel write                   System I/O      WAITING 2017-02-17 09:03:08 2017-02-17 09:03:42                 1
        2      .0    1% | oracle@rac12pri2 (CKPT)                                                                                   ON CPU  2017-02-17 09:05:10 2017-02-17 09:05:13                 1
        2      .0    1% | oracle@rac12pri2 (TT00)                          log file sequential read                 System I/O      WAITING 2017-02-17 09:03:17 2017-02-17 09:03:46                 1
        1      .0    1% | oracle@rac12pri1 (CJQ0)                                                                                   ON CPU  2017-02-17 09:04:44 2017-02-17 09:04:44                 1
        1      .0    1% | oracle@rac12pri1 (DBW0)                                                                                   ON CPU  2017-02-17 09:03:05 2017-02-17 09:03:05                 1
        1      .0    1% | oracle@rac12pri1 (LG00)                                                                                   ON CPU  2017-02-17 09:02:02 2017-02-17 09:02:02                 1
        1      .0    1% | oracle@rac12pri1 (LGWR)                          target log write size                    Other           WAITING 2017-02-17 09:05:47 2017-02-17 09:05:47                 1

The time difference between charbench and the database can be explained by time zones: my load generator is set to use UTC whereas the database is on EST (UTC-5); the result is a five hour time difference. This is a minor oversight on my part and has nothing to do with the location of the machines. In fact, they are all virtualised.
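As an aside for readers who don’t have the script handy: ashtop essentially counts ASH samples per group (each sample approximates one second of DB time) and reports each group’s share of the total. A toy version of that aggregation (my own simplification, not Tanel’s actual code):

```python
from collections import Counter

# Each ASH sample row: (program, event) with event None when ON CPU
samples = [
    ("oracle@rac12pri2 (LGWR)", "log file parallel write"),
    ("oracle@rac12pri2 (LGWR)", "log file parallel write"),
    ("oracle@rac12pri2 (DBW0)", None),
    ("oracle@rac12pri2 (LG00)", "log file parallel write"),
]

def top(samples):
    """Count samples per (program, event); each sample is ~1 s of DB time."""
    counts = Counter(samples)
    total = len(samples)
    return [(prog, ev or "ON CPU", n, round(100.0 * n / total))
            for (prog, ev), n in counts.most_common()]
```

Running `top(samples)` on the toy data above puts LGWR’s “log file parallel write” first with half of the samples, mirroring the shape of the real output.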

Summary Part I

In this part I tried to lay the foundation for the next one, where I’m using my lab environment to simulate an artificially high network latency. I have to admit again that my setup is somewhat over-simplified: using Data Guard with asynchronous log shipping and Maximum Performance mode is not something I’d like to support for anything more important than my toy databases in the lab.

References

http://www.oracle.com/technetwork/database/availability/sync-2437177.pdf

TABLE ACCESS INMEMORY FULL – but there may be more

While preparing to teach a class this week I have had the time to look into the In Memory (cost) option a bit closer. I noticed a few interesting things along the way and would like to share one of these here.

Background

One of the questions I was asking myself was:

“What happens if I scan segments that are within the IM area, and some are not?”

I was primarily thinking of joins in a DWH environment, but in order to keep the test case simple and reproducible I decided to go with a partitioned table where the current partition is assigned to the IMCS, and the rest is not. For this test I’m relying on the Swingbench SH schema. All of this runs on my laptop in a VM, so I had to be selective when it comes to indexes. I also chose NOT to partition the tables at this stage since I wanted to choose my own partitioning scheme. For reference, here is the command that created the SH schema:

[oracle@oraclelinux7 bin]$ ./shwizard -cl -create -cs //localhost/sh_pdb \
> -dba system -dbap somePassword -nopart -pkindexes -scale 1 -tc 2 -u sh -p sh
...
============================================
|           Datagenerator Run Stats        |
============================================
Connection Time                        0:00:00.005
Data Generation Time                   0:04:35.890
DDL Creation Time                      0:00:14.063
Total Run Time                         0:04:49.962
Rows Inserted per sec                       64,707
Data Generated (MB) per sec                    5.3
Actual Rows Generated                   17,848,007

...

I wanted to create a range partitioned version of the sales table. After the SH wizard completed, the data distribution was as shown:

SQL> select count(*), trunc(time_id,'yy') year from sales group by trunc(time_id,'yy') order by year;

  COUNT(*) YEAR
---------- ---------
    130653 01-JAN-95
    149319 01-JAN-96
    251974 01-JAN-97
    326632 01-JAN-98
    365547 01-JAN-99
    388318 01-JAN-00
    393919 01-JAN-01
    406703 01-JAN-02
    483605 01-JAN-03
    566384 01-JAN-04
    422289 01-JAN-05
    619858 01-JAN-06
    782244 01-JAN-07
   1099551 01-JAN-08
   1249340 01-JAN-09
   1346025 01-JAN-10
   1690302 01-JAN-11
   2028048 01-JAN-12
   2028048 01-JAN-13

19 rows selected.

So based on this information I can create a table that has data from 2013 in the IMCS and everything else will be excluded from it. For the sake of the following discussion 2013 is considered the “current” partition. Here is the partitioned sales table’s DDL with my changes to enable my partitioning scheme.

CREATE TABLE SALES_PART
  (
    PROD_ID            NUMBER NOT NULL ENABLE,
    CUST_ID            NUMBER NOT NULL ENABLE,
    TIME_ID            DATE NOT NULL ENABLE,
    CHANNEL_ID         NUMBER NOT NULL ENABLE,
    PROMO_ID           NUMBER NOT NULL ENABLE,
    QUANTITY_SOLD      NUMBER(10,2) NOT NULL ENABLE,
    SELLER             NUMBER(6,0) NOT NULL ENABLE,
    FULFILLMENT_CENTER NUMBER(6,0) NOT NULL ENABLE,
    COURIER_ORG        NUMBER(6,0) NOT NULL ENABLE,
    TAX_COUNTRY        VARCHAR2(3) NOT NULL ENABLE,
    TAX_REGION         VARCHAR2(3),
    AMOUNT_SOLD        NUMBER(10,2) NOT NULL ENABLE
  )
  partition BY range (time_id)
  (
    partition p_old VALUES less than       (DATE '2001-01-01'),
    partition p_2000_2010 VALUES less than (DATE '2011-01-01'),
    partition p_2011 VALUES less than      (DATE '2012-01-01'),
    partition p_2012 VALUES less than      (DATE '2013-01-01'),
    partition p_2013 VALUES less than      (DATE '2014-01-01') inmemory
  )
  TABLESPACE SH;
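The range boundaries follow strict less-than semantics, which also makes the partition names slightly deceptive: p_old receives everything below 2001, including the year 2000 rows. A quick sketch of the routing logic (my own illustration based on the bounds in the DDL above):

```python
from datetime import date

# Partition bounds from the DDL above: a row lands in the first partition
# whose upper bound its TIME_ID is strictly less than.
PARTITIONS = [
    ("P_OLD",       date(2001, 1, 1)),
    ("P_2000_2010", date(2011, 1, 1)),
    ("P_2011",      date(2012, 1, 1)),
    ("P_2012",      date(2013, 1, 1)),
    ("P_2013",      date(2014, 1, 1)),
]

def target_partition(time_id):
    """Return the partition name a given TIME_ID maps to."""
    for name, upper_bound in PARTITIONS:
        if time_id < upper_bound:
            return name
    # No MAXVALUE partition exists, so anything from 2014 onwards
    # would fail the insert (ORA-14400 territory)
    raise ValueError("no partition defined for %s" % time_id)
```

For example, `target_partition(date(2000, 6, 30))` returns `"P_OLD"`, and anything in 2013 routes to the in-memory `"P_2013"` partition.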

SQL> alter session enable parallel dml;

Session altered.

SQL> insert /*+ parallel append */ into sales_part select /*+ parallel */ * from sales
  2  /

14728759 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(user,'SALES_PART')

PL/SQL procedure successfully completed.

SQL> select partition_name, num_rows, inmemory
  2  from user_tab_partitions
  3  where table_name = 'SALES_PART'
  4  order by partition_position;

PARTITION_NAME                   NUM_ROWS INMEMORY
------------------------------ ---------- --------
P_OLD                             1612443 DISABLED
P_2000_2010                       7369918 DISABLED
P_2011                            1690302 DISABLED
P_2012                            2028048 DISABLED
P_2013                            2028048 ENABLED

So that should give me what I needed. The IMCS was now populated with the segment as well:

SQL> select segment_name, partition_name, bytes_not_populated, populate_status from v$im_user_segments;

SEGMENT_NAME         PARTITION_NAME                 BYTES_NOT_POPULATED POPULATE_
-------------------- ------------------------------ ------------------- ---------
SALES_PART           P_2013                                           0 COMPLETED

Test

With the setup work complete I am ready to test. First of all, a simple SQL trace should show me what is happening. Note that I am specifically targeting the “current” partition.

SQL> sho user
USER is "SH"
SQL> alter session set events 'sql_trace level 8';

Session altered.

SQL> select /* current_partition */ count(*) from sales_part
  2  where time_id > DATE '2013-01-02'
  3  and promo_id = 316;

  COUNT(*)
----------
      3947

SQL> alter session set events 'sql_trace off';

Session altered.

SQL> select value from v$diag_info where name like 'Def%';

VALUE
--------------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/cdb/CDB/trace/CDB_ora_14034.trc

The tkprof’d trace shows this result:

********************************************************************************

select /* current_partition */ count(*) from sales_part
where time_id > DATE '2013-01-02'
and promo_id = 316

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          2          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          0          3          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.01       0.01          0          5          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 117  
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=3 pr=0 pw=0 time=8441 us)
      3947       3947       3947   PARTITION RANGE SINGLE PARTITION: 5 5 (cr=3 pr=0 pw=0 time=7583 us cost=214 size=48252 card=4021)
      3947       3947       3947    TABLE ACCESS INMEMORY FULL SALES_PART PARTITION: 5 5 (cr=3 pr=0 pw=0 time=6972 us cost=214 size=48252 card=4021)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       2        0.00          0.00
  SQL*Net message from client                     2        6.79          6.79
********************************************************************************

As you can see, my partitioning strategy paid off: only the 5th partition is accessed (that’s p_2013, the “current” partition based on the data). This partition is the one entirely found in the IMCS.

A result like the one above is what I’d expect, and the access path “TABLE ACCESS INMEMORY FULL” does it justice. But now to the actual reason for this blog post: mixing scans against segments in memory and on disk. Logging on again, I ran the same query but without the restriction to 2013.

SQL> alter session set events 'sql_trace level 8';

Session altered.

SQL> select /* every_partition */ count(*) from sales_part
  2  where promo_id = 316;

  COUNT(*)
----------
     28993

SQL> alter session set events 'sql_trace off';

Session altered.

SQL> select value from v$diag_info where name like 'Def%';

VALUE
----------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/cdb/CDB/trace/CDB_ora_14437.trc

The tkprof’d result is shown here:

select /* every_partition */ count(*) from sales_part
where promo_id = 316

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      3.15       3.84      94878      94901          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      3.15       3.84      94878      94901          0           1

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 117  
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=94901 pr=94878 pw=0 time=3848392 us)
     28993      28993      28993   PARTITION RANGE ALL PARTITION: 1 5 (cr=94901 pr=94878 pw=0 time=4503583 us cost=26217 size=117128 card=29282)
     28993      28993      28993    TABLE ACCESS INMEMORY FULL SALES_PART PARTITION: 1 5 (cr=94901 pr=94878 pw=0 time=4254867 us cost=26217 size=117128 card=29282)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       2        0.00          0.00
  Disk file operations I/O                        4        0.00          0.00
  db file sequential read                         6        0.00          0.01
  direct path read                               42        0.01          0.18
  SQL*Net message from client                     2        7.63          7.63
********************************************************************************

Again the access path is clearly indicated as “TABLE ACCESS INMEMORY FULL SALES_PART”. This time it references all 5 partitions (which is expected since I didn’t have a date in my predicate list). And I can see direct path reads plus some other I/O related information! Direct Path Reads are quite likely going to be Smart Scans on Exadata by the way…

Of course “TABLE ACCESS INMEMORY FULL” is correct (one partition is accessed that way), but the regular execution plan, as displayed in SQL*Plus for example, does not show the disk-based I/O against the other segments.
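The split can be reasoned about quite mechanically: each fully populated partition is scanned from the column store, the others via direct path reads. A small sketch of that bookkeeping (simplified; a partially populated partition would contribute to both sides), which predicts the “table scans (IM)” and “table scans (direct read)” counters shown in the session statistics further down:

```python
def scan_counters(partition_populated):
    """Predict the IM vs direct-read scan split for a full scan of a
    partitioned table, given per-partition IMCS populate status."""
    im = sum(1 for populated in partition_populated if populated)
    direct = len(partition_populated) - im
    return {"table scans (IM)": im, "table scans (direct read)": direct}

# P_OLD .. P_2012 are not populated, P_2013 is:
counters = scan_counters([False, False, False, False, True])
```

For the five-partition sales_part table this yields 1 in-memory scan and 4 direct path scans.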

Is there any other instrumentation I could use?

The tkprof output shows that data retrieval is possible for segments that are not (entirely) found in the IMCS. Where else could I learn about this fact? Session statistics (V$STATNAME, V$SESSTAT, etc.) are another useful source of information. I am a great fan of session snapper (who isn’t?). Snapper can be used to display the change in the session counters while the session you are monitoring is running. Another option is Adrian Billington’s mystats script, which can help you capture the changes in session statistics during the execution of a SQL command. I executed the same query again, sandwiched into calls to mystats. Note that the statistics shown next do not match those of the execution of the query above. I also tried to limit the output to the relevant statistics; the output below is sorted by statistic name.
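Both tools rely on the same simple technique: snapshot the session counters before the statement, snapshot them again afterwards, and report the non-zero deltas. A minimal sketch of that approach (statistic names and values are made up for illustration):

```python
def snapshot_diff(before, after):
    """Return the per-statistic delta between two v$mystat-style snapshots.

    Counters that did not move are dropped, mirroring what mystats reports.
    """
    deltas = {}
    for name, value in after.items():
        delta = value - before.get(name, 0)
        if delta != 0:
            deltas[name] = delta
    return deltas

before = {"physical reads": 1000, "IM scan rows": 0}
after  = {"physical reads": 95872, "IM scan rows": 2028048}
```

Calling `snapshot_diff(before, after)` leaves only the counters the statement actually incremented.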


------------------------------------------------------------------------------------------
2. Statistics Report
------------------------------------------------------------------------------------------
Type    Statistic Name                                                               Value
------  ----------------------------------------------------------------  ----------------

STAT    IM scan CUs columns accessed                                                     4
STAT    IM scan CUs columns theoretical max                                             48
STAT    IM scan CUs memcompress for query low                                            4
STAT    IM scan CUs predicates applied                                                   4
STAT    IM scan CUs predicates received                                                  4
STAT    IM scan CUs split pieces                                                         4
STAT    IM scan bytes in-memory                                                 47,683,308
STAT    IM scan bytes uncompressed                                              99,118,124
STAT    IM scan rows                                                             2,028,048
STAT    IM scan rows projected                                                       3,963
STAT    IM scan rows valid                                                       2,028,048
STAT    IM scan segments minmax eligible                                                 4

STAT    consistent gets                                                             95,093
STAT    consistent gets direct                                                      94,872
STAT    consistent gets examination                                                      4
STAT    consistent gets examination (fastpath)                                           4
STAT    consistent gets from cache                                                     221
STAT    consistent gets pin                                                            217
STAT    consistent gets pin (fastpath)                                                 217

STAT    logical read bytes from cache                                            9,666,560
STAT    no work - consistent read gets                                              94,871

STAT    physical read IO requests                                                      749
STAT    physical read bytes                                                    777,191,424
STAT    physical read total IO requests                                                749
STAT    physical read total bytes                                              777,191,424
STAT    physical read total multi block requests                                       749
STAT    physical reads                                                              94,872
STAT    physical reads direct                                                       94,872

STAT    session logical reads                                                      111,201
STAT    session logical reads - IM                                                  15,149

STAT    table scan blocks gotten                                                    94,872
STAT    table scan disk non-IMC rows gotten                                     12,700,711
STAT    table scan rows gotten                                                  14,728,759
STAT    table scans (IM)                                                                 1
STAT    table scans (direct read)                                                        4
STAT    table scans (long tables)                                                        5

STAT    temp space allocated (bytes)                                             1,048,576

STAT    workarea executions - optimal                                                    1
STAT    workarea memory allocated                                                        5
------------------------------------------------------------------------------------------
3. About
------------------------------------------------------------------------------------------
- MyStats v2.01 by Adrian Billington (http://www.oracle-developer.net)
- Based on the SNAP_MY_STATS utility by Jonathan Lewis
==========================================================================================
End of report
==========================================================================================

The highlights are:

  • You can see the compression ratio of the data in the IMCS (IM scan bytes in-memory and IM scan bytes uncompressed)
  • All rows in the partition were valid (the number of IM Scan rows equals the number of rows in the partition)
  • One segment was scanned using IM, and 4 were scanned as direct (path) reads. A total of 5 segments were scanned
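
These counters are also internally consistent, which makes for a nice sanity check (all numbers taken from the report above):

```python
im_rows    = 2_028_048    # IM scan rows: the populated P_2013 partition
disk_rows  = 12_700_711   # table scan disk non-IMC rows gotten
total_rows = 14_728_759   # table scan rows gotten = rows inserted earlier

bytes_in_memory    = 47_683_308  # IM scan bytes in-memory
bytes_uncompressed = 99_118_124  # IM scan bytes uncompressed

# The in-memory and on-disk row counts add up exactly to the table total
rows_check = (im_rows + disk_rows == total_rows)

# MEMCOMPRESS FOR QUERY LOW achieved roughly a 2.08x compression ratio here
compression_ratio = bytes_uncompressed / bytes_in_memory
```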

Hope this helps!

Little things worth knowing: automatic generation of extended statistics in 12c Part II

In my last post I demonstrated an example how to use a PL/SQL package to monitor a workload in order to identify correlated columns. Such columns are good candidates for the creation of extended statistics since the optimiser does not assume that information stored in different columns may be related. Before starting my tests I invoked DBMS_STATS.SEED_COL_USAGE, which in turn increased Oracle’s attention level trying to find correlated columns. Eventually these have been identified (as shown in the output of DBMS_STATS.REPORT_COL_USAGE) and a subsequent call to DBMS_STATS.GATHER_TABLE_STATS caused extended statistics to be created, including histograms. This is one way you can get extended statistics automatically, but it requires you to enable monitoring of a workload by invoking a PL/SQL API. In other words, it is a conscious effort.

More Ways to Extended Statistics

The other way of ending up with extended statistics is more subtle as it does not require user intervention. As described in the documentation (Database SQL Tuning Guide, chapter 13, “Managing Optimizer Statistics: Advanced Topics”) there is another way to gather extended statistics automatically. It is based on statistics feedback and SQL Plan Directives, both new to 12c (actually, statistics feedback is not quite so new; it seems to be an evolution of cardinality feedback).
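A loose mental model of the mechanism (purely conceptual, not Oracle’s actual algorithm, and the threshold is made up): after execution the runtime cardinalities are compared with the optimiser’s estimates, and if they diverge significantly the cursor is flagged as reoptimizable so the next hard parse can use the observed values.

```python
def is_reoptimizable(estimated_rows, actual_rows, threshold=8.0):
    """Flag a cursor for reoptimisation when the estimate and the actual
    row count diverge by more than `threshold` in either direction."""
    if estimated_rows == 0 or actual_rows == 0:
        return estimated_rows != actual_rows
    ratio = max(estimated_rows, actual_rows) / min(estimated_rows, actual_rows)
    return ratio > threshold

# The demo below shows E-Rows = 1 vs A-Rows = 932 on the first execution ...
first_run = is_reoptimizable(1, 932)
# ... and a perfect match once statistics feedback has kicked in:
second_run = is_reoptimizable(932, 932)
```

With a 932x misestimate the first execution trips the flag; once the estimate matches reality the cursor settles down, just as v$sql shows in the demo.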

Demo

To start with a clean sheet I dropped the table I had used previously, which got rid of all the extended stats and SQL Plan Directives (SPD) from the previous example. To be really thorough I also flushed the shared pool.

SQL> drop table martin.customers purge;

Table dropped.

SQL> select * from dba_stat_extensions where owner = user;

no rows selected

SQL> select count(*) from DBA_SQL_PLAN_DIR_OBJECTS where owner = user;

  COUNT(*)
----------
         0

SQL> alter system flush shared_pool;

System altered.

BTW I have switched environments to an Oracle Restart 12.1.0.2 database with the JUL 2015 PSU applied. The second patch you see here is the OJVM patch.

SQL> select patch_id, version, action from dba_registry_sqlpatch;

  PATCH_ID VERSION              ACTION
---------- -------------------- ---------------
  20831110 12.1.0.2             APPLY
  21068507 12.1.0.2             APPLY

Back to creating the table… I am using the same technique as before, but this time without the call to DBMS_STATS.SEED_COL_USAGE:

SQL> create table martin.customers as select * from sh.customers;

Table created.

SQL> select num_rows, last_analyzed from user_tables where table_name = 'CUSTOMERS';

  NUM_ROWS LAST_ANALYZED
---------- -------------------
     55500 15.10.2015 16:57:42

I love the 12c feature where stats are gathered during a CTAS operation …

And now a few queries to tickle the optimiser. Apologies for the wide output, but that makes it so much easier to use RLWRAP and command line history. Credit again to the blog post by Maria Colgan for the query examples.

SQL> select /*+ gather_plan_statistics */ count(*) from customers where cust_city = 'Los Angeles' and cust_state_province = 'CA' and country_id = 52790;

  COUNT(*)
----------
       932

SQL>
SQL> select * from table(dbms_xplan.display_cursor(format => 'allstats last'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------
SQL_ID  34zmr3acgz06g, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from customers where
cust_city = 'Los Angeles' and cust_state_province = 'CA' and country_id
= 52790

Plan hash value: 296924608

---------------------------------------------------------------------------------------------------
| Id  | Operation          | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |      1 |        |      1 |00:00:00.03 |    1521 |   1516 |
|   1 |  SORT AGGREGATE    |           |      1 |      1 |      1 |00:00:00.03 |    1521 |   1516 |
|*  2 |   TABLE ACCESS FULL| CUSTOMERS |      1 |      1 |    932 |00:00:00.04 |    1521 |   1516 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(("CUST_CITY"='Los Angeles' AND "CUST_STATE_PROVINCE"='CA' AND
              "COUNTRY_ID"=52790))


22 rows selected.

There is nothing too new here: the optimiser’s cardinality estimate is not great. Following the example from the SQL Tuning Guide, chapter 13 (by the way, country_id is a number, not a varchar2 field), we can now check if anything fancy is going to happen next. And sure enough there is:

SQL> select sql_id, child_number, is_obsolete, is_reoptimizable, is_resolved_adaptive_plan from v$sql where sql_id = '34zmr3acgz06g';

SQL_ID        CHILD_NUMBER I I I
------------- ------------ - - -
34zmr3acgz06g            0 N Y

The second flag, is_reoptimizable, is interesting. It indicates the optimiser’s intention to apply information found at run time to the next execution of the cursor. Let’s run the original query again.

SQL> select /*+ gather_plan_statistics */ count(*) from customers where cust_city = 'Los Angeles' and cust_state_province = 'CA' and country_id = 52790;

  COUNT(*)
----------
       932

So what did that do? Let’s have a look at the diagnostics:


SQL> select sql_id, child_number, is_obsolete, is_reoptimizable, is_resolved_adaptive_plan from v$sql where sql_id = '34zmr3acgz06g';

SQL_ID        CHILD_NUMBER I I I
------------- ------------ - - -
34zmr3acgz06g            0 N Y
34zmr3acgz06g            1 N N

SQL> select * from table(dbms_xplan.display_cursor(sql_id=>'34zmr3acgz06g',cursor_child_no=>1,format => 'allstats last'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  34zmr3acgz06g, child number 1
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from customers where
cust_city = 'Los Angeles' and cust_state_province = 'CA' and country_id
= 52790

Plan hash value: 296924608

------------------------------------------------------------------------------------------
| Id  | Operation          | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |      1 |        |      1 |00:00:00.01 |    1521 |
|   1 |  SORT AGGREGATE    |           |      1 |      1 |      1 |00:00:00.01 |    1521 |
|*  2 |   TABLE ACCESS FULL| CUSTOMERS |      1 |    932 |    932 |00:00:00.01 |    1521 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(("CUST_CITY"='Los Angeles' AND "CUST_STATE_PROVINCE"='CA' AND
              "COUNTRY_ID"=52790))

Note
-----
   - statistics feedback used for this statement


26 rows selected.

As you can see another child cursor has been created. Why?

SQL> select xmltype(reason) from v$sql_shared_cursor where dbms_lob.getlength(reason) > 1 and sql_id = '34zmr3acgz06g';

XMLTYPE(REASON)
-----------------------------------------------------------------------------------------------------------------------------
<ChildNode>
  <ChildNumber>0</ChildNumber>
  <ID>49</ID>
  <reason>Auto Reoptimization Mismatch(1)</reason>
  <size>3x4</size>
  <kxscflg>32</kxscflg>
  <kxscfl4>4194560</kxscfl4>
  <dnum_kksfcxe>262144</dnum_kksfcxe>
</ChildNode>

Ah, the new child cursor was created because of a reoptimisation mismatch. Checking v$sql again you can see that the optimiser is now happy with the cardinality estimate: IS_REOPTIMIZABLE is 'N' for child cursor 1, so no further reoptimisation is anticipated for this statement. Note that depending on the query's complexity many more child cursors can be created as part of the reoptimisation. Also note that for child cursor 1 the E-Rows and A-Rows columns are a perfect match. Our work here is done. But wait, what about that column usage?


SQL> select dbms_stats.report_col_usage(user, 'CUSTOMERS') from dual;

DBMS_STATS.REPORT_COL_USAGE(USER,'CUSTOMERS')
--------------------------------------------------------------------------------
LEGEND:
.......

EQ         : Used in single table EQuality predicate
RANGE      : Used in single table RANGE predicate
LIKE       : Used in single table LIKE predicate
NULL       : Used in single table is (not) NULL predicate
EQ_JOIN    : Used in EQuality JOIN predicate
NONEQ_JOIN : Used in NON EQuality JOIN predicate
FILTER     : Used in single table FILTER predicate
JOIN       : Used in JOIN predicate
GROUP_BY   : Used in GROUP BY expression
...............................................................................

###############################################################################

COLUMN USAGE REPORT FOR MARTIN.CUSTOMERS
........................................

1. COUNTRY_ID                          : EQ
2. CUST_CITY                           : EQ
3. CUST_STATE_PROVINCE                 : EQ
###############################################################################

SQL> select * from dba_stat_extensions where owner = user;

no rows selected

Nothing here except some equality predicates! This is the big difference from part 1 of this article, where the filter on the three columns was detected thanks to the call to DBMS_STATS.SEED_COL_USAGE.

It took me a while to connect the dots and understand what needed to be done next. In the meantime, hidden from view and in the background, Oracle had created a SQL Plan Directive for that table, which I failed to notice for a while. SQL Plan Directives are not persisted immediately; the documentation states they are written to SYSAUX every 15 minutes. I must have thought about this for more than 15 minutes, as you will see shortly. If you are a bit more impatient, you can force the write of the SPD to disk using DBMS_SPD.FLUSH_SQL_PLAN_DIRECTIVE.
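For the impatient, a minimal sketch of forcing that flush yourself (this assumes you are connected as a user with the privileges the DBMS_SPD package requires, such as the DBA role):

```sql
-- Persist any SQL Plan Directives held in the SGA to SYSAUX right away,
-- instead of waiting for the automatic flush (roughly every 15 minutes).
exec DBMS_SPD.FLUSH_SQL_PLAN_DIRECTIVE
```

After the call the directives should show up in DBA_SQL_PLAN_DIRECTIVES, as the next query demonstrates.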

To verify if there were any SPDs I used a query taken more or less literally from the documentation, and yes, there are:

SQL> SELECT TO_CHAR(d.DIRECTIVE_ID) dir_id, o.OBJECT_NAME,
  2         o.SUBOBJECT_NAME col_name, o.OBJECT_TYPE, d.TYPE, d.STATE, d.REASON
  3  FROM   DBA_SQL_PLAN_DIRECTIVES d, DBA_SQL_PLAN_DIR_OBJECTS o
  4  WHERE  d.DIRECTIVE_ID=o.DIRECTIVE_ID
  5  AND    o.OWNER = user
  6  ORDER BY 1,2,3,4,5;

DIR_ID                         OBJECT_NAME          COL_NAME             OBJECT TYPE             STATE    REASON
------------------------------ -------------------- -------------------- ------ ---------------- -------- ------------------------------------
15273172249382976180           CUSTOMERS            COUNTRY_ID           COLUMN DYNAMIC_SAMPLING USABLE   SINGLE TABLE CARDINALITY MISESTIMATE
15273172249382976180           CUSTOMERS            CUST_CITY            COLUMN DYNAMIC_SAMPLING USABLE   SINGLE TABLE CARDINALITY MISESTIMATE
15273172249382976180           CUSTOMERS            CUST_STATE_PROVINCE  COLUMN DYNAMIC_SAMPLING USABLE   SINGLE TABLE CARDINALITY MISESTIMATE
15273172249382976180           CUSTOMERS                                 TABLE  DYNAMIC_SAMPLING USABLE   SINGLE TABLE CARDINALITY MISESTIMATE

Once the SPD is visible in the dictionary, you can also see it in action when you run a statement with the same predicates but a different SQL_ID, as in this example.

SQL> select /*+ gather_plan_statistics new_parse_please */ count(*) from customers where cust_city = 'Los Angeles' and cust_state_province = 'CA' and country_id = 52790;

  COUNT(*)
----------
       932

SQL> select * from table(dbms_xplan.display_cursor(format => 'allstats last'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  85qvryzgzj57q, child number 0
-------------------------------------
select /*+ gather_plan_statistics new_parse_please */ count(*) from
customers where cust_city = 'Los Angeles' and cust_state_province =
'CA' and country_id = 52790

Plan hash value: 296924608

------------------------------------------------------------------------------------------
| Id  | Operation          | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |      1 |        |      1 |00:00:00.01 |    1521 |
|   1 |  SORT AGGREGATE    |           |      1 |      1 |      1 |00:00:00.01 |    1521 |
|*  2 |   TABLE ACCESS FULL| CUSTOMERS |      1 |    829 |    932 |00:00:00.01 |    1521 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(("CUST_CITY"='Los Angeles' AND "CUST_STATE_PROVINCE"='CA' AND
              "COUNTRY_ID"=52790))

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - 1 Sql Plan Directive used for this statement


27 rows selected.

The SQL Plan Directive persists even if you flush the cursor with SQL_ID 34zmr3acgz06g from the cursor cache or supply a different set of predicates. I used Kerry Osborne's flush_sql.sql script for this.

SQL> @flush_sql 
Enter value for sql_id: 34zmr3acgz06g
old  14:   where sql_id like '&sql_id';
new  14:   where sql_id like '34zmr3acgz06g';

PL/SQL procedure successfully completed.

SQL> select sql_id, child_number, is_obsolete, is_reoptimizable, is_resolved_adaptive_plan from v$sql where sql_id = '34zmr3acgz06g';

no rows selected

SQL> select /*+ gather_plan_statistics */ count(*) from customers where cust_city = 'Los Angeles' and cust_state_province = 'CA' and country_id = 52790;

  COUNT(*)
----------
       932

SQL> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  34zmr3acgz06g, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from customers where
cust_city = 'Los Angeles' and cust_state_province = 'CA' and country_id
= 52790

Plan hash value: 296924608

------------------------------------------------------------------------------------------
| Id  | Operation          | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |      1 |        |      1 |00:00:00.01 |    1521 |
|   1 |  SORT AGGREGATE    |           |      1 |      1 |      1 |00:00:00.01 |    1521 |
|*  2 |   TABLE ACCESS FULL| CUSTOMERS |      1 |    953 |    932 |00:00:00.01 |    1521 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(("CUST_CITY"='Los Angeles' AND "CUST_STATE_PROVINCE"='CA' AND
              "COUNTRY_ID"=52790))

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - 1 Sql Plan Directive used for this statement


27 rows selected.


SQL> select /*+ gather_plan_statistics */ count(*) from customers where cust_city = 'Palmdale' and cust_state_province = 'FL' and country_id = 52790;

  COUNT(*)
----------
       250

SQL> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  dbkfpchpfwap3, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from customers where
cust_city = 'Palmdale' and cust_state_province = 'FL' and country_id =
52790

Plan hash value: 296924608

------------------------------------------------------------------------------------------
| Id  | Operation          | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |      1 |        |      1 |00:00:00.01 |    1521 |
|   1 |  SORT AGGREGATE    |           |      1 |      1 |      1 |00:00:00.01 |    1521 |
|*  2 |   TABLE ACCESS FULL| CUSTOMERS |      1 |    287 |    250 |00:00:00.02 |    1521 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(("CUST_CITY"='Palmdale' AND "CUST_STATE_PROVINCE"='FL' AND
              "COUNTRY_ID"=52790))

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - 1 Sql Plan Directive used for this statement


27 rows selected.

And NOW you get the update to REPORT_COL_USAGE:

SQL> select dbms_stats.report_col_usage(user, 'CUSTOMERS') from dual;

DBMS_STATS.REPORT_COL_USAGE(USER,'CUSTOMERS')
--------------------------------------------------------------------------------
LEGEND:
.......

EQ         : Used in single table EQuality predicate
RANGE      : Used in single table RANGE predicate
LIKE       : Used in single table LIKE predicate
NULL       : Used in single table is (not) NULL predicate
EQ_JOIN    : Used in EQuality JOIN predicate
NONEQ_JOIN : Used in NON EQuality JOIN predicate
FILTER     : Used in single table FILTER predicate
JOIN       : Used in JOIN predicate
GROUP_BY   : Used in GROUP BY expression
...............................................................................

###############################################################################

COLUMN USAGE REPORT FOR MARTIN.CUSTOMERS
........................................

1. COUNTRY_ID                          : EQ
2. CUST_CITY                           : EQ
3. CUST_STATE_PROVINCE                 : EQ
4. (CUST_CITY, CUST_STATE_PROVINCE,
    COUNTRY_ID)                        : FILTER
###############################################################################

From here on it's the same as in part 1 of this article. You gather stats, either manually like me or automatically like Oracle would, and the end result is extended statistics on the "filter" column group shown in item 4 of the report above.


SQL> exec dbms_stats.gather_table_stats(user, 'CUSTOMERS')

PL/SQL procedure successfully completed.

SQL> select table_name, extension_name, extension from dba_stat_extensions where owner = user;

TABLE_NAME                     EXTENSION_NAME                           EXTENSION
------------------------------ ---------------------------------------- --------------------------------------------------------------------------------
CUSTOMERS                      SYS_STSMZ$C3AIHLPBROI#SKA58H_N           ("CUST_CITY","CUST_STATE_PROVINCE","COUNTRY_ID")

Voila! Extended statistics.

There are many more interesting implications to this whole concept, which is something I'll write about in another post. It's incredibly interesting; I can't believe I'm writing optimiser posts …

PS: thanks to @Mautro for casting an eye over this article!

Little things worth knowing: automatic generation of extended statistics in 12c

If you are migrating to Oracle 12c I hope you will find this post useful. I came across this feature when researching what's new with Oracle 12c (and yes, I still find lots of new features I hadn't noticed before). This one is a bit hidden away in section 2.2.4.3, "Automatic Column Group Detection", of the 12c New Features Guide. And it's a lot more complex than I first thought! In this first post I'll try to show the generation of extended statistics in 12c. I am planning another post to explain how the rest of the adaptive optimisations that are new with 12c fit into the picture.

What is the motivation?

In Oracle versions before 12c you needed to be on the lookout for candidates for correlated columns yourself. Extended statistics on groups of columns allow the optimiser to come up with better cardinality estimates when columns in a table are correlated. Instead of trying to explain the concept in depth myself I'd like to link to a post written by Maria Colgan. I strongly recommend you have a look at it if you haven't used extended statistics a lot.

When putting my post together I referred to the same SH.CUSTOMERS table as Maria. The queries I am using are based on another excellent post by Maria Colgan that builds on the foundations of the one I just referenced. In fact I really needed a data set with correlated columns, but I couldn't come up with an easy-to-follow example until I found that blog entry. I suggest you have a look at that post first, as it explains a nifty tool for helping you find candidates for extended statistics in 11.2. Oracle 12c takes this concept a step further, as you will see.

My test case

My test environment is 12.1.0.2.3 on Exadata (but that shouldn’t matter), and this is how it goes. First I create a table in my schema which is an exact replica of the table in the SH schema.

SQL> create table martin.customers as select * from sh.customers;

Table created.

The next step is to tell Oracle to monitor column usage. I have to accelerate this a bit; Oracle does this over the course of the day anyway.

SQL> exec dbms_stats.seed_col_usage(null,null,300)

PL/SQL procedure successfully completed.

By calling seed_col_usage() without a SQLSET and OWNER, Oracle records column usage for all statements executed in the next 300 seconds. The PL/SQL API documentation reads:

This procedure also records group of columns. Extensions for the recorded group of columns can be created using the CREATE_EXTENDED_STATS Function procedure. If sqlset_name and owner_name are NULL, it records the column (group) usage information for the statements executed in the system in next time_limit seconds.

That sounds like what is needed. To start with a level playing field I gather statistics again using dbms_stats.gather_table_stats(user, ‘customers’).

Let there be queries

The next step is to run a few queries. Cardinality estimates are not correct: the optimiser does not "know" that L.A. can only be in California in this example. Pay attention to the number of rows returned and the cardinality estimates:

SQL> select count(*) from customers
  2  where cust_city = 'Los Angeles'
  3  and cust_state_province = 'CA'
  4  and country_id = 52790;

  COUNT(*)
----------
       932

Elapsed: 00:00:00.03
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
SQL_ID  ap71092cqnj1y, child number 0
-------------------------------------
select count(*) from customers where cust_city = 'Los Angeles' and
cust_state_province = 'CA' and country_id = 52790

Plan hash value: 296924608

----------------------------------------------------------------------------------------
| Id  | Operation                  | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |           |       |       |   423 (100)|          |
|   1 |  SORT AGGREGATE            |           |     1 |    26 |            |          |
|*  2 |   TABLE ACCESS STORAGE FULL| CUSTOMERS |     1 |    26 |   423   (1)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - storage(("CUST_CITY"='Los Angeles' AND "CUST_STATE_PROVINCE"='CA' AND
              "COUNTRY_ID"=52790))
       filter(("CUST_CITY"='Los Angeles' AND "CUST_STATE_PROVINCE"='CA' AND
              "COUNTRY_ID"=52790))

Note
-----
   - automatic DOP: Computed Degree of Parallelism is 1 because of parallel threshold

27 rows selected.

Following advice I have been given by people who know a lot more about SQL tuning than I will ever comprehend, I gather execution statistics and then display actual and expected rows:

SQL> select /*+ gather_plan_statistics */ count(*) from customers
  2  where cust_city = 'Los Angeles'
  3  and cust_state_province = 'CA'
  4  and country_id = 52790;

  COUNT(*)
----------
       932

SQL> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
SQL_ID  0qmjk7qm3zwkw, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from customers  where
cust_city = 'Los Angeles'  and cust_state_province = 'CA'  and
country_id = 52790

Plan hash value: 296924608

--------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |           |      1 |        |      1 |00:00:00.01 |    1521 |
|   1 |  SORT AGGREGATE            |           |      1 |      1 |      1 |00:00:00.01 |    1521 |
|*  2 |   TABLE ACCESS STORAGE FULL| CUSTOMERS |      1 |      1 |    932 |00:00:00.01 |    1521 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - storage(("CUST_CITY"='Los Angeles' AND "CUST_STATE_PROVINCE"='CA' AND
              "COUNTRY_ID"=52790))
       filter(("CUST_CITY"='Los Angeles' AND "CUST_STATE_PROVINCE"='CA' AND
              "COUNTRY_ID"=52790))

Note
-----
   - automatic DOP: Computed Degree of Parallelism is 1 because of parallel threshold

The number of actual rows in plan line 2 is quite different from what the optimiser expected it to be.

I also executed the next example from Maria’s blog post:

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------

SQL_ID  5h284vjm2xtp9, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ country_id, cust_state_province,
count(*) from customers group by country_id, cust_state_province

Plan hash value: 1577413243

-----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |           |      1 |        |    145 |00:00:00.02 |    1521 |       |       |          |
|   1 |  HASH GROUP BY             |           |      1 |   1949 |    145 |00:00:00.02 |    1521 |  1015K|  1015K| 1371K (0)|
|   2 |   TABLE ACCESS STORAGE FULL| CUSTOMERS |      1 |  55500 |  55500 |00:00:00.02 |    1521 |  1025K|  1025K|          |
-----------------------------------------------------------------------------------------------------------------------------

Note
-----
   - automatic DOP: Computed Degree of Parallelism is 1 because of parallel threshold

Again the cardinality estimates differ in plan line 1: instead of the 1,949 rows the optimiser expected, only 145 are returned. This is a simple example, but quite often incorrect cardinality estimates lead to sub-optimal plan generation (although 12c has another new feature that tries to mitigate the effects of these incorrect estimates as well).

Column Groups

On the other hand, the column usage monitor picked up information based on the "workload". You can see this in sys.col_group_usage$:

SQL> select u.*, o.object_name, object_type
  2  from sys.col_group_usage$ u, dba_objects o
  3  where u.obj# = o.data_object_id
  4  and owner = 'MARTIN';

      OBJ# COLS                           TIMESTAMP      FLAGS OBJECT_NAME                    OBJECT_TYPE
---------- ------------------------------ --------- ---------- ------------------------------ -----------------------
    111118 11,13                          26-JUN-15         36 CUSTOMERS                      TABLE
    111118 9,11,13                        26-JUN-15         33 CUSTOMERS                      TABLE

This is of course not the way to view this information; there's an API for that.

SQL> select dbms_stats.report_col_usage(user, 'CUSTOMERS') from dual;

DBMS_STATS.REPORT_COL_USAGE(USER,'CUSTOMERS')
----------------------------------------------------------------------------------------------
LEGEND:
.......

EQ         : Used in single table EQuality predicate
RANGE      : Used in single table RANGE predicate
LIKE       : Used in single table LIKE predicate
NULL       : Used in single table is (not) NULL predicate
EQ_JOIN    : Used in EQuality JOIN predicate
NONEQ_JOIN : Used in NON EQuality JOIN predicate
FILTER     : Used in single table FILTER predicate
JOIN       : Used in JOIN predicate
GROUP_BY   : Used in GROUP BY expression
...............................................................................

###############################################################################

COLUMN USAGE REPORT FOR MARTIN.CUSTOMERS
........................................

1. COUNTRY_ID                          : EQ
2. CUST_CITY                           : EQ
3. CUST_STATE_PROVINCE                 : EQ
4. (CUST_CITY, CUST_STATE_PROVINCE,
    COUNTRY_ID)                        : FILTER
5. (CUST_STATE_PROVINCE, COUNTRY_ID)   : GROUP_BY
###############################################################################

So there is a potential candidate for extended statistics in item 5. So far, nothing new. In versions before 12c you had to create these column groups manually. Again, please refer to Maria's post for an example.
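For reference, the manual step looked roughly like this in 11.2 (a sketch based on the DBMS_STATS API; the generated SYS_STU… extension name will differ on your system):

```sql
-- Pre-12c: create the column group by hand. The function returns the
-- system-generated name of the extension.
select dbms_stats.create_extended_stats(user, 'CUSTOMERS',
         '(CUST_CITY, CUST_STATE_PROVINCE, COUNTRY_ID)') as extension_name
from dual;

-- Re-gather statistics so the new column group gets statistics too.
exec dbms_stats.gather_table_stats(user, 'CUSTOMERS')
```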

The difference in 12c

In 12c you don’t need to create these column groups manually. Consider this sequence of events:

SQL> select * from dba_stat_extensions where owner = 'MARTIN';

no rows selected

Elapsed: 00:00:00.28

SQL> exec dbms_stats.gather_table_stats(user,'CUSTOMERS')

PL/SQL procedure successfully completed.

SQL> select * from user_stat_extensions

TABLE_NAME                     EXTENSION_NAME                 EXTENSION                                          CREATO DRO
------------------------------ ------------------------------ -------------------------------------------------- ------ ---
CUSTOMERS                      SYS_STU#S#WF25Z#QAHIHE#MOFFMM_ ("CUST_STATE_PROVINCE","COUNTRY_ID")               USER   YES
CUSTOMERS                      SYS_STUMZ$C3AIHLPBROI#SKA58H_N ("CUST_CITY","CUST_STATE_PROVINCE","COUNTRY_ID")   USER   YES

2 rows selected.

SQL> SELECT column_name,
  2    num_distinct,
  3    num_buckets,
  4    histogram
  5  FROM dba_tab_col_statistics
  6  WHERE table_name = 'CUSTOMERS'
  7  AND owner        = 'MARTIN'
  8  AND column_name LIKE 'SYS%';

COLUMN_NAME                    NUM_DISTINCT NUM_BUCKETS HISTOGRAM
------------------------------ ------------ ----------- ---------------
SYS_STUMZ$C3AIHLPBROI#SKA58H_N          620         254 HYBRID
SYS_STU#S#WF25Z#QAHIHE#MOFFMM_          145           1 NONE

2 rows selected.

In the above output you can see that there are no extended stats on the CUSTOMERS table before gathering statistics. There is nothing special about the stats gathering command; the preferences are the defaults (reformatted here for readability):

SQL> l
  1  declare
  2    type prefs_t is table of varchar2(150);
  3    prefs prefs_t := prefs_t(
  4      'AUTOSTATS_TARGET','CASCADE','CONCURRENT','DEGREE','ESTIMATE_PERCENT',
  5      'METHOD_OPT','NO_INVALIDATE','GRANULARITY','PUBLISH','INCREMENTAL',
  6      'INCREMENTAL_STALENESS','INCREMENTAL_LEVEL','STALE_PERCENT',
  7      'GLOBAL_TEMP_TABLE_STATS','TABLE_CACHED_BLOCKS','OPTIONS');
  8    prefs_value varchar2(100);
  9  begin
 10    for p in prefs.first .. prefs.last loop
 11      select dbms_stats.get_prefs(prefs(p), user,'CUSTOMERS') into prefs_value from dual;
 12      dbms_output.put_line(prefs(p) || ': ' || prefs_value);
 13    end loop;
 14* end;
SQL> /
AUTOSTATS_TARGET:        AUTO
CASCADE:                 DBMS_STATS.AUTO_CASCADE
CONCURRENT:              MANUAL
DEGREE:                  NULL
ESTIMATE_PERCENT:        DBMS_STATS.AUTO_SAMPLE_SIZE
METHOD_OPT:              FOR ALL COLUMNS SIZE AUTO
NO_INVALIDATE:           DBMS_STATS.AUTO_INVALIDATE
GRANULARITY:             AUTO
PUBLISH:                 TRUE
INCREMENTAL:             FALSE
INCREMENTAL_STALENESS:
INCREMENTAL_LEVEL:       PARTITION
STALE_PERCENT:           10
GLOBAL_TEMP_TABLE_STATS: SESSION
TABLE_CACHED_BLOCKS:     1
OPTIONS:                 GATHER

PL/SQL procedure successfully completed.

Did you see the extended stats have histograms? There is even one of the new fancy “hybrid” ones. The effect is noticeable:

SQL> select /*+ gather_plan_statistics */ count(*) from customers
  2   where cust_city = 'Los Angeles'
  3   and cust_state_province = 'CA'
  4   and country_id = 52790;

SQL> select * from table(dbms_xplan.display_cursor(format => 'allstats last'));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------
SQL_ID  0qmjk7qm3zwkw, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from customers  where
cust_city = 'Los Angeles'  and cust_state_province = 'CA'  and
country_id = 52790

Plan hash value: 296924608

--------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |           |      1 |        |      1 |00:00:00.01 |    1521 |
|   1 |  SORT AGGREGATE            |           |      1 |      1 |      1 |00:00:00.01 |    1521 |
|*  2 |   TABLE ACCESS STORAGE FULL| CUSTOMERS |      1 |    969 |    932 |00:00:00.01 |    1521 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - storage(("CUST_CITY"='Los Angeles' AND "CUST_STATE_PROVINCE"='CA' AND
              "COUNTRY_ID"=52790))
       filter(("CUST_CITY"='Los Angeles' AND "CUST_STATE_PROVINCE"='CA' AND
              "COUNTRY_ID"=52790))

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - 1 Sql Plan Directive used for this statement

The cardinality estimate is a lot closer now. What surprised me was the SQL Plan Directive referenced. But this is material for another blog post.

If you use swingbench take note of sbutil

This is going to be a very short post for a change. I have used Swingbench extensively and really love the tool. Many thanks to Dominic Giles!

Recently he announced a new tool on his blog that you can use to inflate your data volume. So instead of using the "-scale" argument when executing oewizard you can just keep the defaults and later on create as much data as you like. Here is an example, which is the reason for this post.

Setting the Scene

If you have not yet downloaded the latest Swingbench version and JDK 8 do this first. You can get Swingbench from the tool’s landing page and the Java software from java.sun.com. I used Swingbench 2.5.791 (stable). With both deployed, I created the Order Entry schema in lights-out mode:

[oracle@lab bin]$ ./oewizard -dbap secret -u soe -p soe -cl -cs //lab/martin -ts soe_tbs -create -dba system
SwingBench Wizard
Author  :        Dominic Giles
Version :        2.5.0.971

Running in Lights Out Mode using config file : oewizard.xml
The following statement failed : GRANT EXECUTE ON dbms_lock TO soe : Due to : ORA-01031: insufficient privileges

============================================
|           Datagenerator Run Stats        |
============================================
Connection Time                        0:00:00.006
Data Generation Time                   0:01:06.110
DDL Creation Time                      0:00:30.346
Total Run Time                         0:01:36.465
Rows Inserted per sec                      183,121
Data Generated (MB) per sec                   14.9
Actual Rows Generated                   13,002,138

Post Creation Validation Report
===============================
The schema appears to have been created successfully.

Valid Objects
=============
Valid Tables : 'ORDERS','ORDER_ITEMS','CUSTOMERS','WAREHOUSES','ORDERENTRY_METADATA','INVENTORIES',
'PRODUCT_INFORMATION','PRODUCT_DESCRIPTIONS','ADDRESSES','CARD_DETAILS'
Valid Indexes : 'PRD_DESC_PK','PROD_NAME_IX','PRODUCT_INFORMATION_PK','PROD_SUPPLIER_IX','PROD_CATEGORY_IX',
'INVENTORY_PK','INV_PRODUCT_IX','INV_WAREHOUSE_IX','ORDER_PK','ORD_SALES_REP_IX',
'ORD_CUSTOMER_IX','ORD_ORDER_DATE_IX','ORD_WAREHOUSE_IX','ORDER_ITEMS_PK','ITEM_ORDER_IX',
'ITEM_PRODUCT_IX','WAREHOUSES_PK','WHS_LOCATION_IX','CUSTOMERS_PK','CUST_EMAIL_IX',
'CUST_ACCOUNT_MANAGER_IX','CUST_FUNC_LOWER_NAME_IX','ADDRESS_PK','ADDRESS_CUST_IX',
'CARD_DETAILS_PK','CARDDETAILS_CUST_IX'
Valid Views : 'PRODUCTS','PRODUCT_PRICES'
Valid Sequences : 'CUSTOMER_SEQ','ORDERS_SEQ','ADDRESS_SEQ','LOGON_SEQ','CARD_DETAILS_SEQ'
Valid Code : 'ORDERENTRY'
Schema Created

More Data!

The next step is to invoke sbutil:

[oracle@lab bin]$ ./sbutil -u soe -p soe  -cs //lab/martin -soe parallel 12 -dup 4
Getting table Info
Got table information. Completed in : 0:00:01.558
Dropping Indexes
Dropped Indexes. Completed in : 0:00:03.252
Creating copies of tables
Created copies of tables. Completed in : 0:00:00.042
Begining data duplication
Completed Iteration 4. Completed in : 0:00:07.942
Creating  Constraints
Created  Constraints. Completed in : 0:00:28.737
Creating  Indexes
Created  Indexes. Completed in : 0:00:54.541
Updating Metadata and Recompiling Code
Updated Metadata. Completed in : 0:00:00.754
Updating Sequences
Updated Sequences. Completed in : 0:00:00.792
Determining New Row Counts
Got New Row Counts. Completed in : 0:00:03.132
Completed Data Duplication in 0 hour(s) 2 minute(s) 8 second(s) 454 millisecond(s)
----------------------------------------------------------------------------------------------------------
|Table Name          |  Original Row Count|       Original Size|       New Row Count|            New Size|
----------------------------------------------------------------------------------------------------------
|ORDER_ITEMS         |           4,288,582|              280 MB|          17,154,328|              1.1 GB|
|CUSTOMERS           |           1,000,000|              136 MB|           4,000,000|            552.4 MB|
|CARD_DETAILS        |           1,500,000|               80 MB|           6,000,000|            337.8 MB|
|ORDERS              |           1,429,790|              160 MB|           5,719,160|              664 MB|
|ADDRESSES           |           1,500,000|              184 MB|           6,000,000|            571.8 MB|
----------------------------------------------------------------------------------------------------------
|Total               |                    |              840 MB|                    |              3.2 GB|
----------------------------------------------------------------------------------------------------------

How Cool is That?

Really, it is. And look at the timings. This didn’t take too long, even on my poor old AMD 6200 series lab server.
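As a quick sanity check of the -dup 4 run, the new row counts in the report should be exactly four times the originals. A few lines of plain Python (numbers copied from the sbutil output above) confirm it:

```python
# Original row counts from the sbutil report; -dup 4 should quadruple each.
original = {
    "ORDER_ITEMS": 4_288_582,
    "CUSTOMERS": 1_000_000,
    "CARD_DETAILS": 1_500_000,
    "ORDERS": 1_429_790,
    "ADDRESSES": 1_500_000,
}

duplicated = {name: rows * 4 for name, rows in original.items()}
for name, rows in duplicated.items():
    print(f"{name:<15} {rows:>12,}")
```

The printed counts match the "New Row Count" column of the report line for line.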

Additional information on Oracle 12c big table caching

Teaching is one of the things I like doing, and currently I am investigating the Oracle 12c features around caching data in the various memory areas. Since the In-Memory (cost) option has been discussed by other, far more knowledgeable people, I would like to share some findings about big table caching here.

Some Background

In Oracle 12c you have two additional options to cache information: full database caching and big table caching. The first is great if you have a very large machine with lots and lots of DRAM, a clever OS that can deal with the inevitable ccNUMA setup you will have to tackle, and perhaps no appetite for paying for the In-Memory option. This post is not about full database caching, but rather about the other possibility to cache blocks.

This other option is to cache just a few tables in an area within the default buffer cache. That’s right: after the keep and recycle pools you can now tweak the default buffer cache too. An initialisation parameter tells Oracle how much of the default buffer cache may be used to cache full scans: anywhere from zero percent (the default, meaning full scans are not cached) up to 90%, with at least 10% always reserved for OLTP workloads. I haven’t really investigated the pre-12c caching mechanism in detail (hey, it works!) but I read that full scans are not cached by default to avoid thrashing the cache.
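The parameter semantics just described can be sketched in a few lines of Python. This is my own illustration of the documented rule, not Oracle code; the function name is made up:

```python
# Sketch of db_big_table_cache_percent_target semantics: a percentage of
# the default buffer cache, capped at 90 so at least 10% always remains
# for regular OLTP block caching.
def big_table_cache_mb(buffer_cache_mb: float, percent_target: int) -> float:
    """Return the share of the buffer cache usable for cached full scans."""
    if not 0 <= percent_target <= 90:
        raise ValueError("db_big_table_cache_percent_target must be 0..90")
    return buffer_cache_mb * percent_target / 100

# With the 20992 MB DEFAULT buffer cache shown below and a 50% target:
print(big_table_cache_mb(20992, 50))  # 10496.0
```

So on the system used in this post, roughly 10GB of buffer cache can serve cached full scans.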

Big Table Caching

Now what? We can cache full scans too. Interesting, let’s try it. I am using 12.1.0.2.2 on an ODA, but that shouldn’t really matter. The SGA is 24 GB in size, and my segment to be scanned (a non-partitioned table without indexes) is about 20GB. I want 50% of the buffer cache allocated to the big table cache.

SQL> select component, current_size/power(1024,2) size_mb
  2  from v$sga_dynamic_components where current_size <> 0;

COMPONENT                                                           SIZE_MB
---------------------------------------------------------------- ----------
shared pool                                                            2688
large pool                                                              192
java pool                                                               128
DEFAULT buffer cache                                                  20992
Shared IO Pool                                                          512

SQL> show parameter db_big_table_cache_percent_target

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_big_table_cache_percent_target    string      50

SQL> select * from v$bt_scan_cache;

BT_CACHE_ALLOC BT_CACHE_TARGET OBJECT_COUNT MEMORY_BUF_ALLOC MIN_CACHED_TEMP     CON_ID
-------------- --------------- ------------ ---------------- --------------- ----------
    .500334108              50            0                0            1000          0

SQL> select bytes/power(1024,2) m from dba_segments
  2  where owner = 'MARTIN' and segment_name = 'T1';

         M
----------
     20864

Now let’s see if we can make use of this. 50% of 20GB is about 10GB usable for the scan cache. If I start a query against T1 in another session I can see the object count increase.

SQL> r
  1* select * from v$bt_scan_cache

BT_CACHE_ALLOC BT_CACHE_TARGET OBJECT_COUNT MEMORY_BUF_ALLOC MIN_CACHED_TEMP     CON_ID
-------------- --------------- ------------ ---------------- --------------- ----------
    .500334108              50            1          1292363            1000          0

Subsequent queries against T1 will increase the temperature and potentially the amount of cached information. The “object temperature” is the mechanism Oracle uses to determine an object’s suitability for caching. In my very basic example there is only one table which has been full-scanned so far. Later on I’ll add T2 to the mix. The temperature and other object information are reflected in the second view, v$bt_scan_obj_temps:

SQL> select * from v$bt_scan_obj_temps;

       TS#   DATAOBJ# SIZE_IN_BLKS TEMPERATURE POLICY     CACHED_IN_MEM     CON_ID
---------- ---------- ------------ ----------- ---------- ------------- ----------
         4      39461      2669763        4000 MEM_PART         1292711          0

Session Counters!

I have developed a habit of looking into session counters when running queries to see if there is anything of interest. Using snapper with the before/after snapshot I got this:

SQL> @snapper4 all,end 5 1 163
Sampling SID 163 with interval 5 seconds, taking 1 snapshots...

-- Session Snapper v4.11 BETA - by Tanel Poder ( http://blog.tanelpoder.com ) - Enjoy the Most Advanced Oracle Troubleshooting Script on the Planet! :)

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    SID, USERNAME  , TYPE, STATISTIC                                                 ,         DELTA, HDELTA/SEC,    %TIME, GRAPH       , NUM_WAITS,  WAITS/SEC,   AVERAGES
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
...
    163, MARTIN    , STAT, user calls                                                ,             3,        .06,         ,             ,          ,           ,          3 per execution
...
    163, MARTIN    , STAT, physical read total IO requests                           ,         10752,     224.51,         ,             ,          ,           ,     10.75k per execution
    163, MARTIN    , STAT, physical read total multi block requests                  ,         10752,     224.51,         ,             ,          ,           ,     10.75k per execution
    163, MARTIN    , STAT, physical read total bytes                                 ,   11262763008,    235.17M,         ,             ,          ,           ,     11.26G per execution
...
    163, MARTIN    , STAT, consistent gets                                           ,       2666722,     55.68k,         ,             ,          ,           ,      2.67M per execution
    163, MARTIN    , STAT, consistent gets from cache                                ,       1292769,     26.99k,         ,             ,          ,           ,      1.29M per execution
    163, MARTIN    , STAT, consistent gets pin                                       ,       1292769,     26.99k,         ,             ,          ,           ,      1.29M per execution
    163, MARTIN    , STAT, consistent gets pin (fastpath)                            ,       1292768,     26.99k,         ,             ,          ,           ,      1.29M per execution
    163, MARTIN    , STAT, consistent gets direct                                    ,       1373953,     28.69k,         ,             ,          ,           ,      1.37M per execution
...
    163, MARTIN    , STAT, physical reads                                            ,       1374849,     28.71k,         ,             ,          ,           ,      1.37M per execution
    163, MARTIN    , STAT, physical reads cache                                      ,           128,       2.67,         ,             ,          ,           ,        128 per execution
    163, MARTIN    , STAT, physical reads direct                                     ,       1374721,     28.71k,         ,             ,          ,           ,      1.37M per execution
    163, MARTIN    , STAT, physical read IO requests                                 ,         10752,     224.51,         ,             ,          ,           ,      1.05M bytes per request
    163, MARTIN    , STAT, physical read bytes                                       ,   11262763008,    235.17M,         ,             ,          ,           ,     11.26G per execution
...
    163, MARTIN    , STAT, data warehousing scanned objects                          ,             1,        .02,         ,             ,          ,           ,          1 per execution
    163, MARTIN    , STAT, data warehousing scanned blocks                           ,       2666668,     55.68k,         ,             ,          ,           ,      2.67M per execution
    163, MARTIN    , STAT, data warehousing scanned blocks - memory                  ,       1292715,     26.99k,         ,             ,          ,           ,      1.29M per execution
    163, MARTIN    , STAT, data warehousing scanned blocks - disk                    ,       1373953,     28.69k,         ,             ,          ,           ,      1.37M per execution
...
    163, MARTIN    , STAT, table scans (short tables)                                ,             1,        .02,         ,             ,          ,           ,          1 per execution
    163, MARTIN    , STAT, table scan rows gotten                                    ,      16000006,    334.09k,         ,             ,          ,           ,        16M per execution
    163, MARTIN    , STAT, table scan disk non-IMC rows gotten                       ,      16000006,    334.09k,         ,             ,          ,           ,        16M per execution
    163, MARTIN    , STAT, table scan blocks gotten                                  ,       2666668,     55.68k,         ,             ,          ,           ,      2.67M per execution
...
    163, MARTIN    , STAT, execute count                                             ,             1,        .02,         ,             ,          ,           ,          1 executions per parse
...
    163, MARTIN    , TIME, parse time elapsed                                        ,            26,      .54us,      .0%, [          ],          ,           ,
    163, MARTIN    , TIME, DB CPU                                                    ,       5682136,   118.65ms,    11.9%, [@@        ],          ,           ,
    163, MARTIN    , TIME, sql execute elapsed time                                  ,      33220099,   693.66ms,    69.4%, [#######   ],          ,           ,
    163, MARTIN    , TIME, DB time                                                   ,      33220306,   693.66ms,    69.4%, [#######   ],          ,           ,       -.32 % unaccounted time
    163, MARTIN    , WAIT, db file scattered read                                    ,         24636,   514.41us,      .1%, [          ],         1,        .02,    24.64ms average wait
    163, MARTIN    , WAIT, direct path read                                          ,      27536664,   574.98ms,    57.5%, [WWWWWW    ],     10694,      223.3,     2.57ms average wait
    163, MARTIN    , WAIT, SQL*Net message to client                                 ,             3,      .06us,      .0%, [          ],         2,        .04,      1.5us average wait
    163, MARTIN    , WAIT, SQL*Net message from client                               ,      14826292,   309.58ms,    31.0%, [WWWW      ],         2,        .04,      7.41s average wait

--  End of Stats snap 1, end=2015-05-11 10:13:13, seconds=47.9

Don’t worry about the wide output: all you need are the STATISTIC and DELTA columns. This can all be a bit overwhelming at first, so let me guide you through; statistics that are not important for the discussion have already been removed.

First of all you see the physical reads. 10752 IO requests were issued, all of them multi block requests (physical read total IO requests and physical read total multi block requests, repeated in physical read IO requests). Hmm, roughly 10.5GB read from disk: isn’t that about 50% of my buffer cache? It might well be that 10GB of the 20GB table were read from disk using direct path reads, and the other 10GB came from memory.
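The arithmetic can be checked quickly in plain Python. The values come from the snapper deltas above; the 8 KB block size is an assumption on my part (it is the default):

```python
# Deltas from the snapper output above.
phys_read_bytes = 11_262_763_008   # physical read total bytes
io_requests = 10_752               # physical read total IO requests
phys_read_blocks = 1_374_849       # physical reads

print(phys_read_bytes / io_requests)   # just over 1 MB per multiblock request
print(phys_read_bytes / 2**30)         # roughly 10.5 GB read from disk
# The byte count ties back exactly to the block count at 8 KB per block:
print(phys_read_blocks * 8192 == phys_read_bytes)  # True
```

So the 10.5GB read via direct path reads is indeed about half the segment, with the other half served from the big table cache.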

What is interesting is the next set of counters, “data warehousing scanned %”, which I had not seen before. In fact, if I run @TanelPoder’s statn.sql script in my session I can see there are more of them than appear in the snapper output:

SQL> @statn.sql warehousing

     STAT# HEX#      OFFSET NAME                                                                  VALUE
---------- ----- ---------- ---------------------------------------------------------------- ----------
       237 ED          1896 data warehousing scanned objects                                          5
       238 EE          1904 data warehousing scanned blocks                                    13333340
       239 EF          1912 data warehousing scanned blocks - memory                            6462815
       240 F0          1920 data warehousing scanned blocks - flash                                   0
       241 F1          1928 data warehousing scanned blocks - disk                              6870525
       242 F2          1936 data warehousing scanned blocks - offload                                 0
       243 F3          1944 data warehousing evicted objects                                          0
       244 F4          1952 data warehousing evicted objects - cooling                                0
       245 F5          1960 data warehousing evicted objects - replace                                0
       246 F6          1968 data warehousing cooling action                                           0

10 rows selected.

Interesting! I’m quite glad Oracle gave us so many statistics describing the way big table caching works. I have queried T1 five times in my session, scanning 13,333,340 blocks in the process. The distribution between memory and disk is 6,462,815 to 6,870,525, almost 50:50. It looks like flash and offloading can be used for this as well (note to self: repeat the test on Exadata). That makes sense in the example just shown: the segment is 20GB, of which 10GB are in the buffer cache. If the rest of the segment can be scanned using direct path reads, as above, then it is more than likely that the scan can be offloaded too.
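The cumulative counters tie together nicely; a short Python check (numbers taken from the statn.sql output above) makes the accounting explicit:

```python
# Five full scans of T1, each scanning 2,666,668 blocks per the earlier
# snapper output; memory/disk split from the statn.sql output.
scans, blocks_per_scan = 5, 2_666_668
mem_blocks, disk_blocks = 6_462_815, 6_870_525

# Total scanned blocks match scans * blocks, and every scanned block was
# served either from memory or from disk (flash and offload were zero).
assert scans * blocks_per_scan == 13_333_340
assert mem_blocks + disk_blocks == 13_333_340

print(f"memory share: {mem_blocks / (mem_blocks + disk_blocks):.1%}")
```

The memory share comes out just under 50%, matching the "almost 50:50" observation.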

I can even see whether there was space pressure on the big table cache: the data warehousing evicted% statistics hint at space management.

Purely from memory?

Looking at the session counters above, my table is simply too big. I would like to see more scans served entirely from memory :) So I created a table T2 at 25% of the size of T1 using the sample clause. Initial scans against T2 were served from disk only (policy = DISK):

SQL> r
  1* select * from v$bt_scan_obj_temps

       TS#   DATAOBJ# SIZE_IN_BLKS TEMPERATURE POLICY     CACHED_IN_MEM     CON_ID
---------- ---------- ------------ ----------- ---------- ------------- ----------
         4      39461      2669763        8000 MEM_PART         1293081          0
         4      39465       668107        5000 DISK                   0          0
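To make the mechanic concrete, here is a toy model of the temperature bookkeeping implied by these views. This is my own sketch, not Oracle's implementation; the +1000 per full scan matches what the outputs here show:

```python
# Toy model: each full scan of an object raises its temperature by 1000.
# At this point in the test T1 (dataobj# 39461) has been scanned eight
# times and T2 (dataobj# 39465) five times.
from collections import defaultdict

temperature = defaultdict(int)

def full_scan(dataobj_id: int) -> None:
    temperature[dataobj_id] += 1000

for _ in range(8):
    full_scan(39461)   # T1: temperature 8000, as in the view above
for _ in range(5):
    full_scan(39465)   # T2: temperature 5000, policy still DISK

print(dict(temperature))  # {39461: 8000, 39465: 5000}
```

Keep scanning T2 and its temperature will eventually exceed T1's, at which point the hotter (and smaller) object becomes the better caching candidate.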

But after the 9th scan (each scan increased the temperature by 1000) the situation changed:

SQL> r
  1* select * from v$bt_scan_obj_temps

       TS#   DATAOBJ# SIZE_IN_BLKS TEMPERATURE POLICY     CACHED_IN_MEM     CON_ID
---------- ---------- ------------ ----------- ---------- ------------- ----------
         4      39465       668107        9000 MEM_ONLY          668107          0
         4      39461      2669763        8000 MEM_PART          624974          0

The smaller table now fits into memory and took over! The result is visible in the execution time:

SQL> r
  1* select min(date_created) from t2

MIN(DATE_CREATED)
-----------------
20140513 00:00:00

Elapsed: 00:00:05.39

...

SQL> r
  1* select min(date_created) from t2

MIN(DATE_CREATED)
-----------------
20140513 00:00:00

Elapsed: 00:00:01.25

So to prove the point I reconnected to the database and ran my test again:


SQL> select min(date_created) from t2;

MIN(DATE_CREATED)
-----------------
20140513 00:00:00

Elapsed: 00:00:01.36

SQL> @statn warehousing

     STAT# HEX#      OFFSET NAME                                                                  VALUE
---------- ----- ---------- ---------------------------------------------------------------- ----------
       237 ED          1896 data warehousing scanned objects                                          1
       238 EE          1904 data warehousing scanned blocks                                      666966
       239 EF          1912 data warehousing scanned blocks - memory                             666966
       240 F0          1920 data warehousing scanned blocks - flash                                   0
       241 F1          1928 data warehousing scanned blocks - disk                                    0
       242 F2          1936 data warehousing scanned blocks - offload                                 0
       243 F3          1944 data warehousing evicted objects                                          0
       244 F4          1952 data warehousing evicted objects - cooling                                0
       245 F5          1960 data warehousing evicted objects - replace                                0
       246 F6          1968 data warehousing cooling action                                           0

10 rows selected.

So the popularity of an object plays a big role in working out which segments to cache as well. This is an interesting feature worth testing in your development environments. I have no idea whether or not its use requires a separate license, so please make sure you check with Oracle before using it to avoid surprises.