Martins Blog

Trying to explain complex things in simple terms

Archive for May, 2015

Smart Scan enhancements in Oracle Exadata 12c: inline LOBs

Posted by Martin Bach on May 25, 2015

As part of the 12c release of the database and the cell software Oracle introduced the ability to perform Smart Scans on inline LOBs. The enhancement currently applies to SecureFiles only. As always, if you would like to read up on SecureFiles you can head over to Tim Hall’s website for the details.

To demonstrate the new behaviour I used an 11.2.0.3 BP 22 database on Exadata 11.2.3.3.1 and a 12.1.0.2.2 database on Exadata 12.1.2.1.0. The test case revolves around the following table:

CREATE TABLE loboffload (
 id number primary key,
 comments clob)
enable row movement 
tablespace users
lob (comments) store as securefile (
 enable storage in row
);

This is the short version; Oracle fills in the blanks and expands the table DDL to:

SQL> select dbms_metadata.get_ddl('TABLE','LOBOFFLOAD') from dual;

DBMS_METADATA.GET_DDL('TABLE','LOBOFFLOAD')
--------------------------------------------------------------------------------

  CREATE TABLE "MARTIN"."LOBOFFLOAD"
   (    "ID" NUMBER,
        "COMMENTS" CLOB,
         PRIMARY KEY ("ID")
  USING INDEX PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS LOGGING
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "USERS"  ENABLE
   ) SEGMENT CREATION IMMEDIATE
  PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
 NOCOMPRESS LOGGING
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "USERS"
 LOB ("COMMENTS") STORE AS SECUREFILE (
  TABLESPACE "USERS" ENABLE STORAGE IN ROW CHUNK 8192
  NOCACHE LOGGING  NOCOMPRESS  KEEP_DUPLICATES
  STORAGE(INITIAL 106496 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT))  ENABLE ROW
 MOVEMENT

The idea is that I am using the COMMENTS LOB column to simulate a common approach in application design: a free-text field on a GUI screen that lets the user store additional information about the record in the database. The design is not very practical, though.

The important information is in the section about the LOB: the request to store it as a SECUREFILE, inline with the table. Out-of-line LOBs are not offloadable as of 12.1.2.1.0 as far as I know.
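You can verify how a LOB column is stored from the data dictionary. A quick sanity check, sketched here against the USER_LOBS view, should confirm both properties:

-- confirm the LOB is a SecureFile and stored inline (in row)
select table_name, column_name, securefile, in_row
from   user_lobs
where  table_name = 'LOBOFFLOAD';

Both SECUREFILE and IN_ROW should report YES for the COMMENTS column.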

To feed some data into the table I first added a batch of rows and then used the insert /*+ append */ technique to grow it:

alter session enable parallel dml;

-- with nods to Jonathan Lewis who published this in one of his presentations on generating data
insert /*+ append parallel(4) */ into loboffload
with v1 as (
 SELECT rownum n FROM dual CONNECT BY level <= 10000
)
SELECT  rownum id,
  CASE
    WHEN mod(rownum,100000) = 0     THEN 'THIS WAS EXCEPTIONAL' 
    WHEN mod(rownum,10000) = 0      THEN 'THIS WAS VERY GOOD'
    WHEN mod(rownum,1000) = 0       THEN 'THIS WAS GOOD'
    WHEN mod(rownum,100) = 0        THEN 'OK'
    ELSE 'NO COMPLAINTS'
  END 
FROM v1,
  v1
WHERE rownum <= 1e6;

create sequence loboffload_s cache 10000 start with 1000001;


insert /*+ append parallel(4) */ into loboffload
select /*+ parallel(4) */ loboffload_s.nextval, comments from loboffload;

Running the doubling insert a few times left me with 16 million rows. I used impdp over a network link to move the table to the other database.
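In case you want to do the same, a network-mode import along these lines should work; the connect string and database link name below are made up for illustration, but NETWORK_LINK and TABLES are standard Data Pump parameters:

# hypothetical example: pull the table straight over a database link, no dump file required
impdp martin/secret@ora12c network_link=ora11203 tables=martin.loboffload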

Test with 11.2.0.3

The test in 11.2.0.3 is used to confirm that inline LOBs are not offloadable. Here is the query and result:

MARTIN@ora11203> select /*+ monitor noparallel full(t) */ count(*) from loboffload t where comments like '%GOOD%';

  COUNT(*)
----------
     15840

Elapsed: 00:01:33.48

MARTIN@ora11203> -- removing irrelevant statistics from the output
MARTIN@ora11203> @mystats stop t=1
==========================================================================================
MyStats report : 25-MAY-2015 02:33:24
==========================================================================================


------------------------------------------------------------------------------------------
1. Summary Timings
------------------------------------------------------------------------------------------

Type    Statistic Name                                                               Value
------  ----------------------------------------------------------------  ----------------
TIMER   snapshot interval (seconds)                                                  95.28
TIMER   CPU time used (seconds)                                                      93.00


------------------------------------------------------------------------------------------
2. Statistics Report
------------------------------------------------------------------------------------------

Type    Statistic Name                                                               Value
------  ----------------------------------------------------------------  ----------------
STAT    CPU used by this session                                                     9,303
...
STAT    DB time                                                                      9,353
...
STAT    cell flash cache read hits                                                   9,454
STAT    cell physical IO interconnect bytes                                  1,233,436,672
...
STAT    consistent gets                                                            150,794
STAT    consistent gets - examination                                                    3
STAT    consistent gets direct                                                     150,558
STAT    consistent gets from cache                                                     236
STAT    consistent gets from cache (fastpath)                                          231
STAT    cursor authentications                                                           3
...
STAT    execute count                                                                   13
...
STAT    logical read bytes from cache                                           14,393,344
STAT    no work - consistent read gets                                             150,558
...
STAT    parse count (hard)                                                               1
STAT    parse count (total)                                                             13
STAT    parse time cpu                                                                   1
STAT    parse time elapsed                                                               1
STAT    physical read IO requests                                                    9,459
STAT    physical read bytes                                                  1,233,436,672
STAT    physical read requests optimized                                             9,454
STAT    physical read total IO requests                                              9,459
STAT    physical read total bytes                                            1,233,436,672
STAT    physical read total bytes optimized                                  1,233,395,712
STAT    physical read total multi block requests                                     9,255
STAT    physical reads                                                             150,566
STAT    physical reads cache                                                             8
STAT    physical reads direct                                                      150,558
...
STAT    table scan blocks gotten                                                   150,558
STAT    table scan rows gotten                                                  16,000,000
STAT    table scans (direct read)                                                        1
STAT    table scans (long tables)                                                        1
...

------------------------------------------------------------------------------------------
3. About
------------------------------------------------------------------------------------------
- MyStats v2.01 by Adrian Billington (http://www.oracle-developer.net)
- Based on the SNAP_MY_STATS utility by Jonathan Lewis

==========================================================================================
End of report
==========================================================================================

So none of the essential cell% statistics were recorded, except for the Flash Cache read hits. Since the cell scans statistic was not incremented by 1 (for the table segment), you can conclude that no Smart Scan happened here.
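If you do not want to wade through a full MyStats report, a quick look at the session counters answers the Smart Scan question just as well. A sketch:

-- a value of 0 for "cell scans" after a full segment scan means no Smart Scan took place
select n.name, s.value
from   v$statname n, v$mystat s
where  n.statistic# = s.statistic#
and    n.name in ('cell scans', 'cell physical IO interconnect bytes returned by smart scan');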

Test on 12.1.0.2/12.1.2.1.0

The first result was not too surprising, as it is the documented behaviour. The second test should hopefully reveal some offloading.

MARTIN@ORA12c> select /*+ monitor noparallel full(t) */ count(*) from loboffload t where comments like '%GOOD%';

  COUNT(*)
----------
     15840

Elapsed: 00:00:01.65

MARTIN@ORA12c> @mystats stop t=1
==========================================================================================
MyStats report : 25-MAY-2015 02:29:46
==========================================================================================


------------------------------------------------------------------------------------------
1. Summary Timings
------------------------------------------------------------------------------------------

Type    Statistic Name                                                               Value
------  ----------------------------------------------------------------  ----------------
TIMER   snapshot interval (seconds)                                                  14.61
TIMER   CPU time used (seconds)                                                       0.03


------------------------------------------------------------------------------------------
2. Statistics Report
------------------------------------------------------------------------------------------

Type    Statistic Name                                                               Value
------  ----------------------------------------------------------------  ----------------
STAT    CPU used by this session                                                         9
...
STAT    cell IO uncompressed bytes                                           1,234,296,832
STAT    cell blocks helped by minscn optimization                                  150,666
STAT    cell blocks processed by cache layer                                       150,671
STAT    cell blocks processed by data layer                                        150,671
STAT    cell blocks processed by txn layer                                         150,671
STAT    cell flash cache read hits                                                   1,143
STAT    cell num smartio automem buffer allocation attempts                              1
STAT    cell physical IO bytes eligible for predicate offload                1,234,296,832
STAT    cell physical IO interconnect bytes                                      6,273,368
STAT    cell physical IO interconnect bytes returned by smart scan               6,273,368
STAT    cell scans                                                                       1
...
STAT    consistent gets                                                            151,053
STAT    consistent gets direct                                                     150,671
STAT    consistent gets examination                                                      3
STAT    consistent gets examination (fastpath)                                           3
STAT    consistent gets from cache                                                     382
STAT    consistent gets pin                                                            379
STAT    consistent gets pin (fastpath)                                                 379
...
STAT    parse count (total)                                                             13
STAT    physical read IO requests                                                    1,205
STAT    physical read bytes                                                  1,234,296,832
STAT    physical read requests optimized                                             1,143
STAT    physical read total IO requests                                              1,205
STAT    physical read total bytes                                            1,234,296,832
STAT    physical read total bytes optimized                                  1,171,423,232
STAT    physical read total multi block requests                                     1,189
STAT    physical reads                                                             150,671
STAT    physical reads direct                                                      150,671
...
STAT    table scan blocks gotten                                                    15,744
STAT    table scan disk non-IMC rows gotten                                      1,670,692
STAT    table scan rows gotten                                                   1,670,692
STAT    table scans (direct read)                                                        1
STAT    table scans (short tables)                                                       1
...

------------------------------------------------------------------------------------------
3. About
------------------------------------------------------------------------------------------
- MyStats v2.01 by Adrian Billington (http://www.oracle-developer.net)
- Based on the SNAP_MY_STATS utility by Jonathan Lewis

==========================================================================================
End of report
==========================================================================================

Good news: what Oracle said is true. The same query that was not offloaded in 11.2.0.3 is offloaded in 12.1.0.2 with Exadata 12.1.2.1.0. The proof lies in the presence of all the cell% statistics, especially cell scans plus the various “… processed by … layer” statistics.
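Using these counters you can also put a number on the saving. A back-of-the-envelope calculation with the figures from the report above:

-- bytes eligible for offload vs bytes actually shipped over the interconnect
select round((1 - 6273368/1234296832) * 100, 2) as pct_saved from dual;

Roughly 99.49 percent of the eligible bytes never had to cross the interconnect.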


If you use Swingbench take note of sbutil

Posted by Martin Bach on May 19, 2015

This is going to be a very short post for a change. I have used Swingbench extensively and really love the tool. Many thanks to Dominic Giles!

Recently he announced a new tool on his blog that you can use to inflate your data volume. Instead of using the “-scale” argument when executing oewizard you can keep the defaults and later create as much data as you like. Here is an example, which is the reason for this post.

Setting the Scene

If you have not yet downloaded the latest Swingbench version and JDK 8, do that first. You can get Swingbench from the tool’s landing page and the Java software from java.sun.com. I used Swingbench 2.5.0.971 (stable). With both deployed, I created the Order Entry schema in lights-out mode:

[oracle@lab bin]$ ./oewizard -dbap secret -u soe -p soe -cl -cs //lab/martin -ts soe_tbs -create -dba system
SwingBench Wizard
Author  :        Dominic Giles
Version :        2.5.0.971

Running in Lights Out Mode using config file : oewizard.xml
The following statement failed : GRANT EXECUTE ON dbms_lock TO soe : Due to : ORA-01031: insufficient privileges

============================================
|           Datagenerator Run Stats        |
============================================
Connection Time                        0:00:00.006
Data Generation Time                   0:01:06.110
DDL Creation Time                      0:00:30.346
Total Run Time                         0:01:36.465
Rows Inserted per sec                      183,121
Data Generated (MB) per sec                   14.9
Actual Rows Generated                   13,002,138

Post Creation Validation Report
===============================
The schema appears to have been created successfully.

Valid Objects
=============
Valid Tables : 'ORDERS','ORDER_ITEMS','CUSTOMERS','WAREHOUSES','ORDERENTRY_METADATA','INVENTORIES',
'PRODUCT_INFORMATION','PRODUCT_DESCRIPTIONS','ADDRESSES','CARD_DETAILS'
Valid Indexes : 'PRD_DESC_PK','PROD_NAME_IX','PRODUCT_INFORMATION_PK','PROD_SUPPLIER_IX','PROD_CATEGORY_IX',
'INVENTORY_PK','INV_PRODUCT_IX','INV_WAREHOUSE_IX','ORDER_PK','ORD_SALES_REP_IX',
'ORD_CUSTOMER_IX','ORD_ORDER_DATE_IX','ORD_WAREHOUSE_IX','ORDER_ITEMS_PK','ITEM_ORDER_IX',
'ITEM_PRODUCT_IX','WAREHOUSES_PK','WHS_LOCATION_IX','CUSTOMERS_PK','CUST_EMAIL_IX',
'CUST_ACCOUNT_MANAGER_IX','CUST_FUNC_LOWER_NAME_IX','ADDRESS_PK','ADDRESS_CUST_IX',
'CARD_DETAILS_PK','CARDDETAILS_CUST_IX'
Valid Views : 'PRODUCTS','PRODUCT_PRICES'
Valid Sequences : 'CUSTOMER_SEQ','ORDERS_SEQ','ADDRESS_SEQ','LOGON_SEQ','CARD_DETAILS_SEQ'
Valid Code : 'ORDERENTRY'
Schema Created

More Data!

The next step is to invoke sbutil:

[oracle@lab bin]$ ./sbutil -u soe -p soe  -cs //lab/martin -soe parallel 12 -dup 4
Getting table Info
Got table information. Completed in : 0:00:01.558
Dropping Indexes
Dropped Indexes. Completed in : 0:00:03.252
Creating copies of tables
Created copies of tables. Completed in : 0:00:00.042
Begining data duplication
Completed Iteration 4. Completed in : 0:00:07.942
Creating  Constraints
Created  Constraints. Completed in : 0:00:28.737
Creating  Indexes
Created  Indexes. Completed in : 0:00:54.541
Updating Metadata and Recompiling Code
Updated Metadata. Completed in : 0:00:00.754
Updating Sequences
Updated Sequences. Completed in : 0:00:00.792
Determining New Row Counts
Got New Row Counts. Completed in : 0:00:03.132
Completed Data Duplication in 0 hour(s) 2 minute(s) 8 second(s) 454 millisecond(s)
----------------------------------------------------------------------------------------------------------
|Table Name          |  Original Row Count|       Original Size|       New Row Count|            New Size|
----------------------------------------------------------------------------------------------------------
|ORDER_ITEMS         |           4,288,582|              280 MB|          17,154,328|              1.1 GB|
|CUSTOMERS           |           1,000,000|              136 MB|           4,000,000|            552.4 MB|
|CARD_DETAILS        |           1,500,000|               80 MB|           6,000,000|            337.8 MB|
|ORDERS              |           1,429,790|              160 MB|           5,719,160|              664 MB|
|ADDRESSES           |           1,500,000|              184 MB|           6,000,000|            571.8 MB|
----------------------------------------------------------------------------------------------------------
|Total               |                    |              840 MB|                    |              3.2 GB|
----------------------------------------------------------------------------------------------------------

How Cool is That?

Really, it is. And look at the timings. This didn’t take too long, even on my poor old AMD 6200 series lab server.


Additional information on Oracle 12c big table caching

Posted by Martin Bach on May 13, 2015

Teaching is one of the things I like doing, and currently I am investigating the Oracle 12c features around caching data in the various memory areas. Since the In-Memory (cost) option has been discussed by other, far more knowledgeable people, I would like to share some findings about big table caching here.

Some Background

In Oracle 12c you have two additional options to cache information: full database caching and big table caching. The first is great if you have a massively big machine with lots and lots of DRAM, plus a clever OS that can deal with the inevitable ccNUMA setup you will have to tackle, and maybe you don’t want to pay for the In-Memory option. This post is not about full database caching, but rather about the other possibility to cache blocks.

This other option caches just a few tables in an area within the buffer cache. That’s right: after the keep and recycle pools you can now tweak the default buffer cache. You tell Oracle by means of an initialisation parameter how much of the default buffer cache can be used to cache full scans: from zero percent (the default, meaning full scans are not cached) up to 90%; at least 10% must be left for OLTP workloads. I haven’t really investigated the pre-12c caching mechanism in detail (hey, it works!) but I read that full scans are not cached by default to avoid thrashing the cache.
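Setting this up is straightforward; a minimal sketch, assuming you want half of the default buffer cache to be available for scan caching:

-- allow scans to use up to 50 percent of the default buffer cache
alter system set db_big_table_cache_percent_target = 50;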

Big Table Caching

Now what? We can cache full scans too. Interesting, let’s try it. I am using 12.1.0.2.2 on an ODA, but that shouldn’t really matter. The SGA is 24 GB in size, and my segment to be scanned (a non-partitioned table without indexes) is about 20GB in size. I want 50% of the buffer cache allocated for the big table caching tool.

SQL> select component, current_size/power(1024,2) size_mb
  2  from v$sga_dynamic_components where current_size <> 0;

COMPONENT                                                           SIZE_MB
---------------------------------------------------------------- ----------
shared pool                                                            2688
large pool                                                              192
java pool                                                               128
DEFAULT buffer cache                                                  20992
Shared IO Pool                                                          512

SQL> show parameter db_big_table_cache_percent_target

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_big_table_cache_percent_target    string      50

SQL> select * from v$bt_scan_cache;

BT_CACHE_ALLOC BT_CACHE_TARGET OBJECT_COUNT MEMORY_BUF_ALLOC MIN_CACHED_TEMP     CON_ID
-------------- --------------- ------------ ---------------- --------------- ----------
    .500334108              50            0                0            1000          0

SQL> select bytes/power(1024,2) m from dba_segments
  2  where owner = 'MARTIN' and segment_name = 'T1';

         M
----------
     20864

Now let’s see if we can make use of this. 50% of the roughly 20 GB buffer cache works out to about 10 GB usable for the scan cache. If I start a query against T1 in another session I can see the object count increase.

SQL> r
  1* select * from v$bt_scan_cache

BT_CACHE_ALLOC BT_CACHE_TARGET OBJECT_COUNT MEMORY_BUF_ALLOC MIN_CACHED_TEMP     CON_ID
-------------- --------------- ------------ ---------------- --------------- ----------
    .500334108              50            1          1292363            1000          0

Subsequent queries against T1 will increase the temperature and potentially the amount of cached information. The “object temperature” is the method Oracle uses to determine the suitability of an object for caching. In my very basic example there is only one table which has been full-scanned so far. Later on I’ll add T2 to the mix. The temperature and other object information are reflected in the second view, v$bt_scan_obj_temps:

SQL> select * from v$bt_scan_obj_temps;

       TS#   DATAOBJ# SIZE_IN_BLKS TEMPERATURE POLICY     CACHED_IN_MEM     CON_ID
---------- ---------- ------------ ----------- ---------- ------------- ----------
         4      39461      2669763        4000 MEM_PART         1292711          0

Session Counters!

I have developed a habit of looking into session counters when running queries to see if there is anything of interest. Using snapper with the before/after snapshot I got this:

SQL> @snapper4 all,end 5 1 163
Sampling SID 163 with interval 5 seconds, taking 1 snapshots...

-- Session Snapper v4.11 BETA - by Tanel Poder ( http://blog.tanelpoder.com ) - Enjoy the Most Advanced Oracle Troubleshooting Script on the Planet! :)

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    SID, USERNAME  , TYPE, STATISTIC                                                 ,         DELTA, HDELTA/SEC,    %TIME, GRAPH       , NUM_WAITS,  WAITS/SEC,   AVERAGES
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
...
    163, MARTIN    , STAT, user calls                                                ,             3,        .06,         ,             ,          ,           ,          3 per execution
...
    163, MARTIN    , STAT, physical read total IO requests                           ,         10752,     224.51,         ,             ,          ,           ,     10.75k per execution
    163, MARTIN    , STAT, physical read total multi block requests                  ,         10752,     224.51,         ,             ,          ,           ,     10.75k per execution
    163, MARTIN    , STAT, physical read total bytes                                 ,   11262763008,    235.17M,         ,             ,          ,           ,     11.26G per execution
...
    163, MARTIN    , STAT, consistent gets                                           ,       2666722,     55.68k,         ,             ,          ,           ,      2.67M per execution
    163, MARTIN    , STAT, consistent gets from cache                                ,       1292769,     26.99k,         ,             ,          ,           ,      1.29M per execution
    163, MARTIN    , STAT, consistent gets pin                                       ,       1292769,     26.99k,         ,             ,          ,           ,      1.29M per execution
    163, MARTIN    , STAT, consistent gets pin (fastpath)                            ,       1292768,     26.99k,         ,             ,          ,           ,      1.29M per execution
    163, MARTIN    , STAT, consistent gets direct                                    ,       1373953,     28.69k,         ,             ,          ,           ,      1.37M per execution
...
    163, MARTIN    , STAT, physical reads                                            ,       1374849,     28.71k,         ,             ,          ,           ,      1.37M per execution
    163, MARTIN    , STAT, physical reads cache                                      ,           128,       2.67,         ,             ,          ,           ,        128 per execution
    163, MARTIN    , STAT, physical reads direct                                     ,       1374721,     28.71k,         ,             ,          ,           ,      1.37M per execution
    163, MARTIN    , STAT, physical read IO requests                                 ,         10752,     224.51,         ,             ,          ,           ,      1.05M bytes per request
    163, MARTIN    , STAT, physical read bytes                                       ,   11262763008,    235.17M,         ,             ,          ,           ,     11.26G per execution
...
    163, MARTIN    , STAT, data warehousing scanned objects                          ,             1,        .02,         ,             ,          ,           ,          1 per execution
    163, MARTIN    , STAT, data warehousing scanned blocks                           ,       2666668,     55.68k,         ,             ,          ,           ,      2.67M per execution
    163, MARTIN    , STAT, data warehousing scanned blocks - memory                  ,       1292715,     26.99k,         ,             ,          ,           ,      1.29M per execution
    163, MARTIN    , STAT, data warehousing scanned blocks - disk                    ,       1373953,     28.69k,         ,             ,          ,           ,      1.37M per execution
...
    163, MARTIN    , STAT, table scans (short tables)                                ,             1,        .02,         ,             ,          ,           ,          1 per execution
    163, MARTIN    , STAT, table scan rows gotten                                    ,      16000006,    334.09k,         ,             ,          ,           ,        16M per execution
    163, MARTIN    , STAT, table scan disk non-IMC rows gotten                       ,      16000006,    334.09k,         ,             ,          ,           ,        16M per execution
    163, MARTIN    , STAT, table scan blocks gotten                                  ,       2666668,     55.68k,         ,             ,          ,           ,      2.67M per execution
...
    163, MARTIN    , STAT, execute count                                             ,             1,        .02,         ,             ,          ,           ,          1 executions per parse
...
    163, MARTIN    , TIME, parse time elapsed                                        ,            26,      .54us,      .0%, [          ],          ,           ,
    163, MARTIN    , TIME, DB CPU                                                    ,       5682136,   118.65ms,    11.9%, [@@        ],          ,           ,
    163, MARTIN    , TIME, sql execute elapsed time                                  ,      33220099,   693.66ms,    69.4%, [#######   ],          ,           ,
    163, MARTIN    , TIME, DB time                                                   ,      33220306,   693.66ms,    69.4%, [#######   ],          ,           ,       -.32 % unaccounted time
    163, MARTIN    , WAIT, db file scattered read                                    ,         24636,   514.41us,      .1%, [          ],         1,        .02,    24.64ms average wait
    163, MARTIN    , WAIT, direct path read                                          ,      27536664,   574.98ms,    57.5%, [WWWWWW    ],     10694,      223.3,     2.57ms average wait
    163, MARTIN    , WAIT, SQL*Net message to client                                 ,             3,      .06us,      .0%, [          ],         2,        .04,      1.5us average wait
    163, MARTIN    , WAIT, SQL*Net message from client                               ,      14826292,   309.58ms,    31.0%, [WWWW      ],         2,        .04,      7.41s average wait

--  End of Stats snap 1, end=2015-05-11 10:13:13, seconds=47.9

Don’t worry about the wide output; all you need are the STATISTIC and DELTA columns. This can all be a bit overwhelming at first, so let me guide you through it. Statistics that are not important for the discussion have already been removed.

First of all you see the physical reads. 10,752 I/O requests were issued, all of them multi-block requests (physical read total IO requests and physical read total multi block requests, repeated also in physical read IO requests), adding up to roughly 10.5 GB. Hmm, 10 GB, isn’t that 50% of my buffer cache? It might actually be possible that 10 GB of the 20 GB table were read from disk using direct path reads, and the other 10 GB came from memory.

What is interesting is the next set of counters, “data warehousing scanned %”, which I had not seen until now. In fact, if I run @TanelPoder’s statn.sql script in my session I can see there are more of them than appear in the snapper output:

SQL> @statn.sql warehousing

     STAT# HEX#      OFFSET NAME                                                                  VALUE
---------- ----- ---------- ---------------------------------------------------------------- ----------
       237 ED          1896 data warehousing scanned objects                                          5
       238 EE          1904 data warehousing scanned blocks                                    13333340
       239 EF          1912 data warehousing scanned blocks - memory                            6462815
       240 F0          1920 data warehousing scanned blocks - flash                                   0
       241 F1          1928 data warehousing scanned blocks - disk                              6870525
       242 F2          1936 data warehousing scanned blocks - offload                                 0
       243 F3          1944 data warehousing evicted objects                                          0
       244 F4          1952 data warehousing evicted objects - cooling                                0
       245 F5          1960 data warehousing evicted objects - replace                                0
       246 F6          1968 data warehousing cooling action                                           0

10 rows selected.

Interesting! I’m quite glad Oracle gave us so many statistics that describe the way big table caching works. I have queried T1 five times in my session, scanning 13,333,340 blocks in the process. The distribution between memory and disk is 6,462,815 to 6,870,525 blocks (the two add up to the total scanned), almost 50:50. It looks as if you can use flash and offloading for this as well (note to self: repeat the test on Exadata). That makes sense given the example just shown: the segment to be scanned is 20 GB, out of which 10 GB are in the buffer cache. If the rest of the segment can be scanned using direct path reads, as in the example above, then it is more than likely that the scan can be offloaded as well.

I can even see if there was space pressure on the big table cache: the data warehousing evicted% statistics hint at space management.

Purely from memory?

Looking at the session counters above, my table is too big. I would like to see more scans entirely from memory :) So I created a table named T2 with 25% of the size of T1, using the sample clause. Initial scans against T2 were served from disk only (policy = DISK):

SQL> r
  1* select * from v$bt_scan_obj_temps

       TS#   DATAOBJ# SIZE_IN_BLKS TEMPERATURE POLICY     CACHED_IN_MEM     CON_ID
---------- ---------- ------------ ----------- ---------- ------------- ----------
         4      39461      2669763        8000 MEM_PART         1293081          0
         4      39465       668107        5000 DISK                   0          0

But after the 9th scan (each scan increased the temperature by 1000) the situation changed:

SQL> r
  1* select * from v$bt_scan_obj_temps

       TS#   DATAOBJ# SIZE_IN_BLKS TEMPERATURE POLICY     CACHED_IN_MEM     CON_ID
---------- ---------- ------------ ----------- ---------- ------------- ----------
         4      39465       668107        9000 MEM_ONLY          668107          0
         4      39461      2669763        8000 MEM_PART          624974          0

The smaller table now fits into memory and took over! The result is visible in the execution time:

SQL> r
  1* select min(date_created) from t2

MIN(DATE_CREATED)
-----------------
20140513 00:00:00

Elapsed: 00:00:05.39

...

SQL> r
  1* select min(date_created) from t2

MIN(DATE_CREATED)
-----------------
20140513 00:00:00

Elapsed: 00:00:01.25

So to prove the point I reconnected to the database and ran my test again:


SQL> select min(date_created) from t2;

MIN(DATE_CREATED)
-----------------
20140513 00:00:00

Elapsed: 00:00:01.36

SQL> @statn warehousing

     STAT# HEX#      OFFSET NAME                                                                  VALUE
---------- ----- ---------- ---------------------------------------------------------------- ----------
       237 ED          1896 data warehousing scanned objects                                          1
       238 EE          1904 data warehousing scanned blocks                                      666966
       239 EF          1912 data warehousing scanned blocks - memory                             666966
       240 F0          1920 data warehousing scanned blocks - flash                                   0
       241 F1          1928 data warehousing scanned blocks - disk                                    0
       242 F2          1936 data warehousing scanned blocks - offload                                 0
       243 F3          1944 data warehousing evicted objects                                          0
       244 F4          1952 data warehousing evicted objects - cooling                                0
       245 F5          1960 data warehousing evicted objects - replace                                0
       246 F6          1968 data warehousing cooling action                                           0

10 rows selected.

So the popularity of the object plays a big role in working out which segment to cache as well. This is an interesting feature worth testing in your development environments. I have no idea whether or not it requires a separate license, so please make sure you check with Oracle before using it to avoid surprises.


Exadata Fast Data File Creation

Posted by Martin Bach on May 11, 2015

This post is the result of some testing I performed on Exadata data file creation. You may know that Exadata offloading incorporates SQL optimisations as well as some infrastructure work. For quite some time Exadata has allowed the DBA to create data files a lot quicker than on traditional systems. This has been documented before, for example by Martin Nash (@mpnsh) on his blog.

The final comment on his blog entry was a remark that data file creation is quite fast, but that is not true for online redo logs. Especially in environments where you duplicate production to a lower tier environment you have to wait for the online redo logs (including all members across all threads) to be zeroed. This is no longer an issue with Fast Data File Creation. If your system is configured to use Write-Back Flash Cache (WBFC from now on) and you are on Exadata 11.2.3.3 then you can benefit from super-fast file creation, including online redo logs. Here is an example, taken from a SQL trace:

TKPROF: Release 12.1.0.2.0 
...
Trace file: /u01/app/oracle/diag/rdbms/mbach/MBACH1/trace/MBACH1_ora_124411.trc
Sort options: default

********************************************************************************
count    = number of times OCI procedure was executed
cpu      = cpu time in seconds executing
elapsed  = elapsed time in seconds executing
disk     = number of physical reads of buffers from disk
query    = number of buffers gotten for consistent read
current  = number of buffers gotten in current mode (usually for update)
rows     = number of rows processed by the fetch or execute call
********************************************************************************

SQL ID: fvt6psf2t3cdz Plan Hash: 0

alter database add logfile thread 2 group 5 ('+DATA','+RECO') size 4096m


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.37       5.09          0          0          2           0
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.38       5.09          0          0          2           0

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  Disk file Mirror Read                           2        0.00          0.00
  control file sequential read                   23        0.00          0.01
  KSV master wait                                19        0.04          0.09
  ASM file metadata operation                     6        0.00          0.00
  CSS initialization                              1        0.00          0.00
  CSS operation: query                            6        0.00          0.00
  CSS operation: action                           2        0.00          0.00
  kfk: async disk IO                              2        0.00          0.00
  Disk file operations I/O                        3        0.00          0.00
  cell smart file creation                     3258        0.13          4.64
  log file single write                           2        0.00          0.00
  control file parallel write                    10        0.07          0.07
  log file sync                                   1        0.00          0.00
  SQL*Net message to client                       1        0.00          0.00
  SQL*Net message from client                     1       17.61         17.61
********************************************************************************

I have tried to replicate the command issued in Martin Nash’s blog post. If memory serves me right he was waiting for 37 seconds for this command to complete. The new method takes about 5 seconds, not bad! Interestingly there is no new wait event: the one we know (cell smart file creation) is used to indicate the work that has been done. With Fast Data File Creation only the metadata about the new file is persisted in the WBFC; the actual formatting has not happened yet when the prompt returns.
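For reference, capturing a trace like the one above can be done with DBMS_MONITOR; a minimal sketch (the tracefile identifier is only there to make the file easier to find):

alter session set tracefile_identifier = 'fastfile';
exec dbms_monitor.session_trace_enable(waits => true, binds => false)
alter database add logfile thread 2 group 5 ('+DATA','+RECO') size 4096m;
exec dbms_monitor.session_trace_disable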

Session Counters?

I have spent _a lot_ of time on Exadata-related session counters in 12.1.0.2 and earlier releases. I think they are fascinating, and since the wait events do not really show me what happened during the execution of the statement, I used the session counters instead. I am using Adrian Billington’s mystats tool for this:

SQL> @scripts/mystats start

SQL> alter database add logfile thread 2 group 5 ('+DATA','+RECO') size 4096m;

Database altered.

Elapsed: 00:00:04.87
SQL> @scripts/mystats stop t=1
...
STAT    cell flash cache read hits                                                      20
STAT    cell logical write IO requests                                                  14
STAT    cell overwrites in flash cache                                                  30
STAT    cell physical IO bytes eligible for predicate offload               17,179,869,184
STAT    cell physical IO bytes saved during optimized file creation         17,179,869,184
STAT    cell physical IO interconnect bytes                                      1,361,920
STAT    cell writes to flash cache                                                      30
...
STAT    physical write requests optimized                                               10
STAT    physical write total IO requests                                             4,126
STAT    physical write total bytes                                          17,180,197,888
STAT    physical write total bytes optimized                                       163,840
STAT    physical write total multi block requests                                    4,096

Let’s begin with the cell%-statistics. These indicate what happened on the storage layer. The command I executed was to create an online redo log group in thread 2, in disk groups DATA and RECO. Each file is approximately 4 GB in size.

SQL> select f.member, l.bytes/power(1024,2) m 
  2  from v$log l, v$logfile f
  3  where l.group# = f.group#
  4  and l.group# = 5;

MEMBER                                                                M
------------------------------------------------------------ ----------
+DATA/MBACH/ONLINELOG/group_5.454.879407723                        4096
+RECO/MBACH/ONLINELOG/group_5.1229.879407725                       4096

2 rows selected.

So why does the database report 17,179,869,184 “bytes saved during optimized file creation”, and nearly the same number again in “physical write total bytes”? The answer is ASM mirroring: in this system DATA and RECO are protected using ASM normal redundancy, doubling the writes.
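The arithmetic checks out: 2 members of 4096 MB each, written twice due to normal redundancy, is 16 GiB, or exactly 17,179,869,184 bytes. The redundancy level is easy to confirm; a sketch:

-- TYPE = NORMAL means each extent is mirrored, doubling the physical writes
select name, type from v$asm_diskgroup where name in ('DATA', 'RECO');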

What I found amusing is that the number of bytes saved during optimized file creation is also reported as “eligible for predicate offload”. Until quite recently I only associated Smart Scans with this statistic counter.

You can also see a few writes (and overwrites) to (Write Back) Flash Cache.
