Teaching is one of the things I like doing, and currently I am investigating the Oracle 12c features around caching data in the various memory areas. Since the In-Memory (cost) option has been discussed by other, far more knowledgeable people, I would like to share some findings about big table caching here.
Some Background
In Oracle 12c you have two additional options to cache information: full database caching and big table caching. The first is great if you have a massively big machine with lots and lots of DRAM plus a clever OS that can deal with the inevitable ccNUMA setup you will have to tackle, and perhaps don’t want to pay for the In-Memory option. This post is not about full database caching though, but rather about the other new possibility to cache blocks.
This other option is to cache just a few tables in an area within the buffer cache. That’s right: after the keep and recycle pools you can now tweak the default buffer cache as well. In fact you tell Oracle, by means of an initialisation parameter, how much of the default buffer cache can be used to cache full scans. The range goes from zero percent (the default, i.e. no caching of full scans) up to 90%; at least 10% must always remain available for OLTP workloads. I haven’t really investigated the pre-12c caching mechanism in detail (hey, it works!) but I have read that full scans are not cached by default to avoid thrashing the cache.
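As a quick sketch (on a test instance, please): the parameter in question is db_big_table_cache_percent_target, and as far as I can tell it is dynamic, so it should be changeable without a restart. Treat the statements below as an illustration rather than a recommendation for your system:

```sql
-- Sketch: dedicate 50% of the default buffer cache to big table caching.
-- The parameter appears to be dynamic (ALTER SYSTEM, no restart).
alter system set db_big_table_cache_percent_target = 50 scope = both;

-- Revert to the default behaviour (no big table caching):
alter system set db_big_table_cache_percent_target = 0 scope = both;
```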
Big Table Caching
Now what? We can cache full scans too. Interesting, let’s try it. I am using 12.1.0.2.2 on an ODA, but that shouldn’t really matter. The SGA is 24 GB in size, and the segment to be scanned (a non-partitioned table without indexes) is about 20 GB in size. I want 50% of the buffer cache allocated for big table caching.
SQL> select component, current_size/power(1024,2) size_mb
  2  from v$sga_dynamic_components where current_size <> 0;

COMPONENT                                                           SIZE_MB
---------------------------------------------------------------- ----------
shared pool                                                            2688
large pool                                                              192
java pool                                                               128
DEFAULT buffer cache                                                  20992
Shared IO Pool                                                          512

SQL> show parameter db_big_table_cache_percent_target

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_big_table_cache_percent_target    string      50

SQL> select * from v$bt_scan_cache;

BT_CACHE_ALLOC BT_CACHE_TARGET OBJECT_COUNT MEMORY_BUF_ALLOC MIN_CACHED_TEMP     CON_ID
-------------- --------------- ------------ ---------------- --------------- ----------
    .500334108              50            0                0            1000          0

SQL> select bytes/power(1024,2) m from dba_segments
  2  where owner = 'MARTIN' and segment_name = 'T1';

         M
----------
     20864
Now let’s see if we can make use of this. 50% of the roughly 21 GB buffer cache leaves about 10 GB usable for the scan cache. If I start a query against T1 in another session I can see the object count increase.
SQL> r
  1* select * from v$bt_scan_cache

BT_CACHE_ALLOC BT_CACHE_TARGET OBJECT_COUNT MEMORY_BUF_ALLOC MIN_CACHED_TEMP     CON_ID
-------------- --------------- ------------ ---------------- --------------- ----------
    .500334108              50            1          1292363            1000          0
Subsequent queries against T1 will increase the temperature and potentially the amount of cached information. The “object temperature” is the method Oracle uses to determine an object’s suitability for caching. In my very basic example there is only one table that has been full-scanned so far; later on I’ll add T2 to the mix. The temperature and other object information are reflected in the second view, v$bt_scan_obj_temps:
SQL> select * from v$bt_scan_obj_temps;

       TS#   DATAOBJ# SIZE_IN_BLKS TEMPERATURE POLICY     CACHED_IN_MEM     CON_ID
---------- ---------- ------------ ----------- ---------- ------------- ----------
         4      39461      2669763        4000 MEM_PART         1292711          0
Session Counters!
I have developed a habit of looking into session counters when running queries to see if there is anything of interest. Using snapper with the before/after snapshot I got this:
SQL> @snapper4 all,end 5 1 163
Sampling SID 163 with interval 5 seconds, taking 1 snapshots...

-- Session Snapper v4.11 BETA - by Tanel Poder ( http://blog.tanelpoder.com ) - Enjoy the Most Advanced Oracle Troubleshooting Script on the Planet! :)

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    SID, USERNAME  , TYPE, STATISTIC                                                 ,         DELTA, HDELTA/SEC,    %TIME, GRAPH       , NUM_WAITS,  WAITS/SEC,   AVERAGES
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
...
    163, MARTIN    , STAT, user calls                                                ,             3,        .06,         ,             ,          ,           ,          3 per execution
...
    163, MARTIN    , STAT, physical read total IO requests                           ,         10752,     224.51,         ,             ,          ,           ,     10.75k per execution
    163, MARTIN    , STAT, physical read total multi block requests                  ,         10752,     224.51,         ,             ,          ,           ,     10.75k per execution
    163, MARTIN    , STAT, physical read total bytes                                 ,   11262763008,    235.17M,         ,             ,          ,           ,     11.26G per execution
...
    163, MARTIN    , STAT, consistent gets                                           ,       2666722,     55.68k,         ,             ,          ,           ,      2.67M per execution
    163, MARTIN    , STAT, consistent gets from cache                                ,       1292769,     26.99k,         ,             ,          ,           ,      1.29M per execution
    163, MARTIN    , STAT, consistent gets pin                                       ,       1292769,     26.99k,         ,             ,          ,           ,      1.29M per execution
    163, MARTIN    , STAT, consistent gets pin (fastpath)                            ,       1292768,     26.99k,         ,             ,          ,           ,      1.29M per execution
    163, MARTIN    , STAT, consistent gets direct                                    ,       1373953,     28.69k,         ,             ,          ,           ,      1.37M per execution
...
    163, MARTIN    , STAT, physical reads                                            ,       1374849,     28.71k,         ,             ,          ,           ,      1.37M per execution
    163, MARTIN    , STAT, physical reads cache                                      ,           128,       2.67,         ,             ,          ,           ,        128 per execution
    163, MARTIN    , STAT, physical reads direct                                     ,       1374721,     28.71k,         ,             ,          ,           ,      1.37M per execution
    163, MARTIN    , STAT, physical read IO requests                                 ,         10752,     224.51,         ,             ,          ,           ,      1.05M bytes per request
    163, MARTIN    , STAT, physical read bytes                                       ,   11262763008,    235.17M,         ,             ,          ,           ,     11.26G per execution
...
    163, MARTIN    , STAT, data warehousing scanned objects                          ,             1,        .02,         ,             ,          ,           ,          1 per execution
    163, MARTIN    , STAT, data warehousing scanned blocks                           ,       2666668,     55.68k,         ,             ,          ,           ,      2.67M per execution
    163, MARTIN    , STAT, data warehousing scanned blocks - memory                  ,       1292715,     26.99k,         ,             ,          ,           ,      1.29M per execution
    163, MARTIN    , STAT, data warehousing scanned blocks - disk                    ,       1373953,     28.69k,         ,             ,          ,           ,      1.37M per execution
...
    163, MARTIN    , STAT, table scans (short tables)                                ,             1,        .02,         ,             ,          ,           ,          1 per execution
    163, MARTIN    , STAT, table scan rows gotten                                    ,      16000006,    334.09k,         ,             ,          ,           ,        16M per execution
    163, MARTIN    , STAT, table scan disk non-IMC rows gotten                       ,      16000006,    334.09k,         ,             ,          ,           ,        16M per execution
    163, MARTIN    , STAT, table scan blocks gotten                                  ,       2666668,     55.68k,         ,             ,          ,           ,      2.67M per execution
...
    163, MARTIN    , STAT, execute count                                             ,             1,        .02,         ,             ,          ,           ,          1 executions per parse
...
    163, MARTIN    , TIME, parse time elapsed                                        ,            26,      .54us,      .0%, [          ],          ,           ,
    163, MARTIN    , TIME, DB CPU                                                    ,       5682136,   118.65ms,    11.9%, [@@        ],          ,           ,
    163, MARTIN    , TIME, sql execute elapsed time                                  ,      33220099,   693.66ms,    69.4%, [#######   ],          ,           ,
    163, MARTIN    , TIME, DB time                                                   ,      33220306,   693.66ms,    69.4%, [#######   ],          ,           ,       -.32 % unaccounted time
    163, MARTIN    , WAIT, db file scattered read                                    ,         24636,   514.41us,      .1%, [          ],         1,        .02,    24.64ms average wait
    163, MARTIN    , WAIT, direct path read                                          ,      27536664,   574.98ms,    57.5%, [WWWWWW    ],     10694,      223.3,     2.57ms average wait
    163, MARTIN    , WAIT, SQL*Net message to client                                 ,             3,      .06us,      .0%, [          ],         2,        .04,      1.5us average wait
    163, MARTIN    , WAIT, SQL*Net message from client                               ,      14826292,   309.58ms,    31.0%, [WWWW      ],         2,        .04,      7.41s average wait

--  End of Stats snap 1,  end=2015-05-11 10:13:13,  seconds=47.9
Don’t worry about the wide output: all you need are the STATISTIC and DELTA columns. This can all be a bit overwhelming at first, so let me guide you through it. Statistics that are not important for the discussion have already been removed.
First of all you see the physical reads. 10752 I/O requests were issued, all of them multi-block requests (physical read total IO requests and physical read total multi block requests, repeated in physical read IO requests). They account for 11262763008 bytes, or roughly 10.5 GB. Hmm – isn’t that about 50% of my buffer cache? It appears that 10 GB of the 20 GB table were read from disk using direct path reads, and the other 10 GB came from memory.
What is interesting is the next set of counters: the “data warehousing scanned %” statistics, which I hadn’t come across before. In fact, if I run @TanelPoder’s statn.sql script in my session I can see there are more of them than appear in the snapper output:
SQL> @statn.sql warehousing

     STAT#  HEX#     OFFSET NAME                                                                  VALUE
---------- ----- ---------- ---------------------------------------------------------------- ----------
       237    ED       1896 data warehousing scanned objects                                          5
       238    EE       1904 data warehousing scanned blocks                                    13333340
       239    EF       1912 data warehousing scanned blocks - memory                            6462815
       240    F0       1920 data warehousing scanned blocks - flash                                   0
       241    F1       1928 data warehousing scanned blocks - disk                              6870525
       242    F2       1936 data warehousing scanned blocks - offload                                 0
       243    F3       1944 data warehousing evicted objects                                          0
       244    F4       1952 data warehousing evicted objects - cooling                                0
       245    F5       1960 data warehousing evicted objects - replace                                0
       246    F6       1968 data warehousing cooling action                                           0

10 rows selected.
Interesting! I’m quite glad Oracle gave us so many statistics describing the way big table caching works. I have queried T1 5 times in my session, scanning 13333340 blocks in the process. The distribution between memory and disk is 6462815 to 6870525, almost 50:50. It also looks as though flash and offloading can play a part (note to self: repeat the test on Exadata). That makes sense given the example just shown: the segment to be scanned is 20 GB, of which 10 GB are in the buffer cache. If the rest of the segment can be scanned using direct path reads, as in the example above, then it is more than likely that the scan can be offloaded as well.
I can even see whether there was space pressure on the big table cache: the “data warehousing evicted %” statistics hint at the space management going on.
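The deltas above came out of snapper, but if you don’t have it to hand, a minimal by-hand alternative (a sketch, assuming you can query v$mystat and v$statname in your own session) is to run a simple join before and after the test query and compare the values:

```sql
-- Sketch: snapshot the current session's counters around a test query.
-- Run once before and once after the scan; the differences are the deltas
-- snapper would report.
select sn.name, ms.value
from   v$statname sn
join   v$mystat   ms on ms.statistic# = sn.statistic#
where  sn.name like 'physical read%'
   or  sn.name like 'data warehousing%'
order  by sn.name;
```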
Purely from memory?
Looking at the session counters above, my table is simply too big. I would like to see more scans served entirely from memory :) So, using the sample clause, I created a table called T2 with about 25% of the size of T1. Initial scans against T2 were served from disk only (POLICY = DISK):
SQL> r
  1* select * from v$bt_scan_obj_temps

       TS#   DATAOBJ# SIZE_IN_BLKS TEMPERATURE POLICY     CACHED_IN_MEM     CON_ID
---------- ---------- ------------ ----------- ---------- ------------- ----------
         4      39461      2669763        8000 MEM_PART         1293081          0
         4      39465       668107        5000 DISK                   0          0
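For completeness, the exact DDL for T2 wasn’t shown above; a sketch of how such a table could be created with the sample clause (sampling roughly 25% of T1’s rows) looks like this:

```sql
-- Sketch: create T2 as a ~25% sample of T1. SAMPLE takes a percentage
-- and samples rows, so the resulting size is only approximately 25%.
create table t2 as select * from t1 sample (25);
```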
But after the 9th scan (each scan increased the temperature by 1000) the situation changed:
SQL> r
  1* select * from v$bt_scan_obj_temps

       TS#   DATAOBJ# SIZE_IN_BLKS TEMPERATURE POLICY     CACHED_IN_MEM     CON_ID
---------- ---------- ------------ ----------- ---------- ------------- ----------
         4      39465       668107        9000 MEM_ONLY          668107          0
         4      39461      2669763        8000 MEM_PART          624974          0
The smaller table now fits into memory entirely and has taken over! The result is visible in the execution time:
SQL> r
  1* select min(date_created) from t2

MIN(DATE_CREATED)
-----------------
20140513 00:00:00

Elapsed: 00:00:05.39

...

SQL> r
  1* select min(date_created) from t2

MIN(DATE_CREATED)
-----------------
20140513 00:00:00

Elapsed: 00:00:01.25
So to prove the point I reconnected to the database and ran my test again:
SQL> select min(date_created) from t2;

MIN(DATE_CREATED)
-----------------
20140513 00:00:00

Elapsed: 00:00:01.36

SQL> @statn warehousing

     STAT#  HEX#     OFFSET NAME                                                                  VALUE
---------- ----- ---------- ---------------------------------------------------------------- ----------
       237    ED       1896 data warehousing scanned objects                                          1
       238    EE       1904 data warehousing scanned blocks                                      666966
       239    EF       1912 data warehousing scanned blocks - memory                              666966
       240    F0       1920 data warehousing scanned blocks - flash                                   0
       241    F1       1928 data warehousing scanned blocks - disk                                    0
       242    F2       1936 data warehousing scanned blocks - offload                                 0
       243    F3       1944 data warehousing evicted objects                                          0
       244    F4       1952 data warehousing evicted objects - cooling                                0
       245    F5       1960 data warehousing evicted objects - replace                                0
       246    F6       1968 data warehousing cooling action                                           0

10 rows selected.
So the popularity of an object plays a big role in working out which segment to cache as well. This is an interesting feature worth testing in your development environments. I have no idea whether it requires an extra license, so please make sure you check with Oracle before using it to avoid surprises.
Great – I love it when other people tackle the items on my to-do list :)
Glad I could be of help! Are you around at the systems event later this month? I would really like to catch up properly; last time was too short.
Martin,
Checking the table scan stats you have one “table scans (short tables)”: the block counts suggest that this LONG table is now considered a short table. Was the “table scans (long tables)” stat not incremented? How about “table scans (direct read)” and “table scans (rowid ranges)”?
If it is now a “short table”, could you also check v$segstat/v$segment_statistics to see if the scan was counted there? Historically a “segment scan” is only recorded for a “long” tablescan; “short” ones don’t count.
Will check as soon as I can!
Had a bit of a YouTube trawl yesterday evening and guess what:
https://youtu.be/-dU69ni8pMQ (big table cache)
https://youtu.be/jLtdDPmb1Ws (full database in memory cache)
Nevertheless, good to see the figures supported via snapper so far
;-)
Thanks Marco! I noticed some other blog posts about this as well, but wasn’t aware of the YouTube videos. Maybe I have to improve my search skills! And wow, after just a very quick look I noticed a query against x$kcbwds in the full caching video!
Pingback: Instance stats | Oracle Scratchpad
Hi Martin,
reminds me of Nikolay Kovachev’s algorithm research and RAC.
1) http://progeeking.com/2014/02/25/the-big-q-direct-path-and-cell-offloading/
2) http://progeeking.com/2014/10/03/rac-12c-direct-path-reads/
Regards
Stefan
So many details of configuration that “interfere” with each other and require you to update the nice easy observation/conclusion notes from yesteryear.