While writing the previous two posts about GTTs and Smart Scan, I stumbled across an interesting observation. When I started putting my thoughts to (virtual) paper, I thought it was going to be a short post. Well, it wasn’t to be, again. Anyway, you might find this interesting.
Initial Situation
If you read the previous posts this code example I used to populate the GTT might look familiar:
insert /*+ append */ into gtt select * from t4 where rownum < 400000; commit;
In situations like this where I’m moving data I developed a habit of using the append hint. I guess I’ve been working on HCC tables a bit too much and therefore didn’t even make a conscious decision to use that hint. It was the right thing to do, as you will see next.
The full test harness is shown here again for reference:
SQL> !cat test.sql set lines 120 tab off trimspool on verify off timing on -- this statement is using the /*+ append */ hint insert /*+ append using_append */ into gtt select * from t4 where rownum < 400000; commit; -- not testing if I can get Smart Scans, I know there are :) alter session set "_serial_direct_read" = always; -- make sure the query can be ever so slightly more complex and always return rows col min_id new_v min_id col max_id new_v max_id select min(id) min_id from gtt; select max(id) max_id from gtt; -- this is the actual test @mystats start s=s set echo on select /* using_append */ count(*), id from gtt where id in (&min_id, &max_id) group by id; set echo off @mystats stop t=1 SQL>
When executing the script and looking at the output I didn’t see anything out of the ordinary. When checking for the “depth” of the Smart Scan, I regularly consult the cell blocks processed by % layer statistics to see if I lose blocks between code layers. It wouldn’t appear so:
------------------------------------------------------------------------------------------ 2. Statistics Report ------------------------------------------------------------------------------------------ Type Statistic Name Value ------ ---------------------------------------------------------------- ---------------- ... STAT cell IO uncompressed bytes 546,136,064 STAT cell blocks processed by cache layer 66,667 STAT cell blocks processed by data layer 66,667 STAT cell blocks processed by txn layer 66,667 ... STAT cell physical IO bytes eligible for predicate offload 546,136,064 STAT cell physical IO interconnect bytes 112,504 STAT cell physical IO interconnect bytes returned by smart scan 96,120 ... STAT physical read IO requests 523 STAT physical read bytes 546,152,448 STAT physical read requests optimized 317 STAT physical read total IO requests 523 STAT physical read total bytes 546,152,448 STAT physical read total bytes optimized 330,153,984 STAT physical read total multi block requests 521 STAT physical reads 66,669 STAT physical reads cache 2 STAT physical reads direct 66,667 ... STAT table scans (direct read) 1 STAT table scans (long tables) 1
66,667 blocks in each code layer. This looks ok to me: the number of all physical reads recorded for this statement is 66,669. I certainly don’t want to endorse tuning by ratio, but in this case it sort-of makes sense: the offload efficiency pretty high in my case:
SQL> @fsx.sql Enter value for sql_text: Enter value for sql_id: a4jrkbnsfgk1j SQL_ID CHILD PLAN_HASH EXECS AVG_ETIME AVG_PX OFFLOAD IO_SAVED_% SQL_TEXT ------------- ------ ---------- ------ ---------- ------ ------- ---------- ---------------------------------------------------------------------- a4jrkbnsfgk1j 0 2363333961 1 .01 0 Yes 100.00 select /* using_append */ count(*), id from gtt where id in ( 7
That should suffice for a recap of the previous posts.
No append hint-different outcome
Now imagine my surprise when I didn’t execute a script but rather typed the insert command in an interactive session. In the next example, created to reproduce the observation, I “forgot” to specify the append hint:
SQL> !cat test2.sql -- not using the append hint insert /* not_using_append */ into gtt select * from t4 where rownum < 400000; commit; -- not testing if I can get Smart Scans, I know there are :) alter session set "_serial_direct_read" = always; -- make sure the query can be ever so slightly more complex and always return rows col min_id new_v min_id col max_id new_v max_id select min(id) min_id from gtt; select max(id) max_id from gtt; -- this is the actual test @mystats start s=s set echo on select /* not_using_append */ count(*), id from gtt where id in (&min_id, &max_id) group by id; set echo off @mystats stop t=1
The difference between test.sql and test2.sql looks harmless enough – just a hint, or lack thereof. However the result is interesting. Most executions of the above script resulted in Oracle reporting the following statistics:
------------------------------------------------------------------------------------------ 2. Statistics Report ------------------------------------------------------------------------------------------ Type Statistic Name Value ------ ---------------------------------------------------------------- ---------------- ... STAT calls to get snapshot scn: kcmgss 462 STAT calls to kcmgcs 28 STAT cell IO uncompressed bytes 544,800,768 STAT cell blocks processed by cache layer 67,026 STAT cell blocks processed by data layer 1 STAT cell blocks processed by txn layer 1 STAT cell commit cache queries 67,025 STAT cell flash cache read hits 633 STAT cell num smartio automem buffer allocation attempts 1 STAT cell physical IO bytes eligible for predicate offload 544,800,768 STAT cell physical IO interconnect bytes 545,000,688 STAT cell physical IO interconnect bytes returned by smart scan 544,984,304 STAT cell scans 1 STAT cleanout - number of ktugct calls 66,503 STAT cleanouts only - consistent read gets 66,503 STAT commit txn count during cleanout 66,503 ... STAT consistent gets 133,250 STAT consistent gets direct 66,504 STAT consistent gets examination 66,506 STAT consistent gets examination (fastpath) 66,506 ... STAT immediate (CR) block cleanout applications 66,503 ... STAT physical read IO requests 1,044 STAT physical read bytes 544,817,152 STAT physical read requests optimized 633 STAT physical read total IO requests 1,044 STAT physical read total bytes 544,817,152 STAT physical read total bytes optimized 329,859,072 STAT physical read total multi block requests 748 STAT physical reads 66,506 STAT physical reads cache 2 STAT physical reads direct 66,504 ...
Huh? I was clearly surprised to see so many rows not making it to the higher layers (txn and data) after having been read by the cache layer. Further executions of the select statement within the same session do not necessarily improve this significantly, but I noticed savings by storage index to come into play.
Speaking more scientifically, the listing immediately above shows that out of 67,026 blocks that were read by the cache layer only 1 (!) was processed by txn and data layers. As a direct result Smart Scan does not offer any real benefit: cell physical IO interconnect bytes returned by smart scan is actually ≥ cell physical IO bytes eligible for predicate offload.
Why? What?
There were a few red herrings in the statistic counters I saw:
- Cleanout statistics have been reported by mystats
- commit txn count during cleanout
- cleanouts only – consistent read gets
- immediate (CR) block cleanout applications
- etc.
- There are consistent gets examination that might be peeks at undo information
- Lots of queries to the commit cache (cell commit cache queries) – but for most of my tests I didn’t see replies. Only 1 out of maybe 20 tests produced some cell transactions found in commit cache, but even then their number was not significant. I will provide an example towards the end of this post.
Further investigation
So this leads me to suspect that the blocks that were created with the append hint are “clean”, e.g. suitable for a consistent read on the cells without further need for processing. It would appear so, judging by this block dump (dump the block from the tempfile, not datafile!):
Start dump data blocks tsn: 3 file#:1 minblk 2305025 maxblk 2305025 ... Block dump from disk: buffer tsn: 3 rdba: 0x00632c01 (1/2305025) scn: 0x0.64aaeb9f seq: 0x01 flg: 0x0c tail: 0xeb9f0601 frmt: 0x02 chkval: 0xd11e type: 0x06=trans data ... Block header dump: 0x00632c01 Object id on Block? Y seg/obj: 0x632c00 csc: 0x00.64aaeb9e itc: 3 flg: - typ: 1 - DATA fsl: 0 fnx: 0x0 ver: 0x01 Itl Xid Uba Flag Lck Scn/Fsc 0x01 0x000a.019.0012beca 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000 0x02 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000 0x03 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000 bdba: 0x00632c01 data_block_dump,data header at 0x7f6250501074 =============== tsiz: 0x1f88 hsiz: 0x1e pbl: 0x7f6250501074 76543210 flag=-------- ntab=1 nrow=6 frre=-1 fsbo=0x1e fseo=0x746 avsp=0x728 tosp=0x728 0xe:pti[0] nrow=6 offs=0 0x12:pri[0] offs=0x1b7d 0x14:pri[1] offs=0x1772 0x16:pri[2] offs=0x1367 0x18:pri[3] offs=0xf5c 0x1a:pri[4] offs=0xb51 0x1c:pri[5] offs=0x746 block_row_dump: tab 0, row 0, @0x1b7d tl: 1035 fb: --H-FL-- lb: 0x0 cc: 6 col 0: [ 2] c1 50 col 1: [999] 37 39 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 ... tl: 1035 fb: --H-FL-- lb: 0x0 cc: 6 ... tl: 1035 fb: --H-FL-- lb: 0x0 cc: 6 ...
This is in contrast to the blocks I dumped that have been created without the append hint. To make sure I’m comparing apples with apples these blocks have been created in a different session after having disconnected from the previous one:
Start dump data blocks tsn: 3 file#:1 minblk 2305025 maxblk 2305025 ... Block dump from disk: buffer tsn: 3 rdba: 0x00632c01 (1/2305025) scn: 0x0.64aab6bf seq: 0x00 flg: 0x0c tail: 0xb6bf0600 frmt: 0x02 chkval: 0x478e type: 0x06=trans data Hex dump of block: st=0, typ_found=1 ... Block header dump: 0x00632c01 Object id on Block? Y seg/obj: 0x629300 csc: 0x00.64aab6bf itc: 2 flg: - typ: 1 - DATA fsl: 0 fnx: 0x0 ver: 0x01 Itl Xid Uba Flag Lck Scn/Fsc 0x01 0x000a.01b.0012be5d 0x00c04427.d97f.27 ---- 6 fsc 0x0000.00000000 0x02 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000 bdba: 0x00632c01 data_block_dump,data header at 0x7f53b75e125c =============== tsiz: 0x1fa0 hsiz: 0x1e pbl: 0x7f53b75e125c 76543210 flag=-------- ntab=1 nrow=6 frre=-1 fsbo=0x1e fseo=0x752 avsp=0x734 tosp=0x734 0xe:pti[0] nrow=6 offs=0 0x12:pri[0] offs=0xf6c 0x14:pri[1] offs=0x1379 0x16:pri[2] offs=0x1786 0x18:pri[3] offs=0x1b93 0x1a:pri[4] offs=0x752 0x1c:pri[5] offs=0xb5f block_row_dump: tab 0, row 0, @0xf6c tl: 1037 fb: --H-FL-- lb: 0x1 cc: 6 col 0: [ 4] c3 0f 05 50 col 1: [999] 31 34 30 34 37 39 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 ... tl: 1037 fb: --H-FL-- lb: 0x1 cc: 6 ... tl: 1037 fb: --H-FL-- lb: 0x1 cc: 6 ...
This supports the theory that the block I created without the append hint features some information that prevents it from being passed from cache to txn layer. I didn’t see active txn count during cleanout though, but commit txn count during cleanout and immediate (CR) block cleanout applications instead. So cleanout is definitely needed and performed.
I stated in the second bullet point that commit cache queries were not producing cache-hits most of the time. I managed to get an execution of the select statement against the blocks created without append hint, where at least some commit cache queries produced hits:
------------------------------------------------------------------------------------------ 2. Statistics Report ------------------------------------------------------------------------------------------ Type Statistic Name Value ------ ---------------------------------------------------------------- ---------------- STAT cell IO uncompressed bytes 544,800,768 STAT cell blocks helped by commit cache 6,981 STAT cell blocks processed by cache layer 66,969 STAT cell blocks processed by data layer 6,982 STAT cell blocks processed by txn layer 6,982 STAT cell commit cache queries 66,968 STAT cell flash cache read hits 576 STAT cell num smartio automem buffer allocation attempts 1 STAT cell physical IO bytes eligible for predicate offload 544,800,768 STAT cell physical IO interconnect bytes 487,801,976 STAT cell physical IO interconnect bytes returned by smart scan 487,785,592 STAT cell scans 1 STAT cell transactions found in commit cache 6,981 STAT cleanout - number of ktugct calls 59,522 STAT cleanouts only - consistent read gets 59,522 STAT commit txn count during cleanout 59,522 ... STAT physical read IO requests 987 STAT physical read bytes 544,817,152 STAT physical read requests optimized 576 STAT physical read total IO requests 987 STAT physical read total bytes 544,817,152 STAT physical read total bytes optimized 329,859,072 STAT physical read total multi block requests 723 STAT physical reads 66,506 STAT physical reads cache 2 STAT physical reads direct 66,504
I couldn’t reproduce this at will though, and it wasn’t for lack of trying.
Summary
I am not entirely sure what causes the observed behaviour, especially with regards to the commit cache and/or minscn optimisation but it evidently seems to be better to populate a GTT with the append hint if you want to make best use of Smart Scans against such segments. A very quick and dirty test against a heap table suggests that this is true for this segment types as well.