
TABLE ACCESS INMEMORY FULL – but there may be more

While preparing to teach a class this week I had the time to look into the In-Memory (cost) option a bit more closely. I noticed a few interesting things along the way and would like to share one of them here.

Background

One of the questions I was asking myself was:

“What happens if I scan segments that are within the IM area, and some are not?”

I was primarily thinking of joins in a DWH environment, but in order to keep the test case simple and reproducible I decided to go with a partitioned table where the current partition is assigned to the IMCS and the rest is not. For this test I'm relying on the Swingbench SH schema. All of this runs on my laptop in a VM, so I had to be selective when it comes to indexes. I also chose NOT to partition the tables at this stage; I wanted to choose my own partitioning scheme. For reference, here is the command that created the SH schema:

[oracle@oraclelinux7 bin]$ ./shwizard -cl -create -cs //localhost/sh_pdb \
> -dba system -dbap somePassword -nopart -pkindexes -scale 1 -tc 2 -u sh -p sh
...
============================================
|           Datagenerator Run Stats        |
============================================
Connection Time                        0:00:00.005
Data Generation Time                   0:04:35.890
DDL Creation Time                      0:00:14.063
Total Run Time                         0:04:49.962
Rows Inserted per sec                       64,707
Data Generated (MB) per sec                    5.3
Actual Rows Generated                   17,848,007

...

I wanted to create a range partitioned version of the sales table. After the SH wizard completed, the data distribution was as follows:

SQL> select count(*), trunc(time_id,'yy') year from sales group by trunc(time_id,'yy') order by year;

  COUNT(*) YEAR
---------- ---------
    130653 01-JAN-95
    149319 01-JAN-96
    251974 01-JAN-97
    326632 01-JAN-98
    365547 01-JAN-99
    388318 01-JAN-00
    393919 01-JAN-01
    406703 01-JAN-02
    483605 01-JAN-03
    566384 01-JAN-04
    422289 01-JAN-05
    619858 01-JAN-06
    782244 01-JAN-07
   1099551 01-JAN-08
   1249340 01-JAN-09
   1346025 01-JAN-10
   1690302 01-JAN-11
   2028048 01-JAN-12
   2028048 01-JAN-13

19 rows selected.

So based on this information I can create a table that has the data from 2013 in the IMCS, with everything else excluded from it. For the sake of the following discussion, 2013 is considered the "current" partition. Here is the partitioned sales table's DDL, modified to implement my partitioning scheme.

CREATE TABLE SALES_PART
  (
    PROD_ID            NUMBER NOT NULL ENABLE,
    CUST_ID            NUMBER NOT NULL ENABLE,
    TIME_ID            DATE NOT NULL ENABLE,
    CHANNEL_ID         NUMBER NOT NULL ENABLE,
    PROMO_ID           NUMBER NOT NULL ENABLE,
    QUANTITY_SOLD      NUMBER(10,2) NOT NULL ENABLE,
    SELLER             NUMBER(6,0) NOT NULL ENABLE,
    FULFILLMENT_CENTER NUMBER(6,0) NOT NULL ENABLE,
    COURIER_ORG        NUMBER(6,0) NOT NULL ENABLE,
    TAX_COUNTRY        VARCHAR2(3) NOT NULL ENABLE,
    TAX_REGION         VARCHAR2(3),
    AMOUNT_SOLD        NUMBER(10,2) NOT NULL ENABLE
  )
  partition BY range (time_id)
  (
    partition p_old VALUES less than       (DATE '2001-01-01'),
    partition p_2000_2010 VALUES less than (DATE '2011-01-01'),
    partition p_2011 VALUES less than      (DATE '2012-01-01'),
    partition p_2012 VALUES less than      (DATE '2013-01-01'),
    partition p_2013 VALUES less than      (DATE '2014-01-01') inmemory
  )
  TABLESPACE SH;

SQL> alter session enable parallel dml;

Session altered.

SQL> insert /*+ parallel append */ into sales_part select /*+ parallel */ * from sales
  2  /

14728759 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(user,'SALES_PART')

PL/SQL procedure successfully completed.

SQL> select partition_name, num_rows, inmemory
  2  from user_tab_partitions
  3  where table_name = 'SALES_PART'
  4  order by partition_position;

PARTITION_NAME                   NUM_ROWS INMEMORY
------------------------------ ---------- --------
P_OLD                             1612443 DISABLED
P_2000_2010                       7369918 DISABLED
P_2011                            1690302 DISABLED
P_2012                            2028048 DISABLED
P_2013                            2028048 ENABLED

That should give me what I needed. The IMCS was populated with the 2013 partition as well:

SQL> select segment_name, partition_name, bytes_not_populated, populate_status from v$im_user_segments;

SEGMENT_NAME         PARTITION_NAME                 BYTES_NOT_POPULATED POPULATE_
-------------------- ------------------------------ ------------------- ---------
SALES_PART           P_2013                                           0 COMPLETED
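
With the default INMEMORY priority of NONE, population of a segment is triggered by its first full scan. If you do not want to wait for that, you could request population explicitly; a minimal sketch, assuming the DBMS_INMEMORY package shipped with 12.1.0.2 and sufficient privileges:

begin
  -- ask the database to populate only the "current" partition into the IMCS
  dbms_inmemory.populate(
    schema_name    => 'SH',
    table_name     => 'SALES_PART',
    subobject_name => 'P_2013');
end;
/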

Test

With the setup work complete I am ready to test. First of all, a simple SQL trace should show me what is happening. Note that I am specifically targeting the “current” partition.

SQL> sho user
USER is "SH"
SQL> alter session set events 'sql_trace level 8';

Session altered.

SQL> select /* current_partition */ count(*) from sales_part
  2  where time_id > DATE '2013-01-02'
  3  and promo_id = 316;

  COUNT(*)
----------
      3947

SQL> alter session set events 'sql_trace off';

Session altered.

SQL> select value from v$diag_info where name like 'Def%';

VALUE
--------------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/cdb/CDB/trace/CDB_ora_14034.trc

The tkprof’d trace shows this result:

********************************************************************************

select /* current_partition */ count(*) from sales_part
where time_id > DATE '2013-01-02'
and promo_id = 316

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          2          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          0          3          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.01       0.01          0          5          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 117  
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=3 pr=0 pw=0 time=8441 us)
      3947       3947       3947   PARTITION RANGE SINGLE PARTITION: 5 5 (cr=3 pr=0 pw=0 time=7583 us cost=214 size=48252 card=4021)
      3947       3947       3947    TABLE ACCESS INMEMORY FULL SALES_PART PARTITION: 5 5 (cr=3 pr=0 pw=0 time=6972 us cost=214 size=48252 card=4021)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       2        0.00          0.00
  SQL*Net message from client                     2        6.79          6.79
********************************************************************************

As you can see, my partitioning strategy paid off: only the 5th partition is accessed (that's p_2013, or the "current" partition, based on the data). This partition is the one found entirely in the IMCS.

A result like the one above is what I'd expect, and the access path "TABLE ACCESS INMEMORY FULL" does it justice. Now to the actual reason for this blog post: mixing scans against segments in memory and on disk. Logging on again, I ran the same query but without the restriction to 2013.

SQL> alter session set events 'sql_trace level 8';

Session altered.

SQL> select /* every_partition */ count(*) from sales_part
  2  where promo_id = 316;

  COUNT(*)
----------
     28993

SQL> alter session set events 'sql_trace off';

Session altered.

SQL> select value from v$diag_info where name like 'Def%';

VALUE
----------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/cdb/CDB/trace/CDB_ora_14437.trc

The tkprof’d result is shown here:

select /* every_partition */ count(*) from sales_part
where promo_id = 316

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      3.15       3.84      94878      94901          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      3.15       3.84      94878      94901          0           1

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 117  
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=94901 pr=94878 pw=0 time=3848392 us)
     28993      28993      28993   PARTITION RANGE ALL PARTITION: 1 5 (cr=94901 pr=94878 pw=0 time=4503583 us cost=26217 size=117128 card=29282)
     28993      28993      28993    TABLE ACCESS INMEMORY FULL SALES_PART PARTITION: 1 5 (cr=94901 pr=94878 pw=0 time=4254867 us cost=26217 size=117128 card=29282)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       2        0.00          0.00
  Disk file operations I/O                        4        0.00          0.00
  db file sequential read                         6        0.00          0.01
  direct path read                               42        0.01          0.18
  SQL*Net message from client                     2        7.63          7.63
********************************************************************************

Again the access path is clearly indicated as "TABLE ACCESS INMEMORY FULL SALES_PART". This time it references all 5 partitions, which is expected since I didn't have a date in my predicate list. And I can see direct path reads plus some other I/O-related information! Those direct path reads are quite likely to turn into Smart Scans on Exadata, by the way…

Of course "TABLE ACCESS INMEMORY FULL" is correct (one partition is accessed that way), but the regular execution plan, as shown in SQL*Plus for example, does not reveal the disk-based I/O against the other segments.
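
The closest you can get from SQL*Plus itself is probably the per-plan-line runtime statistics that DBMS_XPLAN can report, the same technique used further down in this post. A minimal sketch, assuming the gather_plan_statistics hint (or statistics_level = all); the Buffers/Reads columns tell you that physical I/O happened, but the operation name still reads TABLE ACCESS INMEMORY FULL for all partitions:

select /*+ gather_plan_statistics */ count(*) from sales_part where promo_id = 316;

select * from table(dbms_xplan.display_cursor(format => 'allstats last'));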

Is there any other instrumentation I could use?

The tkprof output shows that data retrieval is possible for segments that are not (entirely) found in the IMCS. Where else could I learn about this fact? Session statistics (V$STATNAME, V$SESSTAT, etc.) are another useful source of information. I am a great fan of session snapper (who isn't?). Snapper can be used to display the change in session counters while the session you are monitoring is running. Another option is Adrian Billington's mystats script, which captures the changes in session statistics during the execution of a SQL command. I executed the same query again, sandwiched between calls to mystats. Note that the statistics shown next do not match those of the execution of the query above. I also tried to limit the output to the relevant statistics; the output is sorted by statistic name.
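If neither snapper nor mystats is at hand, you can approximate the same idea by snapshotting your own session counters before and after the statement and comparing the values. A minimal sketch, assuming you can query V$MYSTAT and V$STATNAME (the statistic names are those appearing in the report below):

-- run once before and once after the statement under test;
-- the difference between the two runs is what the statement consumed
select sn.name, ms.value
  from v$mystat ms
  join v$statname sn on (sn.statistic# = ms.statistic#)
 where sn.name like 'IM scan%'
    or sn.name like 'table scans%'
    or sn.name like 'physical reads%'
 order by sn.name;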


------------------------------------------------------------------------------------------
2. Statistics Report
------------------------------------------------------------------------------------------
Type    Statistic Name                                                               Value
------  ----------------------------------------------------------------  ----------------

STAT    IM scan CUs columns accessed                                                     4
STAT    IM scan CUs columns theoretical max                                             48
STAT    IM scan CUs memcompress for query low                                            4
STAT    IM scan CUs predicates applied                                                   4
STAT    IM scan CUs predicates received                                                  4
STAT    IM scan CUs split pieces                                                         4
STAT    IM scan bytes in-memory                                                 47,683,308
STAT    IM scan bytes uncompressed                                              99,118,124
STAT    IM scan rows                                                             2,028,048
STAT    IM scan rows projected                                                       3,963
STAT    IM scan rows valid                                                       2,028,048
STAT    IM scan segments minmax eligible                                                 4

STAT    consistent gets                                                             95,093
STAT    consistent gets direct                                                      94,872
STAT    consistent gets examination                                                      4
STAT    consistent gets examination (fastpath)                                           4
STAT    consistent gets from cache                                                     221
STAT    consistent gets pin                                                            217
STAT    consistent gets pin (fastpath)                                                 217

STAT    logical read bytes from cache                                            9,666,560
STAT    no work - consistent read gets                                              94,871

STAT    physical read IO requests                                                      749
STAT    physical read bytes                                                    777,191,424
STAT    physical read total IO requests                                                749
STAT    physical read total bytes                                              777,191,424
STAT    physical read total multi block requests                                       749
STAT    physical reads                                                              94,872
STAT    physical reads direct                                                       94,872

STAT    session logical reads                                                      111,201
STAT    session logical reads - IM                                                  15,149

STAT    table scan blocks gotten                                                    94,872
STAT    table scan disk non-IMC rows gotten                                     12,700,711
STAT    table scan rows gotten                                                  14,728,759
STAT    table scans (IM)                                                                 1
STAT    table scans (direct read)                                                        4
STAT    table scans (long tables)                                                        5

STAT    temp space allocated (bytes)                                             1,048,576

STAT    workarea executions - optimal                                                    1
STAT    workarea memory allocated                                                        5
------------------------------------------------------------------------------------------
3. About
------------------------------------------------------------------------------------------
- MyStats v2.01 by Adrian Billington (http://www.oracle-developer.net)
- Based on the SNAP_MY_STATS utility by Jonathan Lewis
==========================================================================================
End of report
==========================================================================================

The highlights are:

  • You can see the compression ratio of the data in the IMCS (IM scan bytes in-memory and IM scan bytes uncompressed)
  • All rows in the partition were valid (the number of IM Scan rows equals the number of rows in the partition)
  • One segment was scanned using IM, and 4 were scanned as direct (path) reads; a total of 5 segments were scanned.

Hope this helps!


Little things worth knowing: automatic generation of extended statistics in 12c Part II

In my last post I demonstrated how to use a PL/SQL package to monitor a workload in order to identify correlated columns. Such columns are good candidates for the creation of extended statistics, since the optimiser does not assume that information stored in different columns may be related. Before starting my tests I invoked DBMS_STATS.SEED_COL_USAGE, which in turn raised Oracle's attention level when it comes to finding correlated columns. Eventually these were identified (as shown in the output of DBMS_STATS.REPORT_COL_USAGE) and a subsequent call to DBMS_STATS.GATHER_TABLE_STATS caused extended statistics to be created, including histograms. This is one way to get extended statistics automatically, but it requires you to enable monitoring of a workload by invoking a PL/SQL API. In other words, it is a conscious effort. A condensed recap of that workflow is sketched below.
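
This is roughly what the Part I approach looked like (a sketch only; the 300-second monitoring window is an example value, and the details are in the previous post):

-- ask Oracle to record column usage for the next 5 minutes
exec dbms_stats.seed_col_usage(null, null, 300)

-- ... run the representative workload against the table ...

-- report what was detected, then gather stats to create the column group statistics
select dbms_stats.report_col_usage(user, 'CUSTOMERS') from dual;
exec dbms_stats.gather_table_stats(user, 'CUSTOMERS')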

More Ways to Get Extended Statistics

The other way of ending up with extended statistics is more subtle, as it does not require user intervention. As described in the documentation (Database SQL Tuning Guide, chapter 13, "Managing Optimizer Statistics: Advanced Topics"), there is another way to gather extended statistics automatically. It is based on statistics feedback and SQL Plan Directives, both new in 12c (actually, statistics feedback is not quite so new; it appears to be an evolution of cardinality feedback).

Demo

To start with a clean sheet I dropped the table I had used previously, which got rid of all the extended stats and SQL Plan Directives (SPD) from the previous example. To be really thorough I also flushed the shared pool.

SQL> drop table martin.customers purge;

Table dropped.

SQL> select * from dba_stat_extensions where owner = user;

no rows selected

SQL> select count(*) from DBA_SQL_PLAN_DIR_OBJECTS where owner = user;

  COUNT(*)
----------
         0

SQL> alter system flush shared_pool;

System altered.

BTW I have switched environments to an Oracle Restart 12.1.0.2 database with the JUL 2015 PSU applied. The second patch you see here is the OJVM patch.

SQL> select patch_id, version, action from dba_registry_sqlpatch;

  PATCH_ID VERSION              ACTION
---------- -------------------- ---------------
  20831110 12.1.0.2             APPLY
  21068507 12.1.0.2             APPLY

Back to creating the table… I am using the same technique as before, but this time without the call to DBMS_STATS.SEED_COL_USAGE:

SQL> create table martin.customers as select * from sh.customers;

Table created.

SQL> select num_rows, last_analyzed from user_tables where table_name = 'CUSTOMERS';

  NUM_ROWS LAST_ANALYZED
---------- -------------------
     55500 15.10.2015 16:57:42

I love the 12c feature where stats are gathered during a CTAS operation …

And now a few queries to tickle the optimiser. Apologies for the wide output, but that makes it so much easier to use RLWRAP and command line history. Credit again to the blog post by Maria Colgan for the query examples.

SQL> select /*+ gather_plan_statistics */ count(*) from customers where cust_city = 'Los Angeles' and cust_state_province = 'CA' and country_id = 52790;

  COUNT(*)
----------
       932

SQL>
SQL> select * from table(dbms_xplan.display_cursor(format => 'allstats last'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------
SQL_ID  34zmr3acgz06g, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from customers where
cust_city = 'Los Angeles' and cust_state_province = 'CA' and country_id
= 52790

Plan hash value: 296924608

---------------------------------------------------------------------------------------------------
| Id  | Operation          | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |      1 |        |      1 |00:00:00.03 |    1521 |   1516 |
|   1 |  SORT AGGREGATE    |           |      1 |      1 |      1 |00:00:00.03 |    1521 |   1516 |
|*  2 |   TABLE ACCESS FULL| CUSTOMERS |      1 |      1 |    932 |00:00:00.04 |    1521 |   1516 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(("CUST_CITY"='Los Angeles' AND "CUST_STATE_PROVINCE"='CA' AND
              "COUNTRY_ID"=52790))


22 rows selected.

There is nothing too new here: the optimiser's cardinality estimate is not great. Following the example from the SQL Tuning Guide, chapter 13 (by the way, country_id is a number, not a varchar2 field), we can now check whether anything fancy is going to happen next. And sure enough there is:

SQL> select sql_id, child_number, is_obsolete, is_reoptimizable, is_resolved_adaptive_plan from v$sql where sql_id = '34zmr3acgz06g';

SQL_ID        CHILD_NUMBER I I I
------------- ------------ - - -
34zmr3acgz06g            0 N Y

The second flag, is_reoptimizable, is interesting. It indicates the optimiser's intention to apply information found at run time to the next execution of the cursor. Let's run the original query again.

SQL> select /*+ gather_plan_statistics */ count(*) from customers where cust_city = 'Los Angeles' and cust_state_province = 'CA' and country_id = 52790;

  COUNT(*)
----------
       932

So what did that do? Let’s have a look at the diagnostics:


SQL> select sql_id, child_number, is_obsolete, is_reoptimizable, is_resolved_adaptive_plan from v$sql where sql_id = '34zmr3acgz06g';

SQL_ID        CHILD_NUMBER I I I
------------- ------------ - - -
34zmr3acgz06g            0 N Y
34zmr3acgz06g            1 N N

SQL> select * from table(dbms_xplan.display_cursor(sql_id=>'34zmr3acgz06g',cursor_child_no=>1,format => 'allstats last'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  34zmr3acgz06g, child number 1
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from customers where
cust_city = 'Los Angeles' and cust_state_province = 'CA' and country_id
= 52790

Plan hash value: 296924608

------------------------------------------------------------------------------------------
| Id  | Operation          | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |      1 |        |      1 |00:00:00.01 |    1521 |
|   1 |  SORT AGGREGATE    |           |      1 |      1 |      1 |00:00:00.01 |    1521 |
|*  2 |   TABLE ACCESS FULL| CUSTOMERS |      1 |    932 |    932 |00:00:00.01 |    1521 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(("CUST_CITY"='Los Angeles' AND "CUST_STATE_PROVINCE"='CA' AND
              "COUNTRY_ID"=52790))

Note
-----
   - statistics feedback used for this statement


26 rows selected.

As you can see another child cursor has been created. Why?

SQL> select xmltype(reason) from v$sql_shared_cursor where dbms_lob.getlength(reason) > 1 and sql_id = '34zmr3acgz06g';

XMLTYPE(REASON)
-----------------------------------------------------------------------------------------------------------------------------
<ChildNode>
  <ChildNumber>0</ChildNumber>
  <ID>49</ID>
  <reason>Auto Reoptimization Mismatch(1)</reason>
  <size>3x4</size>
  <kxscflg>32</kxscflg>
  <kxscfl4>4194560</kxscfl4>
  <dnum_kksfcxe>262144</dnum_kksfcxe>
</ChildNode>

Ah, the cursor was created because of a reoptimisation mismatch. Checking against v$sql you can see that the optimiser is now happy with the cardinality estimate (it does not anticipate further reoptimisation for this statement). Note that, depending on the query's complexity, many more child cursors can be created as part of the reoptimisation. Also note that for child cursor 1 the E-Rows and A-Rows values match perfectly. Our work here is done. But wait: what about that column usage?


SQL> select dbms_stats.report_col_usage(user, 'CUSTOMERS') from dual;

DBMS_STATS.REPORT_COL_USAGE(USER,'CUSTOMERS')
--------------------------------------------------------------------------------
LEGEND:
.......

EQ         : Used in single table EQuality predicate
RANGE      : Used in single table RANGE predicate
LIKE       : Used in single table LIKE predicate
NULL       : Used in single table is (not) NULL predicate
EQ_JOIN    : Used in EQuality JOIN predicate
NONEQ_JOIN : Used in NON EQuality JOIN predicate
FILTER     : Used in single table FILTER predicate
JOIN       : Used in JOIN predicate
GROUP_BY   : Used in GROUP BY expression
...............................................................................

###############################################################################

COLUMN USAGE REPORT FOR MARTIN.CUSTOMERS
........................................

1. COUNTRY_ID                          : EQ
2. CUST_CITY                           : EQ
3. CUST_STATE_PROVINCE                 : EQ
###############################################################################

SQL> select * from dba_stat_extensions where owner = user;

no rows selected

Nothing here except some equality predicates! This is the big difference from part I of this article, where the filter on the 3 columns was detected thanks to the call to DBMS_STATS.SEED_COL_USAGE.

It took me a while to connect the dots and understand what needed to be done next. In the meantime, hidden from view and in the background, Oracle had created a SQL Plan Directive for that table, which I failed to notice for a while. SQL Plan Directives are not persisted immediately; the documentation states they are written to SYSAUX every 15 minutes. I must have thought about this for more than 15 minutes, as you will see shortly. If you are a bit more impatient, you can force the write of the SPD to disk using DBMS_SPD.FLUSH_SQL_PLAN_DIRECTIVE, as shown below.
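
A minimal sketch of forcing the flush (the procedure takes no arguments; the call assumes you have the privileges required to execute DBMS_SPD):

SQL> exec dbms_spd.flush_sql_plan_directive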

To verify whether there were any SPDs I used a query taken more or less literally from the documentation, and yes, there are some:

SQL> SELECT TO_CHAR(d.DIRECTIVE_ID) dir_id, o.OBJECT_NAME,
  2         o.SUBOBJECT_NAME col_name, o.OBJECT_TYPE, d.TYPE, d.STATE, d.REASON
  3  FROM   DBA_SQL_PLAN_DIRECTIVES d, DBA_SQL_PLAN_DIR_OBJECTS o
  4  WHERE  d.DIRECTIVE_ID=o.DIRECTIVE_ID
  5  AND    o.OWNER = user
  6  ORDER BY 1,2,3,4,5;

DIR_ID                         OBJECT_NAME          COL_NAME             OBJECT TYPE             STATE    REASON
------------------------------ -------------------- -------------------- ------ ---------------- -------- ------------------------------------
15273172249382976180           CUSTOMERS            COUNTRY_ID           COLUMN DYNAMIC_SAMPLING USABLE   SINGLE TABLE CARDINALITY MISESTIMATE
15273172249382976180           CUSTOMERS            CUST_CITY            COLUMN DYNAMIC_SAMPLING USABLE   SINGLE TABLE CARDINALITY MISESTIMATE
15273172249382976180           CUSTOMERS            CUST_STATE_PROVINCE  COLUMN DYNAMIC_SAMPLING USABLE   SINGLE TABLE CARDINALITY MISESTIMATE
15273172249382976180           CUSTOMERS                                 TABLE  DYNAMIC_SAMPLING USABLE   SINGLE TABLE CARDINALITY MISESTIMATE

Once you can see the SPDs in the dictionary, you can also see them in action when you run a statement with the same predicates but a different SQL_ID, as in this example.

SQL> select /*+ gather_plan_statistics new_parse_please */ count(*) from customers where cust_city = 'Los Angeles' and cust_state_province = 'CA' and country_id = 52790;

  COUNT(*)
----------
       932

SQL> select * from table(dbms_xplan.display_cursor(format => 'allstats last'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  85qvryzgzj57q, child number 0
-------------------------------------
select /*+ gather_plan_statistics new_parse_please */ count(*) from
customers where cust_city = 'Los Angeles' and cust_state_province =
'CA' and country_id = 52790

Plan hash value: 296924608

------------------------------------------------------------------------------------------
| Id  | Operation          | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |      1 |        |      1 |00:00:00.01 |    1521 |
|   1 |  SORT AGGREGATE    |           |      1 |      1 |      1 |00:00:00.01 |    1521 |
|*  2 |   TABLE ACCESS FULL| CUSTOMERS |      1 |    829 |    932 |00:00:00.01 |    1521 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(("CUST_CITY"='Los Angeles' AND "CUST_STATE_PROVINCE"='CA' AND
              "COUNTRY_ID"=52790))

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - 1 Sql Plan Directive used for this statement


27 rows selected.

The SQL Plan Directive stays in place even if you flush the cursor with SQL ID 34zmr3acgz06g from the cursor cache or supply a different set of predicates. I used Kerry Osborne's flush_sql.sql script for this; if you don't have it at hand, a hand-rolled alternative is sketched below, followed by the actual flush_sql run.
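
A possible alternative (a sketch only, assuming DBMS_SHARED_POOL has been installed via ?/rdbms/admin/dbmspool.sql and you have execute rights on it) is to purge the cursor directly:

declare
  l_name varchar2(100);
begin
  -- DBMS_SHARED_POOL.PURGE identifies a cursor by 'address,hash_value'
  select address || ',' || hash_value
    into l_name
    from v$sqlarea
   where sql_id = '34zmr3acgz06g';

  dbms_shared_pool.purge(l_name, 'C');
end;
/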

SQL> @flush_sql 
Enter value for sql_id: 34zmr3acgz06g
old  14:   where sql_id like '&sql_id';
new  14:   where sql_id like '34zmr3acgz06g';

PL/SQL procedure successfully completed.

SQL> select sql_id, child_number, is_obsolete, is_reoptimizable, is_resolved_adaptive_plan from v$sql where sql_id = '34zmr3acgz06g';

no rows selected

SQL> select /*+ gather_plan_statistics */ count(*) from customers where cust_city = 'Los Angeles' and cust_state_province = 'CA' and country_id = 52790;

  COUNT(*)
----------
       932

SQL> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  34zmr3acgz06g, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from customers where
cust_city = 'Los Angeles' and cust_state_province = 'CA' and country_id
= 52790

Plan hash value: 296924608

------------------------------------------------------------------------------------------
| Id  | Operation          | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |      1 |        |      1 |00:00:00.01 |    1521 |
|   1 |  SORT AGGREGATE    |           |      1 |      1 |      1 |00:00:00.01 |    1521 |
|*  2 |   TABLE ACCESS FULL| CUSTOMERS |      1 |    953 |    932 |00:00:00.01 |    1521 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(("CUST_CITY"='Los Angeles' AND "CUST_STATE_PROVINCE"='CA' AND
              "COUNTRY_ID"=52790))

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - 1 Sql Plan Directive used for this statement


27 rows selected.


SQL> select /*+ gather_plan_statistics */ count(*) from customers where cust_city = 'Palmdale' and cust_state_province = 'FL' and country_id = 52790;

  COUNT(*)
----------
       250

SQL> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  dbkfpchpfwap3, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from customers where
cust_city = 'Palmdale' and cust_state_province = 'FL' and country_id =
52790

Plan hash value: 296924608

------------------------------------------------------------------------------------------
| Id  | Operation          | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |      1 |        |      1 |00:00:00.01 |    1521 |
|   1 |  SORT AGGREGATE    |           |      1 |      1 |      1 |00:00:00.01 |    1521 |
|*  2 |   TABLE ACCESS FULL| CUSTOMERS |      1 |    287 |    250 |00:00:00.02 |    1521 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(("CUST_CITY"='Palmdale' AND "CUST_STATE_PROVINCE"='FL' AND
              "COUNTRY_ID"=52790))

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - 1 Sql Plan Directive used for this statement


27 rows selected.

And NOW you get the update to REPORT_COL_USAGE:

SQL> select dbms_stats.report_col_usage(user, 'CUSTOMERS') from dual;

DBMS_STATS.REPORT_COL_USAGE(USER,'CUSTOMERS')
--------------------------------------------------------------------------------
LEGEND:
.......

EQ         : Used in single table EQuality predicate
RANGE      : Used in single table RANGE predicate
LIKE       : Used in single table LIKE predicate
NULL       : Used in single table is (not) NULL predicate
EQ_JOIN    : Used in EQuality JOIN predicate
NONEQ_JOIN : Used in NON EQuality JOIN predicate
FILTER     : Used in single table FILTER predicate
JOIN       : Used in JOIN predicate
GROUP_BY   : Used in GROUP BY expression
...............................................................................

###############################################################################

COLUMN USAGE REPORT FOR MARTIN.CUSTOMERS
........................................

1. COUNTRY_ID                          : EQ
2. CUST_CITY                           : EQ
3. CUST_STATE_PROVINCE                 : EQ
4. (CUST_CITY, CUST_STATE_PROVINCE,
    COUNTRY_ID)                        : FILTER
###############################################################################

From here on it's the same as in part I of this article. You gather stats, either manually like me or automatically as Oracle would, and the end result is extended statistics on the "filter" combination shown as item 4 in the column usage report.


SQL> exec dbms_stats.gather_table_stats(user, 'CUSTOMERS')

PL/SQL procedure successfully completed.

SQL> select table_name, extension_name, extension from dba_stat_extensions where owner = user;

TABLE_NAME                     EXTENSION_NAME                           EXTENSION
------------------------------ ---------------------------------------- --------------------------------------------------------------------------------
CUSTOMERS                      SYS_STSMZ$C3AIHLPBROI#SKA58H_N           ("CUST_CITY","CUST_STATE_PROVINCE","COUNTRY_ID")

Voila! Extended statistics.

There are many more interesting implications to this whole concept, which is something I'll write about in another post. It's incredibly interesting; I can't believe I'm writing optimiser posts…

PS: thanks to @Mautro for casting an eye over this article!