Martins Blog

Trying to explain complex things in simple terms

Incrementally gathering statistics in 10g

Posted by Martin Bach on June 7, 2010

One of the cool new features of the Oracle 11g database is that you can copy statistics and incrementally gather global stats for rlt (really large tables). I wanted to do some testing around these features in 10.2.0.4 but ran into a number of problems. So this is the story…. I have decided to split the blog post into two parts, the first which you are currently reading will detail the installation of the patch on a single instance 10.2.0.4 database with PSU 1 installed, the second post will go through an example where I made use of the new feature.

Let’s start with the installation of the patch. My platform is Linux x86-64, please ensure that you download the patch for your platform or request development to back port it (good luck!).

I found out about the cool new stuff in DBMS_STATS in this post from the Oracle optimizer magic blog:

http://optimizermagic.blogspot.com/2009/02/maintaining-statistics-on-large.html

As good as the blog is, the patch numbers at the very end for 10.2.0.4 were wrong. After some digging around I found patch 7381308 in MOS document “Poor Plan Intermittently on Very Large Partitioned Table with too Many Partitions”. The clue was at the end which reads:

4- It is necessary to install the one off patch for the unpublished Bug 7381308 if you are using copy_table_stats
procedure or APPROX_GLOBAL option in 10.2.0.4.

The patch description seemed a bit misleading “ADJUSTING MIN/MAX OF PARTITION COLUMN #(5643297) DOES NOT WORK”. Ok then, let’s give it a go. I downloaded the patch to /tmp on the database server and extracted it. After shutting down the database I did the usual opatch apply which was so straight forward I don’t even mention it. The fun started afterwards.

The post installation instructions state that you need to start the database in exclusive mode for upgrade. In RAC, that means with “cluster_database” set to false and only one instance. With single instance Oracle (my system), it was sufficient to “startup upgrade”. I did and run the commands from the readme:

SQL> startup upgrade

ORACLE instance started.
Total System Global Area  444596224 bytes
Fixed Size            2084584 bytes
Variable Size          234881304 bytes
Database Buffers      201326592 bytes
Redo Buffers            6303744 bytes
Database mounted.
Database opened.

SQL> show user
USER is "SYS"
SQL> @?/rdbms/admin/prvtstas.plb

Package created.

No errors.

Package body created.

No errors.

Now for the next file:

SQL> @?/rdbms/admin/prvtstat.plb

Warning: Package Body created with compilation errors.

Errors for PACKAGE BODY DBMS_STATS:

LINE/COL ERROR
-------- -----------------------------------------------------------------
2769/13  PLS-00323: subprogram or cursor 'COPY_TABLE_STATS' is declared in
 a package specification and must be defined in the package body

begin
*
ERROR at line 1:
ORA-04063: package body "SYS.DBMS_STATS" has errors
ORA-06508: PL/SQL: could not find program unit being called: "SYS.DBMS_STATS"
ORA-06512: at line 2

This was less cool. So DBMS_STATS has errors, how inconvenient. One would expect Oracle to do better. What can you do? Have a look at the patch contents. This revealed a third file, which turned out to be the new DBMS_STATS. So continuing the patch application all is well:

mbach@uklnxpc005:/tmp/7381308/files/rdbms/admin> ls -l
total 692
-rw-rw-rw- 1 mbach users 132280 2009-02-05 11:21 dbmsstat.sql
-rw-rw-r-- 1 mbach users 124443 2009-02-05 11:21 prvtstas.plb
-rw-rw-r-- 1 mbach users 443048 2009-02-05 11:21 prvtstat.plb

-> trying again

SQL> @?/rdbms/admin/dbmsstat.sql

Package created.

No errors.

Synonym created.

Grant succeeded.

Role created.

Grant succeeded.

Grant succeeded.

Library created.

SQL> @?/rdbms/admin/prvtstas.plb

Package created.

No errors.

Package body created.

No errors.
SQL> @?/rdbms/admin/prvtstat.plb

Package body created.

No errors.

PL/SQL procedure successfully completed.

SQL>

As with anything that touches other code in the database we might get invalid objects. I personally don’t like invalid objects in my database so I checked:

Fair enough-but don’t worry, they will all compile without error:

SQL> r
 1* select object_name,object_type from dba_objects where owner = 'SYS' and status='INVALID'

OBJECT_NAME                    OBJECT_TYPE
------------------------------ ------------------------------
LOGMNR_KRVRDLUID3              PROCEDURE
LOGMNR_KRVRDREPDICT3           PROCEDURE
DBMS_LOGMNR_FFVTOLOGMNRT       PROCEDURE
DBMS_LOGMNR_OCTOLOGMNRT        PROCEDURE
DBMS_REGISTRY_SYS              PACKAGE BODY
DBMS_SCHEMA_COPY               PACKAGE BODY
UTL_RECOMP                     PACKAGE BODY
DBMS_SWRF_INTERNAL             PACKAGE BODY
DBMS_SQLTUNE_INTERNAL          PACKAGE BODY

9 rows selected.

So all this effort, but what do we get? Here’s an excerpt from the header of the new DBMS_STATS for  the new granularity option:

--   granularity - the granularity of statistics to collect (only pertinent
--      if the table is partitioned)
--     'AUTO' - the procedure determines what level of statistics to collect
--     'GLOBAL AND PARTITION' - gather global- and partition-level statistics
--     'APPROX_GLOBAL AND PARTITION' - This option is similar to
--        'GLOBAL AND PARTITION'. But the global statistics are aggregated
--         from partition level statistics. It will aggregate all statistics
--         except number of distinct values for columns and number of distinct
--         keys of indexes.
--         The existing histograms of the columns at the table level
--         are also aggregated.The global statistics are  gathered
--         (i.e., going back to GLOBAL AND PARTITION behaviour)
--         if partname argument is null or if the aggregation cannot be done
--         e.g., statistics of one of the partitions is missing.
--         This option is useful when you collect statistics for a new
--         partition added into a range partitioned table (for example,
--         a table  partitioned by month).
--         The new data in the partition makes the global statistics stale
--         (especially the min/max values of the partitioning column).
--         This stale global statistics may cause suboptimal plans.  
--         In this scenario, users can collect statistics for the newly added
--         partition with 'APPROX_GLOBAL AND PARTITION' option so that the
--         global statistics will reflect the newly added range.
--         This option will take less time than 'GLOBAL AND PARTITION' option
--         since the global statistics are aggregated from underlying
--         partition level statistics.
--         Note that, if you are using APPROX_GLOBAL AND PARTITION,
--         you still  need to collect global statistics (with granularity =
--         'GLOBAL' option)
--         when there is substantial amount of change at the table level.
--         For example you added 10% more data to the table.  
--         This is needed to get the correct number of distinct values/keys
--         statistic at table level.
--     'SUBPARTITION' - gather subpartition-level statistics
--     'PARTITION' - gather partition-level statistics
--     'GLOBAL' - gather global statistics
--     'ALL' - gather all (subpartition, partition, and global) statistics

The copy_table_stats procedure is documented as follows:

--
-- This procedure copies the stats of the source [sub] partition to the
-- dst [sub] partition after scaling (the number of blks, number of rows etc.).
-- It sets the high bound partitioning value as the max value of the first
-- partitioning col and high bound partitioning value of the previous partition
-- as the min value of the first partitioning col for range partitioned table.
-- It finds the max and min from the list of values for the list partitioned table.
-- It also sets the normalized max and min values. If the destination partition
-- is the first partition then min values are equal to max values. If the stats for
-- source are not avaliable then nothing is copied.
--
--   ownname - schema of index to analyze
--   tabname - table name of source and destination [sub]partitions
--   srcpartname - source [sub]partition
--   dstpartname - destination [sub]partition
--   scale_factor - scale factor to scale nblks, nrows etc. in dstpart
-- Exceptions:
--   ORA-20000: Invalid partition name
--   ORA-20001: Bad input value
--
procedure copy_table_stats(
  ownname varchar2,
  tabname varchar2,
  srcpartname varchar2,
  dstpartname varchar2,
  scale_factor number DEFAULT 1,
  flags number DEFAULT null,
  force boolean DEFAULT FALSE);

The next post of will compare the full stats gathering with the new APPROX_GLOBAL AND PARTITION method.

4 Responses to “Incrementally gathering statistics in 10g”

  1. Doug Burns said

    That’s not fair! This is in my To Do list ;-)

    • Martin said

      Hi Doug,

      thanks for passing by. BTW, this has been out for quite some time now, my apologies :)

      Are you around for the RAC & HA SIG on Thursday in London?

  2. Doug Burns said

    Ha, I got my Martin’s mixed up! Thought this was a Martin Widlake post. Fool ;-)

    Unfortunately I won’t be able to make it to the SIG

    • mwidlake said

      Wha? Hey?! If I’m going to be insulted can you guys keep me informed?? I had to search on “widlake + idiot” to find this.

      Sadly I was not able to make that SIG too, too much going on. And for the record, nice posting.

      I’m stuck working mostly on 10.2.0.3 and though you can role up partition stats to global stats in that version, I am not convinved it is a good idea…

      Martin “idiot” Widlake

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
%d bloggers like this: