Martins Blog

Trying to explain complex things in simple terms

Concurrent statistics gathering in 12.1.0.2

Posted by Martin Bach on November 10, 2014

Prompted by an actual task at hand I spent some time investigating an 11.2.0.2 feature – concurrent statistics gathering. It has been on my to-do list for quite some time, but so far I didn’t have a use case, and use cases make it so much easier. The question was: how can I gather statistics on a really large, partitioned table? Previously, you could resort to setting the degree in dbms_stats.gather_table_stats to ensure that statistics were gathered in parallel. This is all good, but sometimes you need more oomph. Some DBAs wrote scripts to execute individual statistics gathering jobs against partitions in parallel, using the tabname and partname arguments in dbms_stats.gather_table_stats(). But that requires manual effort – and the not-quite-so-new concurrent option is so much nicer. Let me take you along for the ride… Actually I have to tell the story starting with the happy ending, as I hit a few snags along the way. This is 12.1.0.2.1 on Oracle Linux 6.5.

Enabling concurrent statistics gathering

A flag needs to be set to enable the new concurrent mechanism for gathering statistics. The initial blog post on the subject set the flag globally:

https://blogs.oracle.com/optimizer/entry/gathering_optimizer_statistics_is_one

Scroll down to “configuration and settings” for the details. I would personally prefer to set this at table level but, despite the Oracle documentation suggesting that it is possible, it is not. You can check the header of DBMS_STATS to see the different values that can be passed as PNAME to set_table_prefs().
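For completeness, setting the preference globally should be a one-liner along these lines (the 12c documentation also lists MANUAL, AUTOMATIC, ALL and OFF as valid values, see the comments at the end of this post):

SQL> exec dbms_stats.set_global_prefs('CONCURRENT','TRUE')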

After enabling the concurrent flag at the database level, you can query it:

SQL> select dbms_stats.get_prefs('CONCURRENT')
  2  from dual;

DBMS_STATS.GET_PREFS('CONCURRENT')
------------------------------------------------------------------------------------
TRUE

SQL> select dbms_stats.get_prefs('CONCURRENT',user,'BIGTAB_PART') flag from dual;

FLAG
-------------------------------------------------------------------------------------
TRUE

We should be ready to use the concurrent method now. Here is some background information about the table in question:

SQL>  select partition_name, num_rows, sample_size, last_analyzed 
  2   from user_tab_partitions where table_name = 'BIGTAB_PART'
  3  ;

PARTITION_NAME                   NUM_ROWS SAMPLE_SIZE LAST_ANAL
------------------------------ ---------- ----------- ---------
SYS_P1254
SYS_P1253
SYS_P1252
SYS_P1251
SYS_P1250
SYS_P1249
SYS_P1248
SYS_P1247
SYS_P1246
SYS_P1245
P_MAN

11 rows selected.

SQL> select partition_name, bytes/power(1024,2) m from user_segments where segment_name = 'BIGTAB_PART';

PARTITION_NAME                          M
------------------------------ ----------
SYS_P1245                            8064
SYS_P1246                            8064
SYS_P1247                            8064
SYS_P1248                            8064
P_MAN                                8064
SYS_P1249                            8064
SYS_P1250                            8064
SYS_P1251                            8064
SYS_P1252                            8064
SYS_P1253                            8064
SYS_P1254                               8

11 rows selected.

So each partition is about 8 GB in size. A little helper script prints all the table preferences:

getting table preferences for table BIGTAB_PART
value for CASCADE is DBMS_STATS.AUTO_CASCADE
value for DEGREE is NULL
value for ESTIMATE_PERCENT is DBMS_STATS.AUTO_SAMPLE_SIZE
value for METHOD_OPT is FOR ALL COLUMNS SIZE AUTO
value for NO_INVALIDATE is DBMS_STATS.AUTO_INVALIDATE
value for GRANULARITY is AUTO
value for PUBLISH is TRUE
value for INCREMENTAL is FALSE
value for STALE_PERCENT is 10
value for CONCURRENT is TRUE
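
The helper script itself isn’t anything fancy; a minimal sketch along the following lines, with the preference names hard-coded, would produce similar output:

set serveroutput on

declare
  type pref_list_t is table of varchar2(30);
  -- the preferences shown above; extend the list as needed
  l_prefs pref_list_t := pref_list_t(
    'CASCADE','DEGREE','ESTIMATE_PERCENT','METHOD_OPT','NO_INVALIDATE',
    'GRANULARITY','PUBLISH','INCREMENTAL','STALE_PERCENT','CONCURRENT');
begin
  dbms_output.put_line('getting table preferences for table BIGTAB_PART');
  for i in 1 .. l_prefs.count loop
    dbms_output.put_line('value for ' || l_prefs(i) || ' is ' ||
      dbms_stats.get_prefs(l_prefs(i), user, 'BIGTAB_PART'));
  end loop;
end;
/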

Now you can simply execute the call to dbms_stats.gather_table_stats(user, ‘BIGTAB_PART’, degree=>4) and, thanks to all the defaults, the gathering of statistics will be performed by multiple jobs. In order for this to work you need job_queue_processes set to at least 4 (the default is 1000 in 12.1.0.2) and you need a resource manager plan to be active. Refer to the Oracle blog post referenced earlier for more information about statement queueing and the execution of these jobs. The post also recommends setting parallel_adaptive_multi_user to false, which is the default in 11.2. Just be careful changing PX parameters: these can have undesired side effects.
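Spelled out, the call is simply:

SQL> exec dbms_stats.gather_table_stats(user, 'BIGTAB_PART', degree => 4)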

Using the diagnostic queries taken from the Oracle blog post you can see what’s going on. In my case, 6 jobs were created:

select job_name, state, comments from dba_scheduler_jobs where job_class like 'CONC%'

JOB_NAME             STATE           COMMENTS
-------------------- --------------- ----------------------------------------------------------------------------------------------------
ST$T3606_6_B2        SCHEDULED       "MARTIN"."BIGTAB_PART"."SYS_P1254";"MARTIN"."BIGTAB_PART"
ST$T3606_5_B2        RUNNING         "MARTIN"."BIGTAB_PART"."SYS_P1252";"MARTIN"."BIGTAB_PART"."SYS_P1253"
ST$T3606_4_B2        SCHEDULED       "MARTIN"."BIGTAB_PART"."SYS_P1250";"MARTIN"."BIGTAB_PART"."SYS_P1251"
ST$T3606_3_B2        RUNNING         "MARTIN"."BIGTAB_PART"."SYS_P1248";"MARTIN"."BIGTAB_PART"."SYS_P1249"
ST$T3606_2_B2        SCHEDULED       "MARTIN"."BIGTAB_PART"."SYS_P1246";"MARTIN"."BIGTAB_PART"."SYS_P1247"
ST$T3606_1_B2        RUNNING         "MARTIN"."BIGTAB_PART"."P_MAN";"MARTIN"."BIGTAB_PART"."SYS_P1245"

6 rows selected.

By the way, if you are interested in the payload, you can use DBMS_METADATA:

SQL> get get_job
  1* SELECT DBMS_METADATA.GET_DDL('PROCOBJ', '&1') from dual;
SQL>

You pass it the job name and you get some pretty cryptic-looking output :)
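So the call would be something like this, with the job name taken from the scheduler query above:

SQL> @get_job ST$T3982_1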

Interlude

Actually, you have to see this…

DBMS_METADATA.GET_DDL('PROCOBJ','ST$T3982_1')
-----------------------------------------------------------------------------------------


BEGIN
dbms_scheduler.create_job('"ST$T3982_1"',
job_type=>'PLSQL_BLOCK', job_action=>
'declare c dbms_stats.CContext := dbms_stats.CContext(); begin c.extend(30); c(1):=''ST$T3982_1;1;855;1;FALSE;FALSE;''; c(2):=q''#"MARTIN";#''; c(3):=q''#"BIGTAB_PART";#''; c(4):=q''#"SYS_P1250";#''; c(5):=''0;''; c(6):=''FALSE;''; c(7):=q''#FOR ALL
COLUMNS SIZE AUTO;#''; c(8):=''NULL;''; c(9):=''PARTITION;''; c(10):=''TRUE;''; c(11):=q''#NULL;#''; c(12):=q''#NULL;#''; c(13):=q''#NULL;#''; c(14):=''NULL;''; c(15):=''DATA;''; c(16):=''FALSE;''; c(17):=''GLOBAL AND PARTITION;''; c(18):=''TRUE;'';
c(19):=''0;''; c(20):='';''; c(21):=''FALSE;''; c(22):=''FALSE;''; c(23):=''4;''; c(24):=''ST$T3982;''; c(25):=''TRUE;''; c(26):=''FALSE;''; c(27):=''4.562199;''; c(28):=''228.109937;''; c(29):=''0;''; c(30):=''0;''; dbms_stats.gather_table_stats('''
','''',context=>c); end;'
, number_of_arguments=>0,
start_date=>NULL, repeat_interval=>
NULL
, end_date=>NULL,
job_class=>'"CONC_ST$T3982"', enabled=>FALSE, auto_drop=>TRUE,comments=>
'"MARTIN"."BIGTAB_PART"."SYS_P1250"'
);
dbms_scheduler.set_attribute('"ST$T3982_1"','raise_events',38);
dbms_scheduler.enable('"ST$T3982_1"');
COMMIT;
END;

This looks better when you format it a bit so that it reads:

DECLARE
  c dbms_stats.CContext := dbms_stats.CContext();
BEGIN
  c.extend(30);
  c(1) :='ST$T3982_1;1;855;1;FALSE;FALSE;';
  c(2) :=q'#"MARTIN";#';
  c(3) :=q'#"BIGTAB_PART";#';
  c(4) :=q'#"SYS_P1250";#';
  c(5) :='0;';
  c(6) :='FALSE;';
  c(7) :=q'#FOR ALL COLUMNS SIZE AUTO;#';
  c(8) :='NULL;';
  c(9) :='PARTITION;';
  c(10):='TRUE;';
  c(11):=q'#NULL;#';
  c(12):=q'#NULL;#';
  c(13):=q'#NULL;#';
  c(14):='NULL;';
  c(15):='DATA;';
  c(16):='FALSE;';
  c(17):='GLOBAL AND PARTITION;';
  c(18):='TRUE;';
  c(19):='0;';
  c(20):=';';
  c(21):='FALSE;';
  c(22):='FALSE;';
  c(23):='4;';
  c(24):='ST$T3982;';
  c(25):='TRUE;';
  c(26):='FALSE;';
  c(27):='4.562199;';
  c(28):='228.109937;';
  c(29):='0;';
  c(30):='0;';
  dbms_stats.gather_table_stats('','',context=>c);
END;

Interesting!

Concurrent statistics gathering works for incremental statistics as well, and is a good way to get the initial set of statistics that incremental stats need anyway (see Doc ID 1541543.1 for background on incremental statistics gathering).
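If you want to combine the two, the idea (not what I did here – INCREMENTAL was left at FALSE in the table preferences above) is to switch on incremental statistics as a table preference and let the concurrent gathering seed the synopses, roughly like this:

SQL> exec dbms_stats.set_table_prefs(user, 'BIGTAB_PART', 'INCREMENTAL', 'TRUE')
SQL> exec dbms_stats.gather_table_stats(user, 'BIGTAB_PART', degree => 4)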

The end result is statistics!

SQL> select partition_name, num_rows, sample_size, last_analyzed
  2   from user_tab_partitions where table_name = 'BIGTAB_PART';

PARTITION_NAME                   NUM_ROWS SAMPLE_SIZE LAST_ANAL
------------------------------ ---------- ----------- ---------
SYS_P1254                              32          32 07-NOV-14
SYS_P1253                        25600000    25600000 07-NOV-14
SYS_P1252                        25600000    25600000 07-NOV-14
SYS_P1251                        25600000    25600000 07-NOV-14
SYS_P1250                        25600000    25600000 07-NOV-14
SYS_P1249                        25600000    25600000 07-NOV-14
SYS_P1248                        25600000    25600000 07-NOV-14
SYS_P1247                        25600000    25600000 07-NOV-14
SYS_P1246                        25600000    25600000 07-NOV-14
SYS_P1245                        25600000    25600000 07-NOV-14
P_MAN                            25599968    25599968 07-NOV-14

11 rows selected.

Stats not gathered concurrently if partitions are small

This all looks pretty straightforward, except that it wasn’t when I first tried. I set all the parameters as documented but still had no jobs running in the background. The first table I tried to gather statistics on concurrently was partitioned exactly like BIGTAB_PART, with the difference that all partitions were compressed for QUERY HIGH. This means each partition takes up a lot less space. I had an idea that segment size might be related, so I compressed BIGTAB_PART, and here is the result:

SQL> select partition_name, bytes/power(1024,2) m from user_segments 
  2   where segment_name = 'BIGTAB_PART';

PARTITION_NAME                                   M
------------------------------ -------------------
P_MAN                                       321.75
SYS_P1245                                   321.00
SYS_P1246                                   322.50
SYS_P1247                                   321.50
SYS_P1248                                   320.75
SYS_P1249                                   322.25
SYS_P1250                                   322.25
SYS_P1251                                   322.25
SYS_P1252                                   322.25
SYS_P1253                                   322.00
SYS_P1254                                     8.25

11 rows selected.

When I then tried to gather statistics on the now much smaller table concurrently (without changing the table preferences at all), no concurrent jobs were created. So if you gather statistics concurrently and don’t see multiple jobs for intra-table statistics gathering, remember that your partitions might simply be too small.
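If you want to check after the fact whether a gather operation was actually split into per-partition tasks, the 12c views DBA_OPTSTAT_OPERATIONS and DBA_OPTSTAT_OPERATION_TASKS are worth a look. Something along these lines should do (column names quoted from memory, double-check them against the reference in your environment):

SQL> select target, target_type, status, job_name
  2  from dba_optstat_operation_tasks
  3  where opid = (select max(id) from dba_optstat_operations);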

2 Responses to “Concurrent statistics gathering in 12.1.0.2”

  1. Hi Martin,
    I have not used concurrent statistics in 12c yet, but according to the Oracle documentation the “concurrent” parameter has only 4 valid values (MANUAL, AUTOMATIC, ALL, OFF). I assume that TRUE is still possible for backward compatibility, but I don’t know the equivalent/translated behavior in 12c for that.

    The Oracle documentation about table prefs is wrong. Christian Antognini mentioned this in his book “Troubleshooting Oracle Performance, 2nd Edition” on page 247 as well (“Note that the parameters autostats_target and concurrent can only be modified with the set_global_prefs procedure.”)

    Regards
    Stefan

    • Martin Bach said

      Hi Stefan,

      always good to see you passing by. The concurrent setting is a bit odd. It’s not documented in 11.2 dbms_stats (https://docs.oracle.com/cd/E18283_01/appdev.112/e16760/d_stats.htm), but it is documented in the 12c packages and types reference (although incorrectly).

      Now that you mentioned it, I had a second look and found it in the Database SQL Tuning Guide, table 12-1:

      You can only set the CONCURRENT preference at the global level (see “About Concurrent Statistics Gathering”).
