Martins Blog

Trying to explain complex things in simple terms

Limiting the Degree of Parallelism via Resource Manager and a gotcha

Posted by Martin Bach on April 23, 2013

This might be something very obvious for the reader but I had an interesting revelation recently when implementing parallel_degree_limit_p1 in a resource consumer group. My aim was to prevent users mapped to a resource consumer group from executing any query in parallel. The environment is fictional, but let’s assume that it is possible that maintenance operations for example leave indexes and tables decorated with a parallel x attribute. Another common case is the restriction of PQ resource to users to prevent them from using all the machine’s resources.

This can happen when you perform an index rebuild for example in parallel to speed the operation up. However the DOP will stay the same with the index after the maintenance operation, and you have to explicitly set it back:

SQL> alter index I_T1$ID1 rebuild parallel 4;

Index altered.

SQL> select index_name,degree from ind where index_name = 'I_T1$ID1';

INDEX_NAME		       DEGREE
------------------------------ ----------------------------------------
I_T1$ID1		       4

SQL>

Now when you select from the table and the index is involved you can end up with a parallel query (PQ). Admittedly this example is a little contrived but not too far from reality either.

SQL> select count(1) from t1 a, t1 b
 2 where a.object_id = b.object_id;

  COUNT(1)
----------
     74511

Now I’m getting the execution plan for this statement-I expect it to be parallel:

SQL> select * from table(dbms_xplan.display_cursor);

SQL_ID	66xstr294k2f6, child number 0
-------------------------------------
select count(1) from t1 a, t1 b where a.object_id = b.object_id

Plan hash value: 3484396843

----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name     | Rows	| Bytes | Cost (%CPU)| Time	|    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	     |		|	|	|   127 (100)|		|	 |	|	     |
|   1 |  SORT AGGREGATE 	     |		|     1 |    26 |	     |		|	 |	|	     |
|   2 |   PX COORDINATOR	     |		|	|	|	     |		|	 |	|	     |
|   3 |    PX SEND QC (RANDOM)	     | :TQ10001 |     1 |    26 |	     |		|  Q1,01 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE	     |		|     1 |    26 |	     |		|  Q1,01 | PCWP |	     |
|*  5 |      HASH JOIN		     |		|   134K|  3414K|   127   (4)| 00:00:02 |  Q1,01 | PCWP |	     |
|   6 |       PX BLOCK ITERATOR      |		|   134K|  1707K|    61   (0)| 00:00:01 |  Q1,01 | PCWC |	     |
|*  7 |        INDEX FAST FULL SCAN  | I_T1$ID1 |   134K|  1707K|    61   (0)| 00:00:01 |  Q1,01 | PCWP |	     |
|   8 |       PX RECEIVE	     |		|   134K|  1707K|    61   (0)| 00:00:01 |  Q1,01 | PCWP |	     |
|   9 |        PX SEND BROADCAST     | :TQ10000 |   134K|  1707K|    61   (0)| 00:00:01 |  Q1,00 | P->P | BROADCAST  |
|  10 | 	PX BLOCK ITERATOR    |		|   134K|  1707K|    61   (0)| 00:00:01 |  Q1,00 | PCWC |	     |
|* 11 | 	 INDEX FAST FULL SCAN| I_T1$ID1 |   134K|  1707K|    61   (0)| 00:00:01 |  Q1,00 | PCWP |	     |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("A"."OBJECT_ID"="B"."OBJECT_ID")
   7 - access(:Z>=:Z AND :Z=:Z AND :Z<=:Z)

OK, that’s been done in parallel. But I don’ want it to run in parallel!

So the task at hand is to explicitly disable PQ for users connecting via service OLTP_SRVC. I spare you the detail of the creation of the consumer group, for this example I have created the group and assigned user martin to it. The mapping to the consumer group is based on the service name. I had to explicitly grant my user the right to switch consumer group as shown here:

BEGIN
 dbms_resource_manager_privs.grant_switch_consumer_group(
  GRANTEE_NAME   => 'MARTIN',
  CONSUMER_GROUP => 'OLTP_GRP',
  GRANT_OPTION   => FALSE);
END;
/

Then it’s time to create the plan and corresponding directives. The relevant piece of code is shown here:

BEGIN
 dbms_resource_manager.clear_pending_area();
 dbms_resource_manager.create_pending_area();

 -- more code

 dbms_resource_manager.create_plan_directive(
 plan => 'someplan',
 group_or_subplan => 'OLTP_GRP',
 mgmt_p2 => 65,
 comment => 'no parallel!',
 parallel_degree_limit_p1 => 0
 );

 -- more code

 dbms_resource_manager.validate_pending_area();
 dbms_resource_manager.submit_pending_area();
end;
/

After the familiar “PL/SQL procedure completed successfully” message I enabled someplan, I was all ready to enjoy the success-but wait!

There’s something strange, at least to me. I connected using the service and executed the query again but NO CHANGE. The optimizer still uses a parallel execution plan:

SQL> select * from table(dbms_xplan.display_cursor);
SQL_ID	66xstr294k2f6, child number 0
-------------------------------------
select count(1) from t1 a, t1 b where a.object_id = b.object_id

Plan hash value: 3484396843

----------------------------------------------------------------------------------------------------------------------
| Id  | Operation		     | Name	| Rows	| Bytes | Cost (%CPU)| Time	|    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	     |		|	|	|   127 (100)|		|	 |	|	     |
|   1 |  SORT AGGREGATE 	     |		|     1 |    26 |	     |		|	 |	|	     |
|   2 |   PX COORDINATOR	     |		|	|	|	     |		|	 |	|	     |
|   3 |    PX SEND QC (RANDOM)	     | :TQ10001 |     1 |    26 |	     |		|  Q1,01 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE	     |		|     1 |    26 |	     |		|  Q1,01 | PCWP |	     |
|*  5 |      HASH JOIN		     |		|   134K|  3414K|   127   (4)| 00:00:02 |  Q1,01 | PCWP |	     |
|   6 |       PX BLOCK ITERATOR      |		|   134K|  1707K|    61   (0)| 00:00:01 |  Q1,01 | PCWC |	     |
|*  7 |        INDEX FAST FULL SCAN  | I_T1$ID1 |   134K|  1707K|    61   (0)| 00:00:01 |  Q1,01 | PCWP |	     |
|   8 |       PX RECEIVE	     |		|   134K|  1707K|    61   (0)| 00:00:01 |  Q1,01 | PCWP |	     |
|   9 |        PX SEND BROADCAST     | :TQ10000 |   134K|  1707K|    61   (0)| 00:00:01 |  Q1,00 | P->P | BROADCAST  |
|  10 | 	PX BLOCK ITERATOR    |		|   134K|  1707K|    61   (0)| 00:00:01 |  Q1,00 | PCWC |	     |
|* 11 | 	 INDEX FAST FULL SCAN| I_T1$ID1 |   134K|  1707K|    61   (0)| 00:00:01 |  Q1,00 | PCWP |	     |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("A"."OBJECT_ID"="B"."OBJECT_ID")
   7 - access(:Z>=:Z AND :Z=:Z AND :Z<=:Z)

I made sure the consumer group was indeed OLTP_GRP (visible from v$session). I flushed the statement from the shared pool (as described here for example), when that didn’t help I flushed the flushed the whole shared pool (in my lab-don’t just flush the shared pool!) but still no change. I proverbially scratched my head a for moment and then decided to check if there wasn’t any more information about DBRM and its decisions available. Sure enough there is (the Oracle DBMS is really well instrumented)

So whenever you are in doubt you can check the meta information about resource manager: v$resource_consumer_group:

SQL> select name,pqs_completed,pq_servers_used from V$RSRC_CONSUMER_GROUP
  2  where name = 'OLTP_GRP'
  3  /

NAME                             PQS_COMPLETED PQ_SERVERS_USED
-------------------------------- ------------- ---------------
OLT_GRP                                      1               0

SQL>

Aha! So it appears that there is a parallel execution plan even though it doesn’t execute in parallel. I would have expected a “parallel force serial” in the plan but there is no such thing. Also, when using Real Time SQL Monitor you won’t see any information about a DOP (requested/used) so there is additional proof that the execution was performed in serial. What initially left me puzzled though is that there was no serial execution plan.

Conclusion

The observed behaviour makes complete sense once you think about it for a moment. The optimizer does not know about the DBRM restriction, and it won’t kick in until run-time. When the optimizer hard-parses a statement it believes it has all the required resources available in form of the parallel% parameters, but that is not the case. I assume that their implementation allowed the Oracle engineers to use a more generic code to implement DOP restrictions, out of which DOP=0 really only is an edge case. I will test a little more if there is an overhead with the approach. In the meantime, if you really need this you could use the following login trigger for example (if you have no control over the code of course!)

SQL> get trigger
  1  create or replace trigger force_serial
  2  after logon on database
  3  declare
  4   v_service_name varchar2(100);
  5  begin
  6   v_service_name := sys_context('userenv','service_name');
  7
  8   if v_service_name = 'OLTP_SRV' then
  9    execute immediate 'alter session disable parallel query';
 10
 11   end if;
 12
 13* end;
SQL>

This is an example only and must complement the DBRM settings shown above. Admittedly it is not the most elegant way of forcing a truly serial execution and it doesn’t prevent a user from overriding the settings in his session but if your users cannot execute “alter session” than it helps a little.

About these ads

4 Responses to “Limiting the Degree of Parallelism via Resource Manager and a gotcha”

  1. Hi Martin,

    the logon trigger doesn’t really prevent users from getting Parallel Execution plans – any explicit PARALLEL hint will override the “DISABLE PARALLEL QUERY” setting on session level, which is quite counter-intuitive.

    Of course, the hint doesn’t override the final restriction via the Resource Manager, but the costing of the optimizer will again be based on whatever degree is assumed.

    Randolf

    • Martin Bach said

      Hi Randolf,

      thanks for passing by!

      I completely agree with you, the only intention of using the trigger was to make it just a tiny bit harder to run something in parallel. If for example you use a reporting tool, and the user cannot modify the queries (assuming they don’t have any parallel in them) then you could rely a bit more on serial execution. It’s not the hardest obstacle to overcome though…

      Have you ever measured the effect of reduced DOP at run time compared to the available DOP at parse time? In other words, does it make a difference to parse a cursor with an assumed DOP of up to 32 and a runtime DOP of less than that? If I find the time I’ll do some testing.

      • Hi Martin,

        I’m not entirely sure I get your question right, but generating a Parallel Execution is probably more resource consuming than generating a serial execution plan, since there are simply more things to consider (if that was your question). So in general I would assume that the optimization might take a tad longer. However very likely in reality you’ll find both cases, where optimizing serial execution is faster than parallel and vice-versa.

        However, this is probably negligible in comparison to the possible effect of having a Parallel Execution plan based on a costing at DOP = 32 and running it serially for example… Having such a large discrepancy between DOP assumed at optimization time and actual DOP at execution time might lead to execution plans that are unsuitable for the much lower DOP used at execution time.

        Only the new Auto-DOP code path in the optimizer negotiates with the Resource Manager, the (old) non-Auto-DOP code doesn’t do so.

        And by the way, running at a very low DOP (for example DOP = 2) might actually be less efficient than running serially, because Parallel Execution comes with some (implementation) overhead that can make a Parallel Execution slower than a serial counterpart – but this depends on the execution plan and actual data pattern.

  2. […] This might be something very obvious for the reader but Martin had an interesting revelation recently when implementing parallel_degree_limit_p1 in a resource consumer group. […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
Follow

Get every new post delivered to your Inbox.

Join 2,487 other followers

%d bloggers like this: