Martins Blog

Trying to explain complex things in simple terms

Archive for the ‘11g Release 1’ Category

Little things worth knowing: auto-DOP or hint – which one takes precedence?

Posted by Martin Bach on June 2, 2016

This is another short post about one of the little things worth knowing. Assume you are on 12.1.0.2 and you want to know whether parallel_degree_limit reliably keeps the degree of parallelism at bay, for all sessions and queries across the board. After all, every machine has finite CPU (and other) resources that shouldn’t be exhausted. Setting parallel_degree_limit of course only has an effect if you set parallel_degree_policy to limited, auto or adaptive; parallel_degree_policy is the Big Knob that enables Automatic Degree of Parallelism (DOP), and this post assumes Auto DOP is enabled for the session. In this configuration, what happens to hinted statements?
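For reference, a minimal sketch of how to arrive at this configuration, shown system-wide (parallel_degree_policy can also be set per session):

SQL> alter system set parallel_degree_policy = auto;
SQL> alter system set parallel_degree_limit = 8;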

Documentation

The documentation (12c Reference Guide, section on parallel_degree_limit) states that you can provide an integer in addition to “CPU” and “IO”:

A numeric value for this parameter specifies the maximum degree of parallelism the optimizer can choose for a SQL statement when automatic degree of parallelism is active. Automatic degree of parallelism is only enabled if PARALLEL_DEGREE_POLICY is set to ADAPTIVE, AUTO, or LIMITED.

Without further context this may sound a bit ambiguous: does enabling Auto DOP plus setting parallel_degree_limit to 8 ensure that no query can exceed that DOP? And does that setting actually stop users from going over and above that value?

Testing

I am using a 12.1.0.2 RAC database for the test. I would like to limit the maximum DOP of a given statement to 8. Here are my settings for my lab system (emphasis on lab):

SQL> show parameter parallel

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
fast_start_parallel_rollback         string      LOW
parallel_adaptive_multi_user         boolean     FALSE
parallel_automatic_tuning            boolean     FALSE
parallel_degree_level                integer     100
parallel_degree_limit                string      8
parallel_degree_policy               string      AUTO
parallel_execution_message_size      integer     16384
parallel_force_local                 boolean     TRUE
parallel_instance_group              string
parallel_io_cap_enabled              boolean     FALSE
parallel_max_servers                 integer     128
parallel_min_percent                 integer     0
parallel_min_servers                 integer     16
parallel_min_time_threshold          string      AUTO
parallel_server                      boolean     TRUE
parallel_server_instances            integer     2
parallel_servers_target              integer     128
parallel_threads_per_cpu             integer     2
recovery_parallelism                 integer     0

Now I need a table to query. After looking around in my demo schema I found T1, which looks like a good candidate.

SQL> select table_name,num_rows,partitioned,compression,degree from tabs where table_name = 'T1';

TABLE_NAME                       NUM_ROWS PAR COMPRESS DEGREE
------------------------------ ---------- --- -------- ----------
T1                               32000000 NO  DISABLED          1

Unlike with parallel_degree_policy set to limited, I don’t need to worry about decorating the table with a default DOP. With parallel_degree_policy set to auto, a parallel plan can be chosen even with the DOP on the table set to 1, provided the optimiser reckons that statement execution time exceeds parallel_min_time_threshold. In my case I left that parameter at its default of AUTO, which translates to 10 seconds. When querying the table I do not have to specify a hint to enable PX, nor do I have to enable PX at the session level, as you can see in this example:

SQL> select count(*) from T1;

  COUNT(*)
----------
  32000000

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------
SQL_ID  byb4cbw4vy1cw, child number 1
-------------------------------------
select count(*) from T1

Plan hash value: 2755517917

----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name           | Rows  | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                |       |   200K(100)|          |        |      |            |
|   1 |  SORT AGGREGATE                |                |     1 |            |          |        |      |            |
|   2 |   PX COORDINATOR               |                |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM)         | :TQ10000       |     1 |            |          |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE             |                |     1 |            |          |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR         |                |    32M|   200K  (1)| 00:00:04 |  Q1,00 | PCWC |            |
|*  6 |       TABLE ACCESS STORAGE FULL| T1             |    32M|   200K  (1)| 00:00:04 |  Q1,00 | PCWP |            |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - storage(:Z>=:Z AND :Z<=:Z)

Note
-----
   - automatic DOP: Computed Degree of Parallelism is 8 because of degree limit

The important bit here is the information in the Note section: Oracle computed the DOP to be 8 because of the degree limit. Exactly what I wanted.

Now what if I use the hint?

No changes to the setup; this is the next statement I typed in:

SQL> select /*+ parallel(32) */ count(*) from T1;


  COUNT(*)
----------
  32000000

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  9d9aha2bdd5zc, child number 0
-------------------------------------
select /*+ parallel(32) */ count(*) from T1

Plan hash value: 2755517917

----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name           | Rows  | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                |       | 50217 (100)|          |        |      |            |
|   1 |  SORT AGGREGATE                |                |     1 |            |          |        |      |            |
|   2 |   PX COORDINATOR               |                |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM)         | :TQ10000       |     1 |            |          |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE             |                |     1 |            |          |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR         |                |    32M| 50217   (1)| 00:00:01 |  Q1,00 | PCWC |            |
|*  6 |       TABLE ACCESS STORAGE FULL| T1             |    32M| 50217   (1)| 00:00:01 |  Q1,00 | PCWP |            |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - storage(:Z>=:Z AND :Z<=:Z)

Note
-----
   - Degree of Parallelism is 32 because of hint

So based on this experiment it appears that the hint takes precedence over parallel_degree_limit in a session with Auto DOP enabled. That doesn’t mean the DBA is out of luck: other tools come to mind for limiting the DOP, such as Database Resource Manager (DBRM), sketched below. I have written about using DBRM to limit the DOP for all users of a consumer group some time ago. If you read that post make sure you look at Randolf’s comments.
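To give an idea, here is a minimal sketch of such a DBRM setup. The plan and consumer group names are made up for illustration, and the mapping of sessions to the consumer group is omitted:

BEGIN
  DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();

  DBMS_RESOURCE_MANAGER.CREATE_PLAN(
    plan    => 'DOP_CAP_PLAN',
    comment => 'cap parallelism for reporting sessions');

  DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP(
    consumer_group => 'REPORTING_GROUP',
    comment        => 'ad-hoc reporting sessions');

  -- the directive caps the DOP for members of the group, hinted or not
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
    plan                     => 'DOP_CAP_PLAN',
    group_or_subplan         => 'REPORTING_GROUP',
    comment                  => 'DOP capped at 8',
    parallel_degree_limit_p1 => 8);

  -- every plan must also cover OTHER_GROUPS
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
    plan             => 'DOP_CAP_PLAN',
    group_or_subplan => 'OTHER_GROUPS',
    comment          => 'everyone else');

  DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
END;
/

SQL> alter system set resource_manager_plan = 'DOP_CAP_PLAN';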

Posted in 11g Release 1, 12c Release 1, Linux

WARNING: FAILED to load library: /opt/oracle/extapi/64/asm/orcl/1/libasm.so

Posted by Martin Bach on April 12, 2012

This post is long overdue, and it’s as much a reminder to myself as it is a post for you to find when scratching your head.

Here’s the situation: Oracle Restart on an Oracle Enterprise Linux 5.5 system with ASMLib in use. I have installed 11.2.0.2 plus the latest PSU to bring the database and Grid Infrastructure to 11.2.0.2.5 – so far so good. I used the non-GUI approach and everything seemed OK until I tried to create a database for some benchmarking. Read the rest of this entry »
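A quick sanity check worth running in this situation (a sketch; the path is the 64-bit ASMLib location from the warning): does the library actually exist, and are all oracleasm packages installed?

$ ls -l /opt/oracle/extapi/64/asm/orcl/1/libasm.so
$ rpm -qa | grep oracleasm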

Posted in 10g Release 2, 11g Release 1, 11g Release 2, Automatic Storage Management

A more user friendly multipath.conf

Posted by Martin Bach on June 1, 2011

Update

The changes to the device uid, gid and mode described below have been deprecated with RHEL 6/OL 6, so it’s back to udev rules. See http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/pdf/DM_Multipath/Red_Hat_Enterprise_Linux-6-DM_Multipath-en-US.pdf for more information.

During some recent work involving a stretched 11.2.0.2 RAC for a SAP implementation at a customer site, I researched TimeFinder/Clone backups. As part of this exercise I was able to experiment with RHEL (OEL) 5.6 and the new device-mapper-multipath package on the mount host. I have been very pleasantly surprised by this new feature, which I’d like to share.

Background of this article

Device Mapper Multipath is the “native” Linux multipathing software, as opposed to vendor-supplied multipathing such as EMC’s PowerPath or Hitachi’s HDLM.
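To give a flavour of the kind of stanza discussed in this post, here is a sketch of a multipath.conf entry; the WWID, alias and ownership values are made up, and as per the update above, the uid/gid/mode attributes are deprecated from RHEL 6 onwards:

multipaths {
        multipath {
                wwid   360a98000486e58354a34656e69366d34
                alias  asm_data01
                uid    501     # numeric uid of the oracle user
                gid    502     # numeric gid of the dba group
                mode   0660
        }
}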

My customer’s setup is rather unique for a SAP environment as it uses Oracle Enterprise Linux rather than Solaris/SPARC or AIX on the Power platform with an active/passive solution. If that doesn’t make it sound unique enough, the fact that there is a plan to run Oracle 11.2.0.2 RAC, potentially across sites, using ASM and ACFS certainly makes this deployment stand out from the rest. Read the rest of this entry »

Posted in 10g Release 2, 11g Release 1, 11g Release 2, Linux

ASM normal redundancy and failure groups spanning SANs

Posted by Martin Bach on April 6, 2011

Julian Dyke started an interesting thread on the Oak Table mailing list after the latest UKOUG RAC and HA SIG. Unfortunately I couldn’t attend that event; I wish I had, as I knew it would be great.

Anyway, the question revolved around an ASM disk group created with normal redundancy spanning two storage arrays. In theory this should protect against the failure of an array, albeit at a high price: all ASM disks exported from one array would form a single failure group. Remember that the disks in a failure group all fail together if the supporting infrastructure (network, HBA, controller etc.) fails. So what would happen with such a setup if you followed these steps:

  • Shut down the array for failure group 2
  • Stop the database
  • Shut down the second array (failure group 1)
  • Do some more maintenance…
  • Start up the SAN for failure group 1
  • Start the database
  • Start up the SAN for failure group 2

ASM can tolerate the failure of one failure group (capacity permitting), so losing failure group 2 should not bring the disk group down, which would mean an immediate loss of service. But what happens when that failure group comes back up after the data in the other failure group has been modified? Will there be data corruption?
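For context, here is a sketch of how such a disk group might be created, with one failure group per array and made-up disk strings:

SQL> create diskgroup data normal redundancy
  2  failgroup array_a disk 'ORCL:ARRAYA_DISK1', 'ORCL:ARRAYA_DISK2'
  3  failgroup array_b disk 'ORCL:ARRAYB_DISK1', 'ORCL:ARRAYB_DISK2';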

Read the rest of this entry »

Posted in 11g Release 1, 11g Release 2, Automatic Storage Management, Linux, RAC

11.1 GC agent refuses to start

Posted by Martin Bach on February 14, 2011

This is a follow-up to my previous tale of how not to move networks for RAC. After successfully restarting the cluster as described in a previous post, I went on to install a Grid Control 11.1 system. This was to be on Solaris 10 SPARC. Why SPARC? It is not my platform of choice when it comes to Oracle software, but my customer has a huge SPARC estate and wants to make the most of it.

After the OMS had been built (hopefully I’ll find time to document this, as it can be quite tricky on SPARC) I wanted to secure the agents on my cluster against it. That worked OK for the first node:

  • emctl clearstate agent
  • emctl secure agent
  • emctl start agent

Five minutes later the agent appeared in my list of agents in the Grid Control console. Buoyed by this success I went to do the same on the next cluster node.

Here things were different. Here’s the sequence of commands I used:

$ emctl stop agent
Oracle Enterprise Manager 11g Release 1 Grid Control 11.1.0.1.0
Copyright (c) 1996, 2010 Oracle Corporation.  All rights reserved
$

Read the rest of this entry »

Posted in 11g Release 1, 11g Release 2, Grid Control, Linux

Patch 10270073 – 11.1.0.1.2 Patch Set Update for Oracle Management Service

Posted by Martin Bach on February 1, 2011

Today is patch day: on my current site we are having quite a few problems with our management server (OMS from now on). It occasionally simply “hangs” and doesn’t respond when connecting to the SSH port. A few minutes later it’s back to normal, but this behaviour is not reproducible.

So, in an effort to please Oracle support, who couldn’t find a reason for this, I decided to apply Patch Set Update 2 to the OMS to bring it to 11.1.0.1.2. I would also like to roll the corresponding agent patch out to our 11.1 agents. The PSU was released two weeks ago, so it’s reasonably fresh. Read the rest of this entry »
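In outline, applying the PSU to the OMS follows the usual opatch routine. This is only a sketch, assuming the patch has been unzipped into a directory named after the patch number; the README remains the authoritative source for the exact steps:

$ emctl stop oms
$ cd 10270073
$ $ORACLE_HOME/OPatch/opatch apply
$ emctl start oms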

Posted in 11g Release 1, Grid Control

Adding storage dynamically to ASM on Linux

Posted by Martin Bach on January 14, 2011

Note: this discussion is potentially relevant only to OEL 5.x and RHEL 5.x; I haven’t been able to verify that it works the same way on other Linux distributions, though I would assume it does. Before starting with the article, here are some facts:

  • OEL/RHEL 5.5 64bit
  • Oracle 11.2.0.2
  • native multipathing: device-mapper-multipath

The question I have asked myself many times is: how can I dynamically add a LUN to ASM without having to stop any component of the stack? Having mocked “reboot-me” operating systems like Windows, I soon went quiet when it came to discussing the addition of a LUN to ASM on Linux. Today I learned how to do this, by piecing together information I got from Angus Thomas, a great Red Hat system administrator I had the pleasure to work with in 2009 and 2010. And since I have a short-lived memory I decided to write it down.

I’ll describe the process from top to bottom, from the addition of the LUN to the server all the way up to the addition of the ASM disk to the disk group. Read the rest of this entry »
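In compressed form the process looks roughly like this. It is only a sketch with made-up device and disk names; the full post walks through each step:

# 1. make the new LUN visible to the HBA without a reboot
echo "- - -" > /sys/class/scsi_host/host0/scan

# 2. let device-mapper-multipath pick up the new paths
multipath -v2

# 3. label the partitioned LUN for ASM (here: ASMLib)
/etc/init.d/oracleasm createdisk DATA05 /dev/mapper/mpath5p1

# 4. finally, add the disk to the disk group (from the ASM instance)
SQL> alter diskgroup data add disk 'ORCL:DATA05';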

Posted in 11g Release 1, 11g Release 2, Linux, RAC

Huge Pages and Linux – a real world example

Posted by Martin Bach on April 19, 2010

A common situation for many DBAs: all of a sudden you are tasked with looking after a database you are told you have inherited. Of course it comes with a lot of problems, and you are supposed to fix them all. A few weeks ago this happened to me.

The system is a Sun 4660 x86-64 server running Red Hat 5.3, with 64 GB of memory and 8 dual-core Opteron 8218 processors. SGA_TARGET was set to 40G and PGA_AGGREGATE_TARGET to 5G. That sounds like plenty; however, the box was very busy trying to free up memory:

top - 12:16:11 up 23 days, 23:46, 19 users,  load average: 28.84, 25.69, 23.19
Tasks: 970 total,   3 running, 967 sleeping,   0 stopped,   0 zombie
Cpu(s):  9.3%us, 13.0%sy,  0.0%ni, 36.0%id, 40.6%wa,  0.1%hi,  1.0%si,  0.0%st
Mem:  66068664k total, 65992772k used,    75892k free,    43168k buffers
Swap:  2096472k total,  2096472k used,        0k free, 40782300k cached

 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1068 root      20  -5     0    0    0 R 76.0  0.0 566:03.07 [kswapd5]
13284 oracle    18   0 40.3g 6.3g 6.2g D 31.5 10.0  28:30.12 oraclePROD (LOCAL=NO)
 1070 root      10  -5     0    0    0 D 21.6  0.0 925:57.03 [kswapd7]
 1065 root      10  -5     0    0    0 D 13.8  0.0 196:27.93 [kswapd2]
 7771 oracle    18   0 40.6g 4.5g 4.1g D 12.1  7.1  26:05.04 oraclePROD (LOCAL=NO)
 8073 oracle    16   0 40.2g 2.8g 2.8g D 12.1  4.4   5:29.79 oraclePROD (LOCAL=NO)
 1066 root      10  -5     0    0    0 S 11.8  0.0  84:57.85 [kswapd3]
 1067 root      10  -5     0    0    0 D 10.5  0.0 165:39.98 [kswapd4]
 1069 root      10  -5     0    0    0 S  7.9  0.0 277:15.84 [kswapd6]
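The kswapd processes at the top of the list give the game away: a 40G SGA made up of ordinary 4 KB pages means roughly ten million pages that can be paged out and have to be managed by the kernel. The standard remedy, and the reason for this post’s title, is to back the SGA with huge pages, which are never swapped. A sketch of the relevant settings, with made-up values assuming a 2 MB huge page size (the exact figures have to be calculated for the system at hand):

# /etc/sysctl.conf: enough 2 MB pages to hold the 40 GB SGA, plus some headroom
vm.nr_hugepages = 20992

# /etc/security/limits.conf: allow the oracle user to lock that much memory (values in KB)
oracle   soft   memlock   44040192
oracle   hard   memlock   44040192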

Read the rest of this entry »

Posted in 11g Release 1, Linux

Archivelog retention policy changes in RMAN 11g

Posted by Martin Bach on October 8, 2009

In the past, meaning at least up to 10.2.x, it used to be a bit clumsy to set the archivelog deletion policy to “applied on standby” on both the primary and the standby database. It involved setting an underscore parameter, as per Metalink Note 728053.1, “Configure RMAN to purge archivelogs after applied on standby”. For reference, the parameter was _log_deletion_policy, and you had to bounce the database for it to take effect.

SQL> alter system set "_log_deletion_policy"='ALL' scope=spfile;

Only then could you avoid the error message. Now with 11g that’s a lot easier.
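For the impatient, the 11g way is a documented RMAN configuration along these lines:

RMAN> configure archivelog deletion policy to applied on standby;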

Read the rest of this entry »

Posted in 11g Release 1, 11g Release 2, Oracle, RMAN