Martins Blog

Trying to explain complex things in simple terms

Public appearances 2015

Posted by Martin Bach on March 23, 2015

This is going to be a list of my public appearances in 2015.

Upcoming Events

DOAG Exaday: A couple of weeks later I am speaking at the DOAG Exaday 2015 on April 28. The Exadata Days are an established format in the UK that has been attracting top-notch speakers, and it now makes its way to Germany. I am very keen to discuss Engineered Systems from a user’s point of view in Frankfurt. I will present my brand-new talk about Resource Manager in action (in German), and during the afternoon I’ll run a hands-on workshop covering the most common Exadata optimisations and how to make good use of them.
UKOUG Systems Event: I’ll be at the UKOUG Systems Event in London on May 20, a well-established one-day event where many great speakers have presented in the past. After a short detour to Birmingham the event is now back in London, and I am thrilled to be there. My presentation is titled “Migrating to Exadata the easy way” and incorporates a fair bit of new research.
Enkitec Extreme Exadata Expo (E4): E4 is the conference when it comes to Exadata and Big Data. I have been over twice and it’s been more than fantastic. I would tell you the same even if I wasn’t working for Enkitec; it truly is a serious tech conference (have a look at the agenda and the speakers and I’m sure you’ll believe me). My part in the conference is my all-new, updated talk about Hybrid Columnar Compression internals.

Past conferences

I have attended the following events thus far:

  • Fill the Glass webinar with Cary Millsap
  • ClubOracle in London
  • Club Oracle in Paris

Posted in Public Appearances | Leave a Comment »

Little things worth knowing: Is there a penalty in establishing a connection to Oracle using the MAA connection string?

Posted by Martin Bach on April 20, 2015

Sorry for the long title!

I had a question during my session about “advanced RAC programming features” at the last Paris Oracle Meetup regarding the MAA connection string. I showed an example taken from the Application Continuity White Paper (http://www.oracle.com/technetwork/database/options/clustering/application-continuity-wp-12c-1966213.pdf). Someone from the audience asked me if I had experienced any problems with it, such as very slow connection timeouts. I haven’t, but wanted to double-check anyway. This is a simplified test using a sqlplus connection, since that is easier to time than the creation of a connection pool. If you know of a way to reliably time the latter in Java/UCP let me know and I’ll test it.

My system is 12.1.0.2 on Oracle Linux 7.1 with UEK (3.8.13-55.1.6.el7uek.x86_64) and the following patches on the RDBMS and Grid Infrastructure side:

[oracle@rac12sby1 ~]$ opatch lspatches -oh /u01/app/12.1.0.2/grid
19872484;WLM Patch Set Update: 12.1.0.2.2 (19872484)
19769480;Database Patch Set Update : 12.1.0.2.2 (19769480)
19769479;OCW Patch Set Update : 12.1.0.2.2 (19769479)
19769473;ACFS Patch Set Update : 12.1.0.2.2 (19769473)

[oracle@rac12sby1 ~]$ opatch lspatches -oh /u01/app/oracle/product/12.1.0.2/dbhome_1
19877336;Database PSU 12.1.0.2.2, Oracle JavaVM Component (Jan2015)
19769480;Database Patch Set Update : 12.1.0.2.2 (19769480)
19769479;OCW Patch Set Update : 12.1.0.2.2 (19769479)

This is the January System Patch by the way, 20132450. At first I wanted to blog about the patch application but it was so uneventful I decided against it.

I have two clusters, rac12pri and rac12sby. Both consist of 2 nodes each. Database CDB is located on rac12pri, STDBY on rac12sby. Normally I am against naming databases by function as you might end up using STDBY in primary role after a data centre migration, and people tend to find the use of STDBY as primary database odd. In this case I hope it helps understanding the concept better, and it’s a lab environment anyway …

Creating the broker configuration

There is nothing special about creating the Broker configuration, just remember to define the broker configuration files on shared storage and to enable automatic standby file management. I also recommend standby redo logs on both the primary and standby databases. Once you have the configuration in place, check the database(s) in verbose mode to get the broker connection string. You can copy/paste that connection string into sqlplus to ensure that every instance can be started thanks to a static listener registration (Data Guard restarts databases during switchover and failover operations, which is bound to fail unless the databases are statically registered with the listener). Here is what I mean:

DGMGRL> show instance verbose 'CDB1';

Instance 'CDB1' of database 'CDB'

  Host Name: rac12pri1
  PFILE:     
  Properties:
    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.108)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=CDB_DGMGRL)(INSTANCE_NAME=CDB1)(SERVER=DEDICATED)))'
    StandbyArchiveLocation          = 'USE_DB_RECOVERY_FILE_DEST'
    AlternateLocation               = ''
    LogArchiveTrace                 = '0'
    LogArchiveFormat                = '%t_%s_%r.dbf'
    TopWaitEvents                   = '(monitor)'

Instance Status:
SUCCESS

Now I can take the static connection identifier and use it (be sure to specify the “as sysdba” at the end):

[oracle@rac12pri1 ~]$ sqlplus 

SQL*Plus: Release 12.1.0.2.0 Production on Sat Apr 18 10:50:09 2015

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Enter user-name: sys@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.108)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=CDB_DGMGRL)(INSTANCE_NAME=CDB1)(SERVER=DEDICATED))) as sysdba
Enter password: 

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL>

Once this works for all instances in the configuration, you are a big step closer to a working Data Guard setup!

Service setup

I have created a new service, named rootsrv on both databases.

[oracle@rac12sby1 ~]$ srvctl add service -d STDBY -s rootsrv -role primary -policy automatic -preferred STDBY1,STDBY2

[oracle@rac12pri1 ~]$ srvctl add service -d CDB -s rootsrv -role primary -policy automatic -preferred CDB1,CDB2

To better follow along here is the current status of the configuration (In real life you might want to use a different protection mode):

DGMGRL> show configuration

Configuration - martin

  Protection Mode: MaxPerformance
  Members:
  STDBY - Primary database
    CDB   - Physical standby database 

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS   (status updated 28 seconds ago)

As per its definition, the service is started on the primary but not on the standby. In fact, I can’t start it on the standby:

[oracle@rac12sby1 ~]$ srvctl status service -d STDBY -s rootsrv
Service rootsrv is running on instance(s) STDBY1,STDBY2

[oracle@rac12pri1 ~]$ srvctl status service -d CDB -s rootsrv
Service rootsrv is not running.
[oracle@rac12pri1 ~]$ srvctl start service -d CDB -s rootsrv
PRCD-1084 : Failed to start service rootsrv
PRCR-1079 : Failed to start resource ora.cdb.rootsrv.svc
CRS-2800: Cannot start resource 'ora.cdb.db' as it is already in the INTERMEDIATE state on server 'rac12pri1'
CRS-2632: There are no more servers to try to place resource 'ora.cdb.rootsrv.svc' on that would satisfy its placement policy
CRS-2800: Cannot start resource 'ora.cdb.db' as it is already in the INTERMEDIATE state on server 'rac12pri2'
[oracle@rac12pri1 ~]$

So there is no more need for database triggers to start and stop services depending on the database’s role (this was another question I had during my talk).
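For comparison, the trigger-based approach this replaces looked roughly like the following. This is a sketch only: the trigger name is made up, and on RAC the clusterware-managed service shown above is the supported way.

```
SQL> create or replace trigger start_rootsrv
  2  after startup on database
  3  declare
  4    v_role v$database.database_role%type;
  5  begin
  6    select database_role into v_role from v$database;
  7    if v_role = 'PRIMARY' then
  8      dbms_service.start_service('rootsrv');
  9    end if;
 10  end;
 11  /
```

Every database needed such a trigger (plus the corresponding logic for the standby role), and it had to be kept in sync across sites, which is exactly the maintenance burden role-based services remove.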

The MAA connection string

The MAA connection string as taken from the white paper and slightly adapted is this one:

MAA_TEST1 =
  (DESCRIPTION_LIST=
     (LOAD_BALANCE=off)(FAILOVER=on)
       (DESCRIPTION=
         (CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)
           (ADDRESS_LIST= (LOAD_BALANCE=on) (ADDRESS=(PROTOCOL=TCP)(HOST=rac12pri-scan)(PORT=1521)))
           (CONNECT_DATA=(SERVICE_NAME=rootsrv)))
       (DESCRIPTION= (CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)
           (ADDRESS_LIST= (LOAD_BALANCE=on) (ADDRESS=(PROTOCOL=TCP)(HOST= rac12sby-scan)(PORT=1521)))
           (CONNECT_DATA=(SERVICE_NAME=rootsrv)))
  )
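As a back-of-the-envelope check of how long a completely unreachable first cluster could delay the connection attempt: assuming a single address and that the initial attempt plus each of the RETRY_COUNT retries can take up to CONNECT_TIMEOUT seconds (a simplification — the exact interplay of CONNECT_TIMEOUT, TRANSPORT_CONNECT_TIMEOUT and RETRY_COUNT depends on the client version, see the Net Services Reference), the worst case before the client moves on to the second DESCRIPTION works out as:

```shell
# Rough worst case for the first DESCRIPTION with the values from the
# MAA_TEST1 alias above (assumption: one initial attempt + RETRY_COUNT
# retries, each bounded by CONNECT_TIMEOUT seconds)
CONNECT_TIMEOUT=5 RETRY_COUNT=3
echo "$(( CONNECT_TIMEOUT * (RETRY_COUNT + 1) )) seconds"
```

In other words, even in the worst case the failover to the surviving cluster should kick in well under half a minute with these settings.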

Please refer to the white paper for an explanation of the various parameters.

Notice that both the primary and standby SCAN are referenced in there, both pointing at “rootsrv”, which is currently active on STDBY. To get some proper timing of the connection delay I use the following little snippet:


$ time sqlplus system/secret@maa_test1 <<EOF
> select sys_context('userenv','instance_name') from dual;
> exit
> EOF

Testing on the “primary cluster” first, the local database is operating in the standby role:

[oracle@rac12pri1 ~]$ time sqlplus system/secret@maa_test1 <<EOF
select sys_context('userenv','instance_name') from dual;
exit
EOF

SQL*Plus: Release 12.1.0.2.0 Production on Sat Apr 18 10:41:25 2015

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Last Successful login time: Sat Apr 18 2015 10:33:53 +01:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL>
SYS_CONTEXT('USERENV','INSTANCE_NAME')
--------------------------------------------------------------------------------
STDBY2

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real    0m0.432s
user    0m0.026s
sys     0m0.028s

[oracle@rac12pri1 ~]$ time sqlplus system/secret@maa_test1 <<EOF
select sys_context('userenv','instance_name') from dual;
exit
EOF

SQL*Plus: Release 12.1.0.2.0 Production on Sat Apr 18 10:31:25 2015

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Last Successful login time: Sat Apr 18 2015 10:31:24 +01:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL>
SYS_CONTEXT('USERENV','INSTANCE_NAME')
--------------------------------------------------------------------------------
STDBY2

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real    0m0.415s
user    0m0.023s
sys     0m0.025s

That was quick: not even half a second. And as you can see, the connection was set up against database STDBY on the other cluster.

Next on the standby cluster:

[oracle@rac12sby1 ~]$ time sqlplus system/secret@maa_test1 <<EOF
select sys_context('userenv','instance_name') from dual;
exit
EOF

SQL*Plus: Release 12.1.0.2.0 Production on Sat Apr 18 10:33:26 2015

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Last Successful login time: Sat Apr 18 2015 10:33:22 +01:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL>
SYS_CONTEXT('USERENV','INSTANCE_NAME')
--------------------------------------------------------------------------------
STDBY2

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real    0m0.613s
user    0m0.025s
sys     0m0.019s

[oracle@rac12sby1 ~]$ time sqlplus system/secret@maa_test1 <<EOF
select sys_context('userenv','instance_name') from dual;
exit
EOF

SQL*Plus: Release 12.1.0.2.0 Production on Sat Apr 18 10:33:52 2015

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Last Successful login time: Sat Apr 18 2015 10:33:50 +01:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL>
SYS_CONTEXT('USERENV','INSTANCE_NAME')
--------------------------------------------------------------------------------
STDBY2

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real    0m0.199s
user    0m0.024s
sys     0m0.021s

So this was quick, too. And finally on a different machine altogether (my lab server):

[oracle@lab tns]$ time sqlplus system/secret@maa_test1 <<EOF
select sys_context('userenv', 'instance_name') from dual;
exit
EOF

SQL*Plus: Release 12.1.0.2.0 Production on Sat Apr 18 08:13:50 2015

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Last Successful login time: Sat Apr 18 2015 08:13:47 +01:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL> 
SYS_CONTEXT('USERENV','INSTANCE_NAME')
--------------------------------------------------------------------------------
STDBY2

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real	0m0.387s
user	0m0.020s
sys	0m0.018s

Does it change after a role reversal? I think I’ll cover that in a different post …

Appendix: setting up Data Guard for RAC

This is something I always forget so it’s here for reference. I use an identical tnsnames.ora file across all nodes, here it is:

CDB =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac12pri-scan)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = CDB)
    )
  ) 

STDBY =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac12sby-scan)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = STDBY)
    )
  ) 

# used for DG and RMAN duplicate
STDBY_NON_SCAN =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac12sby1-vip)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac12sby2-vip)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = STDBY)
    )
  )

CDB_NON_SCAN =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac12pri1-vip)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac12pri2-vip)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = CDB)
    )
  )

The non-scan TNS entries are a quick workaround to the problem that I registered the SIDs statically with the node listeners only, not the SCAN listeners. If there is a more elegant way of statically registering a database, please let me know! I couldn’t find any documentation on how to do this. I guess adding a SID_LIST_LISTENER_SCANx would do, too. Anyway, here is an example for the local node listener configuration (not the complete file contents). The ORACLE_SID and host names must be adapted for each node in the cluster.

LISTENER =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = IPC)(KEY = LISTENER))
    )
  )

# for broker
SID_LIST_LISTENER=
  (SID_LIST=
    (SID_DESC=
      (GLOBAL_DBNAME=STDBY)
      (ORACLE_HOME=/u01/app/oracle/product/12.1.0.2/dbhome_1)
      (SID_NAME=STDBY1))
    (SID_DESC=
      (GLOBAL_DBNAME=STDBY_DGMGRL)
      (ORACLE_HOME=/u01/app/oracle/product/12.1.0.2/dbhome_1)
      (SID_NAME=STDBY1))
  )

After changing the listener configuration you need to reload the local node listener.
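For the SID_LIST_LISTENER_SCANx idea mentioned above, the corresponding entry would presumably look like this. This is hypothetical and untested — the SCAN listener name must match your setup, and the SCAN listener would need a reload as well:

```
# hypothetical: static registration with the first SCAN listener
SID_LIST_LISTENER_SCAN1 =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = STDBY_DGMGRL)
      (ORACLE_HOME = /u01/app/oracle/product/12.1.0.2/dbhome_1)
      (SID_NAME = STDBY1)
    )
  )
```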

Posted in 12c Release 1, Data Guard, Linux | Tagged: , | 6 Comments »

Little things worth knowing: what does opatchauto actually do?

Posted by Martin Bach on April 16, 2015

I recently applied system patch 20132450 to my 12.1.0.2.0 installation on a 2 node RAC system on Oracle Linux 7.1. While ensuring that OPatch is the latest version available I came across an interesting command line option in opatchauto. It is called “-generateSteps”.

[oracle@rac12sby1 ~]$ $ORACLE_HOME/OPatch/opatchauto apply -help
OPatch Automation Tool
Copyright (c) 2015, Oracle Corporation.  All rights reserved.


DESCRIPTION
    Apply a System Patch to Oracle Home. User specified the patch
    location or the current directory will be taken as the patch location.
    opatchauto must run from the GI Home as root user.

SYNTAX
<GI_HOME>/OPatch/opatchauto apply
                            [-analyze]
                            [-database <database names> ]
                            [-generateSteps]
                            [-invPtrLoc <Path to oraInst.loc> ]
                            [-jre <LOC> ] 
                            [-norestart ]
                            [-nonrolling ]
                            [-ocmrf <OCM response file location> ]
                            [-oh <OH_LIST> ]
                            [ <Patch Location> ]

OPTIONS


       -analyze
              This option runs all the required prerequisite checks to confirm
              the patchability of the system without actually patching or 
              affecting the system in any way.

       -database
              Used to specify the RDBMS home(s) to be patched. Option value 
              is oracle database name separated by comma.

       -generateSteps
              Generate the manual steps of apply session. These steps are what
              'opatchauto apply' does actually.

       -invPtrLoc
              Used to locate the oraInst.loc file when the installation used 
              the -invPtrLoc. This should be the path to the oraInst.loc file.

       -jre
              This option uses JRE (java) from the
              specified location instead of the default location
              under Oracle Home.

       -norestart
              This option tells opatchauto not to restart the Grid Infrastructure
              stack and database home resources after patching.

       -nonrolling
              This option makes the patching session run in 'nonrolling' mode.
              It is required that the stack on the local node is running while
              it must be stopped on all the remaining nodes before the patching
              session starts.

       -ocmrf 
              This option specifies the absolute path to the OCM response file. 
              It is required if the target Oracle Home doesn't have OCM 
              installed and configured.

       -oh
              This option specifies Oracle Home(s) to be patched. This can be a
              RAC home, a Grid Home or comma separated list of multiple homes.


PARAMETERS
      Patch Location
                    Path to the location for the patch. If the patch 
                    location is not specified, then the current
                    directory is taken as the patch location.


Example:
      To patch GI home and all RAC homes: 
      '<GI_HOME>/OPatch/opatchauto apply'

      To patch multiple homes:
      '<GI_HOME>/OPatch/opatchauto apply -oh <GI_HOME>,<RAC_HOME1>,<RAC_HOME2>'

      To patch databases running from RAC homes only:
      '<RAC_HOME>/OPatch/opatchauto apply -database db1,db2...dbn'

      To patch software-only installation:
      '<RAC_HOME>/OPatch/opatchauto apply -oh <RAC_HOME>' OR
      '<GI_HOME>/OPatch/opatchauto apply -oh <GI_HOME>'


opatchauto succeeded.
[oracle@rac12sby1 ~]$ 

This looked quite interesting, so I wanted to see what it does with the system patch. I don’t have a database installed in my cluster yet; it’s going to be the standby site for another cluster (rac12pri). Unfortunately opatchauto still skips RDBMS homes without a database in them in my tests, so I ended up using the “-oh” flag in this case. Here’s the result, formatted for better readability. In reality each call to opatchauto is on a single line:

[root@rac12sby1 ~]# opatchauto apply /u01/patches/20132450 \
-oh /u01/app/12.1.0.2/grid -generateSteps -ocmrf /u01/patches/ocm.rsp
OPatch Automation Tool
Copyright (c) 2015, Oracle Corporation.  All rights reserved.

OPatchauto version : 12.1.0.1.5
OUI version        : 12.1.0.2.0
Running from       : /u01/app/12.1.0.2/grid

Invoking opatchauto utility "generateapplysteps"

To apply the patch, Please do the following manual actions:

Step 1 As the "oracle" user on the host "rac12sby1", please run the following commands:
  Action 1.1 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch version -oh /u01/app/12.1.0.2/grid -invPtrLoc \
/u01/app/12.1.0.2/grid/oraInst.loc -v2c 12.1.0.1.5

Step 2 As the "oracle" user on the host "rac12sby1", please run the following commands:
  Action 2.1 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch lsinventory -invPtrLoc \
/u01/app/12.1.0.2/grid/oraInst.loc -oh /u01/app/12.1.0.2/grid

Step 3 As the "oracle" user on the host "rac12sby1", please run the following commands:
  Action 3.1 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch prereq CheckComponents \
-ph /u01/patches/20132450/19769473 -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc \
-oh /u01/app/12.1.0.2/grid

  Action 3.2 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch prereq CheckComponents \
-ph /u01/patches/20132450/19769479 -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc \
-oh /u01/app/12.1.0.2/grid

  Action 3.3 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch prereq CheckComponents \
-ph /u01/patches/20132450/19769480 -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc \
-oh /u01/app/12.1.0.2/grid

  Action 3.4 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch prereq CheckComponents \
-ph /u01/patches/20132450/19872484 -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc \
-oh /u01/app/12.1.0.2/grid

Step 4 As the "oracle" user on the host "rac12sby1", please run the following commands:
  Action 4.1 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch prereq CheckConflictAgainstOH \
-ph /u01/patches/20132450/19769473 -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc \
-oh /u01/app/12.1.0.2/grid

  Action 4.2 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch prereq CheckConflictAgainstOH \
-ph /u01/patches/20132450/19769479 -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc \
-oh /u01/app/12.1.0.2/grid

  Action 4.3 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch prereq CheckConflictAgainstOH \
-ph /u01/patches/20132450/19769480 -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc \
-oh /u01/app/12.1.0.2/grid

  Action 4.4 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch prereq CheckConflictAgainstOH \
-ph /u01/patches/20132450/19872484 -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc \
-oh /u01/app/12.1.0.2/grid

Step 5 As the "root" user on the host "rac12sby1", please run the following commands:
  Action 5.1 [root@rac12sby1]#
  /u01/app/12.1.0.2/grid/perl/bin/perl -I/u01/app/12.1.0.2/grid/perl/lib \
-I/u01/app/12.1.0.2/grid/crs/install /u01/app/12.1.0.2/grid/crs/install/rootcrs.pl -prepatch

Step 6 As the "oracle" user on the host "rac12sby1", please run the following commands:
  Action 6.1 [oracle@rac12sby1]$
  echo /u01/patches/20132450/19769473 > /tmp/OraGI12Home1_patchList

  Action 6.2 [oracle@rac12sby1]$
  echo /u01/patches/20132450/19769479 >> /tmp/OraGI12Home1_patchList

  Action 6.3 [oracle@rac12sby1]$
  echo /u01/patches/20132450/19769480 >> /tmp/OraGI12Home1_patchList

  Action 6.4 [oracle@rac12sby1]$
  echo /u01/patches/20132450/19872484 >> /tmp/OraGI12Home1_patchList

  Action 6.5 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch napply -phBaseFile /tmp/OraGI12Home1_patchList \
-local  -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc -oh /u01/app/12.1.0.2/grid -silent

Step 7 As the "root" user on the host "rac12sby1", please run the following commands:
  Action 7.1 [root@rac12sby1]#
  /u01/app/12.1.0.2/grid/rdbms/install/rootadd_rdbms.sh

Step 8 As the "root" user on the host "rac12sby1", please run the following commands:
  Action 8.1 [root@rac12sby1]#
  /u01/app/12.1.0.2/grid/perl/bin/perl -I/u01/app/12.1.0.2/grid/perl/lib \
-I/u01/app/12.1.0.2/grid/crs/install /u01/app/12.1.0.2/grid/crs/install/rootcrs.pl -postpatch


Step 9 As the "oracle" user on the host "rac12sby1", please run the following commands:
  Action 9.1 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch lsinventory \
-invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc -oh /u01/app/12.1.0.2/grid | grep 19769473

  Action 9.2 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch lsinventory \
-invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc -oh /u01/app/12.1.0.2/grid | grep 19769479

  Action 9.3 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch lsinventory \
-invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc -oh /u01/app/12.1.0.2/grid | grep 19769480

  Action 9.4 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch lsinventory \
-invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc -oh /u01/app/12.1.0.2/grid | grep 19872484

You could of course use oplan (bundled as $ORACLE_HOME/OPatch/oplan/oplan) to generate a much more detailed profile for the patch application. For a first glance at the activity, opatchauto -generateSteps seems quite useful.

The steps all rang a bell from my 11.2.0.1 days, when this was the procedure for patching Grid Infrastructure (thank god that’s over). What I didn’t recognise was rootadd_rdbms.sh. Looking at the file I can see it changes ownership and permissions of oradism (important!) and some other executables in $ORACLE_HOME, and fixes potential problems with “fs.file-max*” in /etc/sysctl.conf.
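To see what rootadd_rdbms.sh did to oradism you can simply look at the file mode afterwards — it should end up setuid root (the path below is an assumption based on the home used in this post). The mode itself can be demonstrated on any scratch file:

```shell
# Real check (assumed path):
#   ls -l /u01/app/oracle/product/12.1.0.2/dbhome_1/bin/oradism
# should show something like: -rwsr-x--- 1 root oinstall ... oradism
# Demonstrating the setuid mode on a scratch file:
touch /tmp/demo_oradism
chmod 4750 /tmp/demo_oradism        # 4 = setuid bit, 750 = rwxr-x---
stat -c '%a %A' /tmp/demo_oradism   # prints: 4750 -rwsr-x---
```

If oradism is not setuid root, the instance cannot lock memory or renice critical background processes, which is why the script restoring these permissions matters after patching.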

Posted in 12c Release 1, Linux | Tagged: , | Leave a Comment »

Little things worth knowing: direct path inserts and referential integrity

Posted by Martin Bach on April 13, 2015

This is another post to remind myself that Oracle evolves, and what I thought I knew might no longer be relevant. So double-checking instead of assuming should become a habit!

Today’s example: direct path inserts. I seemed to remember from Oracle 9i that a direct path insert ignores referential integrity. This is still confirmed in the 9i Release 2 Concepts Guide, chapter 19 “Direct Path Insert”. Quoting from there:

During direct-path INSERT operations, Oracle appends the inserted data after existing data in the table. Data is written directly into datafiles, bypassing the buffer cache. Free space in the existing data is not reused, and referential integrity constraints are ignored

That sounds a bit harsh in today’s times so it’s worth a test. On Oracle 12.1.0.2 I created a parent/child relationship, admittedly rather crude:

SQL> create table parent (id, vc) as
  2   select distinct data_object_id, subobject_name from dba_objects
  3   where data_object_id is not null;

Table created.

SQL> alter table parent add constraint pk_parent primary key (id);

Table altered.

SQL> create table child (id number, p_id number not null, vc varchar2(100),
  2  constraint pk_child primary key (id),
  3  constraint fk_parent_child foreign key (p_id) references parent (id));

Table created.

Now when I try to insert data using a direct path insert it fails:

SQL> insert /*+ append */ into child select s_child.nextval, -1, 'this will not work' from parent;
insert /*+ append */ into child select s_child.nextval, -1, 'this will not work' from parent
*
ERROR at line 1:
ORA-02291: integrity constraint (MARTIN.FK_PARENT_CHILD) violated - parent key not found

Which is brilliant and what I expected since it prevents me from writing a lot of garbage into my table. If you have developed with Oracle then you probably know about deferrable constraints. An existing constraint can’t be changed to a status of “initially deferrable”, which is why I have to drop it and then try the insert again:

SQL> alter table child drop constraint FK_PARENT_CHILD;

Table altered.

Elapsed: 00:00:00.04
SQL> alter table child add constraint FK_PARENT_CHILD foreign key (p_id) references parent (id)
  2  deferrable initially deferred;

Table altered.

Elapsed: 00:00:00.02
SQL> insert /*+ append */ into child select s_child.nextval, -1, 'this will not work' from parent;

7640 rows created.

Elapsed: 00:00:00.30
SQL> commit;
commit
*
ERROR at line 1:
ORA-02091: transaction rolled back
ORA-02291: integrity constraint (MARTIN.FK_PARENT_CHILD) violated - parent key not found

So this constraint fires as well! Good news for me. If the data is all valid then the insert will of course succeed.

The trouble is that Oracle “silently” ignores the direct path hint and performs a conventional insert instead! I hadn’t initially put this into the post; thanks to Oren for adding it in the comments section (make sure to have a look at it).

Back to the original, non-deferrable constraint definition: inserting into the table yields the expected result.

SQL> alter table child drop constraint FK_PARENT_CHILD;

Table altered.

SQL> alter table child add constraint FK_PARENT_CHILD foreign key (p_id) references parent (id);

Table altered.

SQL> insert /*+ append */ into child select s_child.nextval, 2, 'this should work however' from parent;

7640 rows created.

SQL> commit;

Commit complete.

Summary

Again: the insert wasn’t really a “direct path insert”, see the comments. It really pays off to verify and update “old knowledge” from time to time! I still prefer valid data over garbage, and when it comes to ILM I can “move” my table to get the required compression result.
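Such a move could look like this — a sketch only, since QUERY HIGH requires Exadata or other storage supporting HCC, and the table name is just the example from this post:

```
SQL> alter table child move compress for query high;
```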

Have fun!

Posted in 12c Release 1 | 4 Comments »

Little things worth knowing: exp/imp vs expdp and impdp for HCC in Exadata

Posted by Martin Bach on March 30, 2015

Do you know the difference between exp/imp and expdp/impdp when it comes to importing HCC compressed data in Exadata?

If not, then follow me through two examples. This is on 11.2.0.3/11.2.3.3.1 but applies to all database releases you can have on Exadata. The task at hand is to export a table (which happens to be non-partitioned and HCC compressed for query high) and import it into a different user’s schema. This is quite a common approach when migrating data from a non-Exadata system into an Exadata system. You could for example pre-create the DDL for the tables and implement HCC before even importing a single row. When importing the data, the partitions’ HCC attributes will be honoured and data will be inserted compressed. Or won’t it?

The table

The table I want to export resides on our V2 system. Since I am (trying to be) a good citizen I want to use dbfs for the dump files. Beginning with 12.1.0.2 Grid Infrastructure you can also have ACFS by the way. Let’s start by creating the directory needed and some meta-information about the source table:

SQL> create or replace directory DBM01 as '/dbfs_direct/FS1/mbach/';

Directory created.

SQL> select owner, table_name, partitioned, num_rows, compress_for from dba_tables where table_name = 'T1_QH';

OWNER                          TABLE_NAME                     PAR   NUM_ROWS COMPRESS_FOR
------------------------------ ------------------------------ --- ---------- ------------
MARTIN                         T1_QH                          NO             QUERY HIGH

SQL> select bytes/power(1024,2) m, blocks from dba_segments where segment_name = 'T1_QH';

         M     BLOCKS
---------- ----------
        72       9216

The table is 72 MB in size, and HCC compressed (I actually ensured that it was by issuing an “alter table t1_qh move;” before the export started).

Data Pump

The first export uses expdp, followed by impdp to get the data back. I am remapping the schema so I don’t have to risk overwriting my source data.

[enkdb01:oracle:DBM011] /home/oracle/mbach
> expdp martin/xxxxxx directory=DBM01 dumpfile=exp_t1_qh.dmp logfile=exp_t1_qh.log tables=t1_qh

Export: Release 11.2.0.3.0 - Production on Tue Mar 24 06:15:34 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
Starting "MARTIN"."SYS_EXPORT_TABLE_02":  martin/******** directory=DBM01 dumpfile=exp_t1_qh.dmp logfile=exp_t1_qh.log tables=t1_qh
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 72 MB
Processing object type TABLE_EXPORT/TABLE/TABLE
. . exported "MARTIN"."T1_QH"                            9.675 GB 10000000 rows
Master table "MARTIN"."SYS_EXPORT_TABLE_02" successfully loaded/unloaded
******************************************************************************
Dump file set for MARTIN.SYS_EXPORT_TABLE_02 is:
  /dbfs_direct/FS1/mbach/exp_t1_qh.dmp
Job "MARTIN"."SYS_EXPORT_TABLE_02" successfully completed at 06:17:16

The interesting bit here is that the table on disk occupies around 72 MB, and yet expdp tells me the 10000000 rows occupy 9.7 GB. Can anyone guess why?

[enkdb01:oracle:DBM011] /home/oracle/mbach
> ls -lh /dbfs_direct/FS1/mbach/exp_t1_qh.dmp
-rw-r----- 1 oracle dba 9.7G Mar 24 06:17 /dbfs_direct/FS1/mbach/exp_t1_qh.dmp

Yes, nearly 10 GB: data apparently is not exported in its compressed form. Now this table is going to be imported:
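A quick back-of-the-envelope calculation based on the two sizes reported above (a sketch; the exact ratio obviously depends on the data):

```sql
-- 9.7 GB of uncompressed row data in the dump vs a 72 MB HCC segment
SELECT round((9.7 * 1024) / 72, 1) AS approx_ratio FROM dual;
-- roughly 138:1 between the raw export format and QUERY HIGH for this table
```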

[enkdb01:oracle:DBM011] /home/oracle/mbach
> impdp imptest/xxxxxx directory=DBM01 dumpfile=exp_t1_qh.dmp logfile=imp_t1_qh.log \
> tables=martin.t1_qh remap_schema=martin:imptest 

Import: Release 11.2.0.3.0 - Production on Tue Mar 24 06:23:44 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
Master table "IMPTEST"."SYS_IMPORT_TABLE_01" successfully loaded/unloaded
Starting "IMPTEST"."SYS_IMPORT_TABLE_01":  imptest/******** directory=DBM01
  dumpfile=exp_t1_qh.dmp logfile=imp_t1_qh.log tables=martin.t1_qh remap_schema=martin:imptest
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
. . imported "IMPTEST"."T1_QH"                           9.675 GB 10000000 rows
Job "IMPTEST"."SYS_IMPORT_TABLE_01" successfully completed at 06:29:53

This looks all right: all the data is back (the number of rows imported matches the export). What about the segment size?

SQL> select owner, table_name, partitioned, num_rows, compress_for from dba_tables where table_name = 'T1_QH';

OWNER                          TABLE_NAME                     PAR   NUM_ROWS COMPRESS_FOR
------------------------------ ------------------------------ --- ---------- ------------
MARTIN                         T1_QH                          NO             QUERY HIGH
IMPTEST                        T1_QH                          NO             QUERY HIGH

2 rows selected.

SQL> select bytes/power(1024,2) m, blocks from dba_segments where segment_name = 'T1_QH';

         M     BLOCKS
---------- ----------
        72       9216
        72       9216

2 rows selected.

Identical down to the block.
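If you want more assurance than matching segment sizes, DBMS_COMPRESSION can report the compression type of individual rows. A spot-check sketch (assumption: 11.2, where the constant COMP_FOR_QUERY_HIGH has the value 4):

```sql
-- Sample the compression type of the first 1,000 rows of the imported copy
SELECT dbms_compression.get_compression_type('IMPTEST', 'T1_QH', rowid) AS comp_type,
       count(*) AS cnt
  FROM imptest.t1_qh
 WHERE rownum <= 1000
 GROUP BY dbms_compression.get_compression_type('IMPTEST', 'T1_QH', rowid);
```

A result of 4 for all sampled rows would confirm they really are stored in QUERY HIGH compression units.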

Traditional Export/Import

Despite being deprecated, exp and imp are still included with 12.1.0.2, the current release at the time of writing. What if you did the same process with these instead? After all, many DBAs "grew up" with those tools and can use them in their sleep. This, plus some initial shortcomings of Data Pump in 10g, keeps exp/imp high on the list of tools we like.

Let’s export the table:

[enkdb01:oracle:DBM011] /home/oracle/mbach
> exp martin/xxxxxxx file=/dbfs_direct/FS1/mbach/exp_t1_qh_classic.dmp log=/dbfs_direct/FS1/mbach/exp_t1_qh_classic.log tables=t1_qh

Export: Release 11.2.0.3.0 - Production on Tue Mar 24 06:42:34 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Tes
Export done in US7ASCII character set and AL16UTF16 NCHAR character set

About to export specified tables via Conventional Path ...
. . exporting table                          T1_QH   10000000 rows exported
Export terminated successfully without warnings.

The dump file is more or less identical in size to the one created before by Data Pump:

[enkdb01:oracle:DBM011] /home/oracle/mbach
> ls -lh /dbfs_direct/FS1/mbach/*classic*
-rw-r--r-- 1 oracle dba 9.7G Mar 24 07:01 /dbfs_direct/FS1/mbach/exp_t1_qh_classic.dmp
-rw-r--r-- 1 oracle dba  537 Mar 24 07:01 /dbfs_direct/FS1/mbach/exp_t1_qh_classic.log

What about the import?

[enkdb01:oracle:DBM011] /home/oracle/mbach
> imp imptest/xxxxxx file=/dbfs_direct/FS1/mbach/exp_t1_qh_classic.dmp log=/dbfs_direct/FS1/mbach/imp_t1_qh_classic.log \
> FROMUSER=martin TOUSER=imptest tables=t1_qh

Import: Release 11.2.0.3.0 - Production on Tue Mar 24 07:27:42 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Tes

Export file created by EXPORT:V11.02.00 via conventional path

Warning: the objects were exported by MARTIN, not by you

import done in US7ASCII character set and AL16UTF16 NCHAR character set
. importing MARTIN's objects into IMPTEST
. . importing table                        "T1_QH"

And this takes a loooong time. What's happening in the background? I first checked what the session was doing. That's simple: have a look at the output.

SQL> @scripts/as

   INST_ID   SID    SERIAL# USERNAME      PROG       SQL_ID         CHILD PLAN_HASH_VALUE      EXECS   AVG_ETIME OFF SQL_TEXT
---------- ----- ---------- ------------- ---------- ------------- ------ --------------- ---------- ----------- --- -----------------------------------------
         1   594         23 IMPTEST       imp@enkdb0 fgft1tcrr12ga      0               0      22723         .00 No  INSERT /*+NESTED_TABLE_SET_REFS+*/ INTO "

The number of executions was interesting and indeed, after a couple of minutes, that counter had gone up a lot:

   INST_ID   SID    SERIAL# USERNAME      PROG       SQL_ID         CHILD PLAN_HASH_VALUE      EXECS   AVG_ETIME OFF SQL_TEXT
---------- ----- ---------- ------------- ---------- ------------- ------ --------------- ---------- ----------- --- -----------------------------------------
         1   594         23 IMPTEST       imp@enkdb0 fgft1tcrr12ga      0               0     639818         .00 Yes INSERT /*+NESTED_TABLE_SET_REFS+*/ INTO "

You can probably guess what's coming next. After quite a while the import finished:

> imp imptest/xxxxxxx file=/dbfs_direct/FS1/mbach/exp_t1_qh_classic.dmp log=/dbfs_direct/FS1/mbach/imp_t1_qh_classic.log \
> FROMUSER=martin TOUSER=imptest tables=t1_qh

Import: Release 11.2.0.3.0 - Production on Tue Mar 24 07:27:42 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Tes

Export file created by EXPORT:V11.02.00 via conventional path

Warning: the objects were exported by MARTIN, not by you

import done in US7ASCII character set and AL16UTF16 NCHAR character set
. importing MARTIN's objects into IMPTEST
. . importing table                        "T1_QH"   10000000 rows imported
Import terminated successfully without warnings.

And now for the reason for this blog post:

SQL> select owner, bytes/power(1024,2) m, blocks from dba_segments where segment_name = 'T1_QH';

OWNER                                   M     BLOCKS
------------------------------ ---------- ----------
MARTIN                                 72       9216
IMPTEST                             11227    1437056

The newly imported table was not compressed at all when imported using the traditional path: the conventional inserts performed by imp do not create HCC compression units.
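If you end up in this situation, the fix is the one hinted at earlier: move the table to recompress it. A sketch (the move takes time and locks the table, so plan accordingly):

```sql
ALTER TABLE imptest.t1_qh MOVE COMPRESS FOR QUERY HIGH;

-- The segment should shrink back to the familiar size:
SELECT owner, bytes/power(1024,2) m, blocks
  FROM dba_segments
 WHERE segment_name = 'T1_QH';
```

Remember that a move renders any indexes on the table unusable; they need rebuilding afterwards.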

Posted in 11g Release 2, 12c Release 1, Exadata | Tagged: , , , , | Leave a Comment »

Understanding enhancements to block cleanouts in Exadata part 2

Posted by Martin Bach on February 25, 2015

In part 1 of the series I tried to explain (probably a bit too verbose when it came to session statistics) what the effect is of delayed block cleanout and buffered I/O. In the final example the “dirty” blocks on disk have been cleaned out in the buffer cache, greatly reducing the amount of work to be done when reading them.

Catching up with the present, and with direct path reads. You probably noticed that the migration to 11.2 caused your I/O patterns to change: suitably large segments are now read using direct path reads not only during parallel query but potentially also during serial execution of a query. Since blocks read via a direct path read do not end up in the buffer cache, there is an interesting side effect on block cleanouts. The scenario is the same unrealistic yet reproducible one presented in part 1 of this article.
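Whether a given scan actually went through the direct path can be confirmed from the session statistics. A sketch, assuming it is run in the same session as the query and the v$ views are accessible:

```sql
SELECT sn.name, ms.value
  FROM v$mystat ms
  JOIN v$statname sn ON sn.statistic# = ms.statistic#
 WHERE sn.name IN ('table scans (direct read)', 'physical reads direct');
```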

Enter Direct Path Reads – non-Exadata

To be absolutely sure I am getting results without any optimisation offered by Exadata I am running the example on different hardware. I am also using 11.2.0.4 because that’s the version I had on my lab server. The principles here apply between versions unless I am very mistaken.

I am repeating my test under realistic conditions, leaving _serial_direct_read at its default, "auto". This means I am almost certainly going to see direct path reads when scanning my table. As Christian Antognini has pointed out, direct path reads have an impact on the amount of work that has to be done. The test commences with an update in session 1 updating the whole table, followed by a flush of the buffer cache to disk to ensure the blocks have an active transaction in the ITL part of the header. The selects in session 2 show the following behaviour: the first execution takes longer, as expected; the second one is faster.
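The setup in session 1 can be sketched like this (the column in the update is mine and purely illustrative; what matters is that every block is dirtied and written to disk while the transaction remains open):

```sql
-- session 1: dirty every block of the table, do NOT commit
UPDATE t1_100k SET vc = upper(vc);

-- as a privileged user: force the dirty buffers to disk, so that every
-- block on disk carries an open ITL entry for the uncommitted transaction
ALTER SYSTEM FLUSH BUFFER_CACHE;
```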

SQL> select /* not_exadata */ count(*) from t1_100k;

  COUNT(*)
----------
    500000

Elapsed: 00:00:20.62

SQL> @scripts/mystats stop r=physical|cleanout|consistent|cell|table
...
------------------------------------------------------------------------------------------
2. Statistics Report
------------------------------------------------------------------------------------------

Type    Statistic Name                                                               Value
------  ----------------------------------------------------------------  ----------------
STAT    active txn count during cleanout                                            83,334
STAT    cell physical IO interconnect bytes                                    747,798,528
STAT    cleanout - number of ktugct calls                                           83,334
STAT    cleanouts and rollbacks - consistent read gets                              83,334
STAT    consistent changes                                                           1,019
STAT    consistent gets                                                            751,365
STAT    consistent gets - examination                                              667,178
STAT    consistent gets direct                                                      83,334
STAT    consistent gets from cache                                                 668,031
STAT    consistent gets from cache (fastpath)                                          850
STAT    data blocks consistent reads - undo records applied                        583,334
STAT    immediate (CR) block cleanout applications                                  83,334
STAT    physical read IO requests                                                    8,631
STAT    physical read bytes                                                    747,798,528
STAT    physical read total IO requests                                              8,631
STAT    physical read total bytes                                              747,798,528
STAT    physical read total multi block requests                                       673
STAT    physical reads                                                              91,284
STAT    physical reads cache                                                         7,950
STAT    physical reads direct                                                       83,334
STAT    table fetch by rowid                                                           170
STAT    table scan blocks gotten                                                    83,334
STAT    table scan rows gotten                                                     500,000
STAT    table scans (direct read)                                                        1
STAT    table scans (short tables)                                                       1


------------------------------------------------------------------------------------------
3. About
------------------------------------------------------------------------------------------
- MyStats v2.01 by Adrian Billington (http://www.oracle-developer.net)
- Based on the SNAP_MY_STATS utility by Jonathan Lewis

==========================================================================================
End of report
==========================================================================================

The second execution shows slightly better performance as seen before in part 1. The relevant statistics are shown here:


SQL> select /* not_exadata */ count(*) from t1_100k;

  COUNT(*)
----------
    500000

Elapsed: 00:00:04.60

SQL> @scripts/mystats stop r=physical|cleanout|consistent|cell|table
...
------------------------------------------------------------------------------------------
2. Statistics Report
------------------------------------------------------------------------------------------

Type    Statistic Name                                                               Value
------  ----------------------------------------------------------------  ----------------
STAT    active txn count during cleanout                                            83,334
STAT    cell physical IO interconnect bytes                                    682,672,128
STAT    cleanout - number of ktugct calls                                           83,334
STAT    cleanouts and rollbacks - consistent read gets                              83,334
STAT    consistent changes                                                             472
STAT    consistent gets                                                            750,250
STAT    consistent gets - examination                                              666,671
STAT    consistent gets direct                                                      83,334
STAT    consistent gets from cache                                                 666,916
STAT    consistent gets from cache (fastpath)                                          245
STAT    data blocks consistent reads - undo records applied                        583,334
STAT    immediate (CR) block cleanout applications                                  83,334
STAT    physical read IO requests                                                      681
STAT    physical read bytes                                                    682,672,128
STAT    physical read total IO requests                                                681
STAT    physical read total bytes                                              682,672,128
STAT    physical read total multi block requests                                       673
STAT    physical reads                                                              83,334
STAT    physical reads direct                                                       83,334
STAT    table fetch by rowid                                                             1
STAT    table scan blocks gotten                                                    83,334
STAT    table scan rows gotten                                                     500,000
STAT    table scans (direct read)                                                        1


------------------------------------------------------------------------------------------
3. About
------------------------------------------------------------------------------------------
- MyStats v2.01 by Adrian Billington (http://www.oracle-developer.net)
- Based on the SNAP_MY_STATS utility by Jonathan Lewis

==========================================================================================
End of report
==========================================================================================

The amount of I/O is slightly lower. Still, all the CR processing is done for every block in the table: each has an active transaction (active txn count during cleanout). As you would expect, the buffer cache holds no blocks from the segment:

SQL> select count(*), inst_id, status from gv$bh 
  2  where objd = (select data_object_id from dba_objects where object_name = 'T1_100K') 
  3  group by inst_id, status;

  COUNT(*)    INST_ID STATUS
---------- ---------- ----------
     86284          2 free
    584983          1 free
         1          1 scur

Subsequent executions all take approximately the same amount of time. Every execution has to perform cleanouts and rollbacks: no real difference to the buffered reads from part 1, except that direct path reads are used and no blocks from T1_100K end up in the buffer cache.

Commit

Committing in session 1 does not have much of an effect: the blocks read by the direct path read go to the PGA instead of the buffer cache, so the cleaned-out block versions are never written back. A quick demonstration: the same select has been executed after the transaction in session 1 committed.

SQL> select /* not_exadata */ count(*) from t1_100k;

  COUNT(*)
----------
    500000

Elapsed: 00:00:04.87

SQL> @scripts/mystats stop r=physical|cleanout|consistent|cell|table
...
------------------------------------------------------------------------------------------
2. Statistics Report
------------------------------------------------------------------------------------------

Type    Statistic Name                                                               Value
------  ----------------------------------------------------------------  ----------------
STAT    cell physical IO interconnect bytes                                    682,672,128
STAT    cleanout - number of ktugct calls                                           83,334
STAT    cleanouts only - consistent read gets                                       83,334
STAT    commit txn count during cleanout                                            83,334
STAT    consistent changes                                                             481
STAT    consistent gets                                                            166,916
STAT    consistent gets - examination                                               83,337
STAT    consistent gets direct                                                      83,334
STAT    consistent gets from cache                                                  83,582
STAT    consistent gets from cache (fastpath)                                          245
STAT    immediate (CR) block cleanout applications                                  83,334
STAT    physical read IO requests                                                      681
STAT    physical read bytes                                                    682,672,128
STAT    physical read total IO requests                                                681
STAT    physical read total bytes                                              682,672,128
STAT    physical read total multi block requests                                       673
STAT    physical reads                                                              83,334
STAT    physical reads direct                                                       83,334
STAT    table fetch by rowid                                                             1
STAT    table scan blocks gotten                                                    83,334
STAT    table scan rows gotten                                                     500,000
STAT    table scans (direct read)                                                        1


------------------------------------------------------------------------------------------
3. About
------------------------------------------------------------------------------------------
- MyStats v2.01 by Adrian Billington (http://www.oracle-developer.net)
- Based on the SNAP_MY_STATS utility by Jonathan Lewis

==========================================================================================
End of report
==========================================================================================

As you can see, the work done for the cleanout is repeated every time, even though active txn count during cleanout no longer appears in the output (the transaction has committed, so commit txn count during cleanout shows up instead). And still there aren't really any blocks pertaining to T1_100K in the buffer cache.

SQL> select count(*), inst_id, status from gv$bh where
  2   objd = (select data_object_id from dba_objects where object_name = 'T1_100K')
  3  group by inst_id, status;

  COUNT(*)    INST_ID STATUS
---------- ---------- ----------
     86284          2 free
    584983          1 free
         1          1 scur

OK, that should be enough for now. Over to the Exadata.

Direct Path Reads – Exadata

The same test again, but this time on an X2-2 quarter rack running database 12.1.0.2 and cellsrv 12.1.2.1.0. After updating the entire table and flushing the blocks to disk, here are the statistics for the first and second execution.

SQL> select count(*) from t1_100k;

  COUNT(*)
----------
    500000

Elapsed: 00:00:17.41

SQL> @scripts/mystats stop r=physical|cleanout|consistent|cell|table


------------------------------------------------------------------------------------------
2. Statistics Report
------------------------------------------------------------------------------------------

Type    Statistic Name                                                               Value
------  ----------------------------------------------------------------  ----------------
STAT    active txn count during cleanout                                            83,334
STAT    cell IO uncompressed bytes                                             682,672,128
STAT    cell blocks processed by cache layer                                        83,933
STAT    cell commit cache queries                                                   83,933
STAT    cell flash cache read hits                                                  14,011
STAT    cell num smartio automem buffer allocation attempts                              1
STAT    cell physical IO bytes eligible for predicate offload                  682,672,128
STAT    cell physical IO interconnect bytes                                    804,842,256
STAT    cell physical IO interconnect bytes returned by smart scan             682,904,336
STAT    cell scans                                                                       1
STAT    cleanout - number of ktugct calls                                           83,334
STAT    cleanouts and rollbacks - consistent read gets                              83,334
STAT    consistent changes                                                             796
STAT    consistent gets                                                            917,051
STAT    consistent gets direct                                                      83,334
STAT    consistent gets examination                                                833,339
STAT    consistent gets examination (fastpath)                                      83,336
STAT    consistent gets from cache                                                 833,717
STAT    consistent gets pin                                                            378
STAT    consistent gets pin (fastpath)                                                 377
STAT    data blocks consistent reads - undo records applied                        750,002
STAT    immediate (CR) block cleanout applications                                  83,334
STAT    physical read IO requests                                                   16,147
STAT    physical read bytes                                                    804,610,048
STAT    physical read requests optimized                                            14,011
STAT    physical read total IO requests                                             16,147
STAT    physical read total bytes                                              804,610,048
STAT    physical read total bytes optimized                                    114,778,112
STAT    physical read total multi block requests                                       851
STAT    physical reads                                                              98,219
STAT    physical reads cache                                                        14,885
STAT    physical reads direct                                                       83,334
STAT    table fetch by rowid                                                             1
STAT    table scan blocks gotten                                                    83,334
STAT    table scan disk non-IMC rows gotten                                        500,000
STAT    table scan rows gotten                                                     500,000
STAT    table scans (direct read)                                                        1


------------------------------------------------------------------------------------------
3. About
------------------------------------------------------------------------------------------
- MyStats v2.01 by Adrian Billington (http://www.oracle-developer.net)
- Based on the SNAP_MY_STATS utility by Jonathan Lewis

==========================================================================================
End of report
==========================================================================================


SQL> -- second execution, still no commit in session 1
SQL> select count(*) from t1_100k;

  COUNT(*)
----------
    500000

Elapsed: 00:00:01.31

SQL> @scripts/mystats stop r=physical|cleanout|consistent|cell|table
...
------------------------------------------------------------------------------------------
2. Statistics Report
------------------------------------------------------------------------------------------

Type    Statistic Name                                                               Value
------  ----------------------------------------------------------------  ----------------
STAT    active txn count during cleanout                                            83,334
STAT    cell IO uncompressed bytes                                             682,672,128
STAT    cell blocks processed by cache layer                                        83,949
STAT    cell commit cache queries                                                   83,949
STAT    cell flash cache read hits                                                   1,269
STAT    cell num smartio automem buffer allocation attempts                              1
STAT    cell physical IO bytes eligible for predicate offload                  682,672,128
STAT    cell physical IO interconnect bytes                                    682,907,280
STAT    cell physical IO interconnect bytes returned by smart scan             682,907,280
STAT    cell scans                                                                       1
STAT    cleanout - number of ktugct calls                                           83,334
STAT    cleanouts and rollbacks - consistent read gets                              83,334
STAT    consistent changes                                                             791
STAT    consistent gets                                                            917,053
STAT    consistent gets direct                                                      83,334
STAT    consistent gets examination                                                833,339
STAT    consistent gets examination (fastpath)                                      83,337
STAT    consistent gets from cache                                                 833,719
STAT    consistent gets pin                                                            380
STAT    consistent gets pin (fastpath)                                                 380
STAT    data blocks consistent reads - undo records applied                        750,002
STAT    immediate (CR) block cleanout applications                                  83,334
STAT    physical read IO requests                                                    1,278
STAT    physical read bytes                                                    682,672,128
STAT    physical read requests optimized                                             1,269
STAT    physical read total IO requests                                              1,278
STAT    physical read total bytes                                              682,672,128
STAT    physical read total bytes optimized                                    678,494,208
STAT    physical read total multi block requests                                       885
STAT    physical reads                                                              83,334
STAT    physical reads direct                                                       83,334
STAT    table fetch by rowid                                                             1
STAT    table scan blocks gotten                                                    83,334
STAT    table scan disk non-IMC rows gotten                                        500,000
STAT    table scan rows gotten                                                     500,000
STAT    table scans (direct read)                                                        1


------------------------------------------------------------------------------------------
3. About
------------------------------------------------------------------------------------------
- MyStats v2.01 by Adrian Billington (http://www.oracle-developer.net)
- Based on the SNAP_MY_STATS utility by Jonathan Lewis

==========================================================================================
End of report
==========================================================================================

In this example, table scans (direct read) indicates that the segment was read using a direct path read. Since this example was executed on an Exadata, the direct path read was performed as a Smart Scan (cell scans = 1). However, the Smart Scan was not very successful: although all 83,334 table blocks were opened by the cache layer (the first one to touch a block on the cell during the Smart Scan), none of them passed the examination of the transaction layer; they have all been discarded. You can see that there was no saving from using the Smart Scan here: cell physical IO bytes eligible for predicate offload is equal to cell physical IO interconnect bytes returned by smart scan.
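You can put a number on that lack of saving using the two figures above (a trivial sketch):

```sql
-- bytes returned by the smart scan vs bytes eligible for offload
SELECT round(682904336 / 682672128 * 100, 1) AS pct_returned FROM dual;
-- about 100 percent: nothing was filtered out on the cells for this scan
```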

83,334 blocks were read during the table scan (table scan blocks gotten), all of which were read directly (physical reads direct). All of the blocks read had an active transaction (active txn count during cleanout) and all of these had to be rolled back (cleanouts and rollbacks – consistent read gets).

The same amount of work has to be done for every read.

Commit in session 1

After committing the transaction in session 1 the statistics for the select statement in session 2 are as follows (also note the execution time)

SQL> select count(*) from t1_100k;

  COUNT(*)
----------
    500000

Elapsed: 00:00:00.16

SQL> @scripts/mystats stop r=physical|cleanout|consistent|cell|table

------------------------------------------------------------------------------------------
2. Statistics Report
------------------------------------------------------------------------------------------

Type    Statistic Name                                                               Value
------  ----------------------------------------------------------------  ----------------
STAT    cell IO uncompressed bytes                                             682,672,128
STAT    cell blocks helped by commit cache                                          83,334
STAT    cell blocks processed by cache layer                                        83,334
STAT    cell blocks processed by data layer                                         83,334
STAT    cell blocks processed by txn layer                                          83,334
STAT    cell commit cache queries                                                   83,334
STAT    cell flash cache read hits                                                     663
STAT    cell num smartio automem buffer allocation attempts                              1
STAT    cell physical IO bytes eligible for predicate offload                  682,672,128
STAT    cell physical IO interconnect bytes                                     13,455,400
STAT    cell physical IO interconnect bytes returned by smart scan              13,455,400
STAT    cell scans                                                                       1
STAT    cell transactions found in commit cache                                     83,334
STAT    consistent changes                                                             791
STAT    consistent gets                                                             83,717
STAT    consistent gets direct                                                      83,334
STAT    consistent gets examination                                                      3
STAT    consistent gets examination (fastpath)                                           3
STAT    consistent gets from cache                                                     383
STAT    consistent gets pin                                                            380
STAT    consistent gets pin (fastpath)                                                 380
STAT    physical read IO requests                                                      663
STAT    physical read bytes                                                    682,672,128
STAT    physical read requests optimized                                               663
STAT    physical read total IO requests                                                663
STAT    physical read total bytes                                              682,672,128
STAT    physical read total bytes optimized                                    682,672,128
STAT    physical read total multi block requests                                       654
STAT    physical reads                                                              83,334
STAT    physical reads direct                                                       83,334
STAT    table fetch by rowid                                                             1
STAT    table scan blocks gotten                                                    83,334
STAT    table scan disk non-IMC rows gotten                                        500,000
STAT    table scan rows gotten                                                     500,000
STAT    table scans (direct read)                                                        1
STAT    table scans (short tables)                                                       1


------------------------------------------------------------------------------------------
3. About
------------------------------------------------------------------------------------------
- MyStats v2.01 by Adrian Billington (http://www.oracle-developer.net)
- Based on the SNAP_MY_STATS utility by Jonathan Lewis

==========================================================================================
End of report
==========================================================================================

There are no more active transactions found during the cleanout. The blocks, however, still have to be cleaned out every time the direct path read is performed: as a matter of design principle, Oracle cannot reuse the blocks it has read with a direct path read (or Smart Scan, for that matter) because they simply are not cached.

But have a look at the execution time: that is quite a nice performance improvement. In fact there are a few things worth mentioning:

  • The segment is successfully scanned via a Smart Scan. How can you tell?
    • cell scans = 1
    • cell physical IO interconnect bytes returned by smart scan is a lot lower than cell physical IO bytes eligible for predicate offload
    • You see cell blocks processed by (cache|transaction|data) layer matching the number of blocks the table is made up of. Previously you only saw blocks processed by the cache layer
  • The commit cache made this happen
    • cell commit cache queries
    • cell blocks helped by commit cache
    • cell transactions found in commit cache
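
Those two byte counters make the saving easy to quantify. Here is a quick sketch of the arithmetic (Python, purely for illustration, using the post-commit figures from the report above):

```python
def offload_saving(eligible_bytes, returned_bytes):
    """Percentage of eligible bytes that did NOT have to travel
    over the interconnect to the RDBMS layer."""
    return 100.0 * (1.0 - returned_bytes / eligible_bytes)

# Post-commit figures from the report above: almost everything
# was filtered on the cells
print(offload_saving(682_672_128, 13_455_400))   # ~98% stayed on the cells

# Pre-commit, eligible equals returned, so the saving is zero
print(offload_saving(682_672_128, 682_672_128))  # 0.0
```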

The commit cache “lives” in the cells and probably caches recently committed transactions. If a block is read by the cell and an active transaction is found, the cell has a couple of options to avoid a round trip of the block to the RDBMS layer for consistent read processing:

  1. If possible it uses the minscn optimisation. Oracle tracks the minimum SCN of all active transactions, and the cell software can compare the SCN from the ITL in the block with that lowest active SCN. If the minimum active SCN is greater than the SCN found in the block, the transaction must have already completed and it is safe to read the block. This optimisation is not shown in this blog post; you would see cell blocks helped by minscn optimization increase
  2. If the first optimisation does not help, the cells make use of the commit cache, visible as cell commit cache queries and, if a cache hit has been scored, cell blocks helped by commit cache. While the transaction in session 1 had not completed yet you could only see the commit cache queries; after the commit the queries were successful

If both of them fail the block must be sent to the RDBMS layer for consistent read processing. In that case there is no difference to how direct path read blocks are treated outside Exadata.
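
The two optimisations and the fallback form a simple decision flow, which can be modelled as follows. This is a conceptual sketch only: the function and parameter names are mine and do not correspond to anything in the actual cellsrv implementation.

```python
def cell_block_decision(itl_scn, min_active_scn, commit_cache, xid):
    """Conceptual model of how a cell might decide whether it can
    process a block itself instead of shipping it to the RDBMS layer."""
    # 1. minscn optimisation: an ITL SCN below the minimum SCN of all
    #    active transactions means the transaction must have committed
    if itl_scn < min_active_scn:
        return "served by cell (minscn optimisation)"
    # 2. commit cache: was this transaction recently seen as committed?
    if xid in commit_cache:
        return "served by cell (commit cache hit)"
    # 3. neither helped: consistent read processing on the RDBMS layer
    return "shipped to RDBMS layer"

# While session 1 is uncommitted: every block is a commit cache miss
print(cell_block_decision(100, 90, set(), "0009.005.00002575"))
# After the commit the cache answers the query
print(cell_block_decision(100, 90, {"0009.005.00002575"}, "0009.005.00002575"))
```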

Summary

The commit cache is a great enhancement Exadata offers to databases using the platform. While non-Exadata deployments have to clean out blocks that haven't been subject to a cleanout every time the “dirty” block is read via direct path read, Exadata can avoid this once the transaction has committed.

Posted in Exadata, Linux | Leave a Comment »

Understanding enhancements to block cleanouts in Exadata part 1

Posted by Martin Bach on February 23, 2015

Travel time is writing time, and I have the perfect setting for a techie post. Actually I got quite excited about the subject, causing the article to get a bit longer than initially anticipated. In this part you can read about block cleanouts when using buffered I/O. The next part will show how this works using direct path reads and Smart Scans.

The article ultimately aims at describing the enhancements Exadata brings to the table for direct path reads and delayed block cleanouts. Delayed block cleanouts are described in Jonathan Lewis’s “Oracle Core”, and in one of his blog posts, so here’s just a summary.

The delayed block cleanout

In a nutshell, Oracle's database writer (the process persisting your transactions to disk) is free to write blocks to disk in batches when it has to be done. A commit on its own won't trigger a write of the dirty (modified) blocks to disk. If it did, the commit time would be proportional to the number of blocks affected by the last transaction. The commit command, however, completes very quickly regardless of how much data has been affected (unlike a rollback …). It is entirely possible that a block modified by a transaction is written to disk before the transaction has completed. A little later, once the commit has been acknowledged by Oracle, there is no process that would read the blocks back into the buffer cache and clean them out; this happens later. It would also be quite inefficient to do so.

Defining “clean”

Now when the database writer writes to disk it is possible that the block just written has an active transaction recorded in its header. You can see this by dumping a block with an active transaction: the ITL in the header will reference the XID, the UBA and the number of rows affected, plus other information. The individual rows that are part of the transaction have their lock byte set, as you can see in the row directory. The number in the lb field refers back to an ITL entry you see further up in the block dump (don't worry, I'll show you an example shortly).

What happened in Oracle before direct path reads became a lot more common is this:

  • A user starts a transaction, for example by updating a large portion of the table
  • In the meantime the database writer flushes some blocks to disk, including some of the ones affected by the transaction.
  • The user commits a transaction
  • The next user queries the block (let’s assume it is not in the buffer cache)
  • The second session’s foreground process reads the block
    • Realises that it has an active transaction recorded in it
    • Checks if that transaction is still active
    • And clears the block out if not
  • The block with a valid CR image is now in the buffer cache
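
The steps above can be sketched as follows. Python stands in for pseudocode here; the function names are mine and merely illustrate the flow, they do not correspond to actual kernel functions:

```python
def read_block(block, active_transactions, buffer_cache):
    """Conceptual model of a foreground process reading a block whose
    header still records a transaction."""
    itl = block.get("itl")  # interested transaction list entry, if any
    if itl is not None:
        if itl["xid"] in active_transactions:
            # still active: build a consistent-read clone by applying undo
            block = dict(block, cr_clone=True)
        else:
            # committed in the meantime: clean the block out (clears the
            # lock bytes, records the commit SCN - and generates redo!)
            block = dict(block, itl=None, cleaned=True)
    buffer_cache.append(block)  # the valid CR image is now cached
    return block

cache = []
read_block({"itl": {"xid": "9.5.9589"}}, active_transactions=set(), buffer_cache=cache)
print(cache[0]["cleaned"])  # True: the block was cleaned out on read
```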

Huh?

Sounds too abstract? I agree, so let's have an example.

SQL> select table_name,num_rows,compression,partitioned from tabs
  2  where table_name = 'T1_100K';

TABLE_NAME                       NUM_ROWS COMPRESS PAR
------------------------------ ---------- -------- ---
T1_100K                            500000 DISABLED NO

SQL> update T1_100k set state = 'MODIFIED';

500000 rows updated.

An easy way to simulate the flushing of blocks to disk is a slightly brute-force approach of flushing the buffer cache. Only recommended in the lab, really. Let’s have a look at what the block looks like:

SQL> select dbms_rowid.rowid_block_number(rowid), dbms_rowid.rowid_relative_fno(rowid)
  2  from t1_100k where rownum < 11;

DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID) DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID)
------------------------------------ ------------------------------------
                             3940811                                    5
                             3940811                                    5
                             3940811                                    5
                             3940811                                    5
                             3940811                                    5
                             3940811                                    5
                             3940812                                    5
                             3940812                                    5
                             3940812                                    5
                             3940812                                    5

10 rows selected.

Elapsed: 00:00:00.01
SQL> alter system dump datafile 5 block 3940811;

System altered.

Elapsed: 00:00:00.01
SQL> select value from v$diag_info where name like 'Default%';

VALUE
---------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/dbm01/dbm011/trace/dbm011_ora_94351.trc

Elapsed: 00:00:00.02

The block dump for block 3940811 on datafile 5 (users tablespace) is now in that trace file:

Block header dump:  0x017c21cb
 Object id on Block? Y
 seg/obj: 0x12a13  csc: 0x00.2191ff7  itc: 3  flg: E  typ: 1 - DATA
     brn: 0  bdba: 0x17c21c8 ver: 0x01 opc: 0
     inc: 0  exflg: 0

 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0xffff.000.00000000  0x00000000.0000.00  C---    0  scn 0x0000.02191d10
0x02   0x0009.005.00002575  0x0003688a.0ef1.1f  ----    6  fsc 0x0000.00000000
0x03   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
bdba: 0x017c21cb
data_block_dump,data header at 0x7f026361707c
===============
tsiz: 0x1f80
hsiz: 0x1e
pbl: 0x7f026361707c
     76543210
flag=--------
ntab=1
nrow=6
...
0xe:pti[0]      nrow=6  offs=0
0x12:pri[0]     offs=0x1b75
0x14:pri[1]     offs=0x176a
0x16:pri[2]     offs=0x135f
0x18:pri[3]     offs=0xf54
0x1a:pri[4]     offs=0xb49
0x1c:pri[5]     offs=0x73e
block_row_dump:
tab 0, row 0, @0x1b75
tl: 1035 fb: --H-FL-- lb: 0x2  cc: 6
col  0: [ 2]  c1 02
col  1: [999]
 31 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
...

The significant pieces of information here are the ITL (interested transaction list) entries and the lock byte set in the row directory, pointing to the second ITL entry. The absence of flags for ITL #2 indicates the transaction is ongoing (you can see this in v$transaction; this is a lab environment, in real databases you'd see more than one active transaction):

SQL> select xidusn, xidslot, xidsqn, status, start_scn from v$transaction;

    XIDUSN    XIDSLOT     XIDSQN STATUS            START_SCN
---------- ---------- ---------- ---------------- ----------
         9          5       9589 ACTIVE             35200741

This is indeed the transaction that is referenced in v$transaction and in ITL 0x02 above (9589 is the decimal equivalent of hexadecimal 2575; 9 and 5 don't need converting). If you are curious about some more internals you might find this post by Arup Nanda interesting.
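
You can verify the conversion yourself; a couple of lines (Python used here just for the arithmetic) map the ITL entry back to the v$transaction columns:

```python
# XID as shown in ITL slot 0x02: usn.slot.sqn, all in hexadecimal
itl_xid = "0009.005.00002575"
usn, slot, sqn = (int(part, 16) for part in itl_xid.split("."))
print(usn, slot, sqn)  # 9 5 9589 - matching XIDUSN, XIDSLOT, XIDSQN above
```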

Now what happens if another session queries the table? Since the transaction in session 1 has not yet committed, its changes must not be seen by any other session (that would be a dirty read, and Oracle RDBMS does not do dirty reads). So, to give the user a consistent view of the data, the block must be rolled back. Before the test I made sure there wasn't a single block in the buffer cache for the object in question:

SQL> select count(*), inst_id, status from gv$bh where objd =
  2   (select data_object_id from dba_objects where object_name = 'T1_100K')
  3  group by inst_id, status;

  COUNT(*)    INST_ID STATUS
---------- ---------- ----------
    524327          1 free

Let’s test:

SQL> @mystats start

SQL> select /* test004 */ count(*) from t1_100k;

  COUNT(*)
----------
    500000

Elapsed: 00:00:13.96

@mystats stop t=1

==========================================================================================
MyStats report : 22-FEB-2015 08:42:09
==========================================================================================


------------------------------------------------------------------------------------------
1. Summary Timings
------------------------------------------------------------------------------------------

Type    Statistic Name                                                               Value
------  ----------------------------------------------------------------  ----------------
TIMER   snapshot interval (seconds)                                                  35.57
TIMER   CPU time used (seconds)                                                       3.21


------------------------------------------------------------------------------------------
2. Statistics Report
------------------------------------------------------------------------------------------

Type    Statistic Name                                                               Value
------  ----------------------------------------------------------------  ----------------
STAT    CPU used by this session                                                       327
STAT    CPU used when call started                                                     327
STAT    CR blocks created                                                           83,334
STAT    DB time                                                                      1,401
STAT    Requests to/from client                                                         16
STAT    SQL*Net roundtrips to/from client                                               16
STAT    active txn count during cleanout                                            83,334
STAT    buffer is not pinned count                                                       2
STAT    bytes received via SQL*Net from client                                      13,279
STAT    bytes sent via SQL*Net to client                                             5,106
STAT    calls to get snapshot scn: kcmgss                                              758
STAT    calls to kcmgas                                                             83,332
STAT    calls to kcmgcs                                                                 30
STAT    cell flash cache read hits                                                   8,045
STAT    cell physical IO interconnect bytes                                    748,576,768
STAT    change write time                                                               48
STAT    cleanout - number of ktugct calls                                           83,334
STAT    cleanouts and rollbacks - consistent read gets                              83,334
STAT    consistent changes                                                         591,717
STAT    consistent gets                                                            757,980
STAT    consistent gets examination                                                674,254
STAT    consistent gets examination (fastpath)                                      83,337
STAT    consistent gets from cache                                                 757,980
STAT    consistent gets pin                                                         83,726
STAT    consistent gets pin (fastpath)                                              83,055
STAT    data blocks consistent reads - undo records applied                        590,917
STAT    db block changes                                                            84,134
STAT    db block gets                                                                2,301
STAT    db block gets from cache                                                     2,301
STAT    enqueue releases                                                                 4
STAT    enqueue requests                                                                 4
STAT    execute count                                                                   13
STAT    file io wait time                                                       11,403,214
STAT    free buffer requested                                                      174,766
STAT    gc local grants                                                            174,713
STAT    global enqueue gets sync                                                        43
STAT    global enqueue releases                                                         41
STAT    heap block compress                                                          4,549
STAT    hot buffers moved to head of LRU                                                 4
STAT    immediate (CR) block cleanout applications                                  83,334
STAT    index fetch by key                                                               1
STAT    lob writes                                                                     375
STAT    lob writes unaligned                                                           375
STAT    logical read bytes from cache                                        6,228,221,952
STAT    messages sent                                                                    9
STAT    non-idle wait count                                                         17,446
STAT    non-idle wait time                                                           1,141
STAT    opened cursors cumulative                                                       13
STAT    parse count (hard)                                                               1
STAT    parse count (total)                                                             13
STAT    physical read IO requests                                                    8,715
STAT    physical read bytes                                                    748,576,768
STAT    physical read requests optimized                                             8,045
STAT    physical read total IO requests                                              8,715
STAT    physical read total bytes                                              748,576,768
STAT    physical read total bytes optimized                                     65,904,640
STAT    physical read total multi block requests                                       654
STAT    physical reads                                                              91,379
STAT    physical reads cache                                                        91,379
STAT    physical reads cache prefetch                                               82,664
STAT    recursive calls                                                              1,985
STAT    recursive cpu usage                                                              6
STAT    redo entries                                                                84,005
STAT    redo entries for lost write detection                                          671
STAT    redo size                                                                9,016,740
STAT    redo size for lost write detection                                       3,016,296
STAT    redo subscn max counts                                                      83,334
STAT    rows fetched via callback                                                        1
STAT    session cursor cache count                                                       4
STAT    session cursor cache hits                                                        9
STAT    session logical reads                                                      760,281
STAT    session pga memory                                                         -65,536
STAT    session uga memory                                                          65,488
STAT    session uga memory max                                                     148,312
STAT    shared hash latch upgrades - no wait                                         8,218
STAT    table fetch by rowid                                                             1
STAT    table scan blocks gotten                                                    83,334
STAT    table scan disk non-IMC rows gotten                                        500,000
STAT    table scan rows gotten                                                     500,000
STAT    table scans (short tables)                                                       1
STAT    temp space allocated (bytes)                                             2,097,152
STAT    user I/O wait time                                                           1,141
STAT    user calls                                                                      22
STAT    workarea executions - optimal                                                    1
STAT    workarea memory allocated                                                      -43


------------------------------------------------------------------------------------------
3. About
------------------------------------------------------------------------------------------
- MyStats v2.01 by Adrian Billington (http://www.oracle-developer.net)
- Based on the SNAP_MY_STATS utility by Jonathan Lewis

==========================================================================================
End of report
==========================================================================================

I am using Adrian Billington's mystats script here, which I can only recommend to you: it's very decent (and that's an understatement). In the example above it took two snapshots and calculated the change in the session counters in v$mystat during the query execution. Have a look at the cleanout% statistics here! These are the block cleanouts. As cleanouts generate redo, you can see that recorded here too. This makes for a tricky interview question: can a select generate redo? It sure can! There is also a fair amount of physical I/O going on; after all, the buffer cache was empty.
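
To put a rough number on the redo overhead: subtracting the lost-write-detection portion and dividing by the number of cleaned-out blocks suggests roughly 72 bytes of cleanout redo per block. The figures are from the report above; attributing the whole remainder to cleanouts is my simplification.

```python
redo_size  = 9_016_740   # redo size
lost_write = 3_016_296   # redo size for lost write detection
blocks     = 83_334      # cleanout - number of ktugct calls

per_block = (redo_size - lost_write) / blocks
print(round(per_block))  # ~72 bytes of redo per cleaned-out block
```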

The second execution of the query completes a lot faster due to the absence of physical I/O. Nevertheless, the session has to clean out all these blocks and perform consistent read processing again (remember, the transaction is still uncommitted, as seen in active txn%):

SQL> select count(*) from t1_100k;

  COUNT(*)
----------
    500000

Elapsed: 00:00:01.76

SQL> @mystats stop t=1

==========================================================================================
MyStats report : 22-FEB-2015 08:42:59
==========================================================================================


------------------------------------------------------------------------------------------
1. Summary Timings
------------------------------------------------------------------------------------------

Type    Statistic Name                                                               Value
------  ----------------------------------------------------------------  ----------------
TIMER   snapshot interval (seconds)                                                   8.55
TIMER   CPU time used (seconds)                                                       1.75


------------------------------------------------------------------------------------------
2. Statistics Report
------------------------------------------------------------------------------------------

Type    Statistic Name                                                               Value
------  ----------------------------------------------------------------  ----------------
STAT    CPU used by this session                                                       181
STAT    CPU used when call started                                                     181
STAT    CR blocks created                                                           83,334
STAT    DB time                                                                        183
STAT    Requests to/from client                                                         16
STAT    SQL*Net roundtrips to/from client                                               16
STAT    active txn count during cleanout                                            83,334
STAT    buffer is not pinned count                                                       2
STAT    bytes received via SQL*Net from client                                      13,280
STAT    bytes sent via SQL*Net to client                                             5,177
STAT    calls to get snapshot scn: kcmgss                                              757
STAT    calls to kcmgas                                                             83,331
STAT    calls to kcmgcs                                                                 30
STAT    change write time                                                               28
STAT    cleanout - number of ktugct calls                                           83,334
STAT    cleanouts and rollbacks - consistent read gets                              83,334
STAT    consistent changes                                                         591,708
STAT    consistent gets                                                            757,980
STAT    consistent gets examination                                                674,254
STAT    consistent gets examination (fastpath)                                      83,337
STAT    consistent gets from cache                                                 757,980
STAT    consistent gets pin                                                         83,726
STAT    consistent gets pin (fastpath)                                              83,726
STAT    cursor authentications                                                           1
STAT    data blocks consistent reads - undo records applied                        590,917
STAT    db block changes                                                            84,125
STAT    db block gets                                                                2,895
STAT    db block gets from cache                                                     2,895
STAT    execute count                                                                   13
STAT    free buffer requested                                                       83,382
STAT    global enqueue gets sync                                                        38
STAT    global enqueue releases                                                         36
STAT    heap block compress                                                          4,549
STAT    immediate (CR) block cleanout applications                                  83,334
STAT    index fetch by key                                                               1
STAT    lob writes                                                                     375
STAT    lob writes unaligned                                                           375
STAT    logical read bytes from cache                                        6,233,088,000
STAT    messages sent                                                                    7
STAT    non-idle wait count                                                             16
STAT    opened cursors cumulative                                                       13
STAT    parse count (total)                                                             13
STAT    recursive calls                                                              1,982
STAT    recursive cpu usage                                                              6
STAT    redo entries                                                                83,334
STAT    redo size                                                                6,000,400
STAT    rows fetched via callback                                                        1
STAT    session cursor cache hits                                                       12
STAT    session logical reads                                                      760,875
STAT    session pga memory                                                      -1,769,472
STAT    session uga memory                                                      -1,858,472
STAT    shared hash latch upgrades - no wait                                           524
STAT    table fetch by rowid                                                             1
STAT    table scan blocks gotten                                                    83,334
STAT    table scan disk non-IMC rows gotten                                        500,000
STAT    table scan rows gotten                                                     500,000
STAT    table scans (short tables)                                                       1
STAT    user calls                                                                      22
STAT    workarea executions - optimal                                                    1
STAT    workarea memory allocated                                                      -83


------------------------------------------------------------------------------------------
3. About
------------------------------------------------------------------------------------------
- MyStats v2.01 by Adrian Billington (http://www.oracle-developer.net)
- Based on the SNAP_MY_STATS utility by Jonathan Lewis

==========================================================================================
End of report
==========================================================================================

I should have mentioned that these two examples were executed on an Exadata X2-2 quarter rack. To prevent Exadata from applying some clever optimisations I specifically disabled direct path reads and turned off the Exadata offload features. Here is the view from the buffer cache again:

SQL> select count(*), inst_id, status from gv$bh where objd =
  2   (select data_object_id from dba_objects where object_name = 'T1_100K')
  3  group by inst_id, status;

  COUNT(*)    INST_ID STATUS
---------- ---------- ----------
    351704          1 free
         1          1 scur
     83334          1 xcur
     83334          1 cr

4 rows selected.
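
Forcing buffered reads as mentioned above boils down to a couple of session settings. This is a lab-only sketch: _serial_direct_read is an undocumented parameter, so don't touch it outside a sandbox without checking with Oracle Support first.

SQL> alter session set "_serial_direct_read" = never;

Session altered.

SQL> alter session set cell_offload_processing = false;

Session altered.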

Each subsequent execution of the select statement has to perform consistent read processing, and the number of CR blocks in the buffer cache increases accordingly. For the third execution I can see the following:

SQL> r
  1  select count(*), inst_id, status from gv$bh where objd =
  2   (select data_object_id from dba_objects where object_name = 'T1_100K')
  3* group by inst_id, status

  COUNT(*)    INST_ID STATUS
---------- ---------- ----------
    267526          1 free
         1          1 scur
     83334          1 xcur
    166668          1 cr

4 rows selected.

Things improve when the transaction commits.

The impact of a commit

What about a commit now? After committing in session 1 and running the query again in session 2, the response time is very good. Please remember that I forced buffered reads; the next part deals with the same scenario using direct path reads instead (those are not buffered). The segment size would otherwise have qualified for direct path reads.

SQL> select /* test004 */ count(*) from t1_100k;

  COUNT(*)
----------
    500000

Elapsed: 00:00:00.18

==========================================================================================
MyStats report : 22-FEB-2015 09:40:08
==========================================================================================


------------------------------------------------------------------------------------------
1. Summary Timings
------------------------------------------------------------------------------------------

Type    Statistic Name                                                               Value
------  ----------------------------------------------------------------  ----------------
TIMER   snapshot interval (seconds)                                                  19.73
TIMER   CPU time used (seconds)                                                       0.18


------------------------------------------------------------------------------------------
2. Statistics Report
------------------------------------------------------------------------------------------

Type    Statistic Name                                                               Value
------  ----------------------------------------------------------------  ----------------
STAT    CPU used by this session                                                        25
STAT    CPU used when call started                                                      25
STAT    DB time                                                                         25
STAT    Heatmap SegLevel - Full Table Scan                                               1
STAT    Requests to/from client                                                         16
STAT    SQL*Net roundtrips to/from client                                               16
STAT    buffer is not pinned count                                                       2
STAT    bytes received via SQL*Net from client                                      13,280
STAT    bytes sent via SQL*Net to client                                             5,107
STAT    calls to get snapshot scn: kcmgss                                              757
STAT    calls to kcmgcs                                                                 30
STAT    consistent changes                                                             800
STAT    consistent gets                                                             83,729
STAT    consistent gets examination                                                      3
STAT    consistent gets examination (fastpath)                                           3
STAT    consistent gets from cache                                                  83,729
STAT    consistent gets pin                                                         83,726
STAT    consistent gets pin (fastpath)                                              83,726
STAT    db block changes                                                               800
STAT    db block gets                                                                2,301
STAT    db block gets from cache                                                     2,301
STAT    enqueue releases                                                                 3
STAT    enqueue requests                                                                 3
STAT    execute count                                                                   13
STAT    free buffer requested                                                           53
STAT    global enqueue gets sync                                                        39
STAT    global enqueue releases                                                         37
STAT    index fetch by key                                                               1
STAT    lob writes                                                                     375
STAT    lob writes unaligned                                                           375
STAT    logical read bytes from cache                                          704,757,760
STAT    no work - consistent read gets                                              83,334
STAT    non-idle wait count                                                             16
STAT    opened cursors cumulative                                                       13
STAT    parse count (total)                                                             13
STAT    recursive calls                                                              1,984
STAT    recursive cpu usage                                                              7
STAT    rows fetched via callback                                                        1
STAT    session cursor cache count                                                       4
STAT    session cursor cache hits                                                       10
STAT    session logical reads                                                       86,030
STAT    session pga memory                                                          65,536
STAT    session pga memory max                                                      65,536
STAT    session uga memory                                                          65,488
STAT    session uga memory max                                                     148,312
STAT    shared hash latch upgrades - no wait                                           525
STAT    table fetch by rowid                                                             1
STAT    table scan blocks gotten                                                    83,334
STAT    table scan disk non-IMC rows gotten                                        500,000
STAT    table scan rows gotten                                                     500,000
STAT    table scans (short tables)                                                       1
STAT    temp space allocated (bytes)                                             2,097,152
STAT    user calls                                                                      22
STAT    workarea executions - optimal                                                    1
STAT    workarea memory allocated                                                      -40


------------------------------------------------------------------------------------------
3. About
------------------------------------------------------------------------------------------
- MyStats v2.01 by Adrian Billington (http://www.oracle-developer.net)
- Based on the SNAP_MY_STATS utility by Jonathan Lewis

==========================================================================================
End of report
==========================================================================================

Access to blocks in the buffer cache is quick! There was no longer any need to clean out blocks; the fastest way to do something is indeed not having to do it at all. Mind you, the block on disk still has the active transaction recorded!

SQL> alter system dump datafile 5 block 3940811;

System altered.

Elapsed: 00:00:00.01
SQL> select value from v$diag_info where name like 'Default%';

VALUE
------------------------------------------------------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/dbm01/dbm011/trace/dbm011_ora_100297.trc

SQL> exit

$ vi /u01/app/oracle/diag/rdbms/dbm01/dbm011/trace/dbm011_ora_100297.trc

Block header dump:  0x017c21cb
 Object id on Block? Y
 seg/obj: 0x12a13  csc: 0x00.2317620  itc: 3  flg: E  typ: 1 - DATA
     brn: 0  bdba: 0x17c21c8 ver: 0x01 opc: 0
     inc: 0  exflg: 0

 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0xffff.000.00000000  0x00000000.0000.00  C---    0  scn 0x0000.02191d10
0x02   0x0009.005.00002575  0x0003688a.0ef1.1f  --U-    6  fsc 0x0000.0232cfca
0x03   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
bdba: 0x017c21cb
data_block_dump,data header at 0x7fc9f3cb407c
...

$ grep lb -B1 /u01/app/oracle/diag/rdbms/dbm01/dbm011/trace/dbm011_ora_100297.trc
tab 0, row 0, @0x1b75
tl: 1035 fb: --H-FL-- lb: 0x2  cc: 6
--
tab 0, row 1, @0x176a
tl: 1035 fb: --H-FL-- lb: 0x2  cc: 6
--
tab 0, row 2, @0x135f
tl: 1035 fb: --H-FL-- lb: 0x2  cc: 6
--
tab 0, row 3, @0xf54
tl: 1035 fb: --H-FL-- lb: 0x2  cc: 6
--
tab 0, row 4, @0xb49
tl: 1035 fb: --H-FL-- lb: 0x2  cc: 6
--
tab 0, row 5, @0x73e
tl: 1035 fb: --H-FL-- lb: 0x2  cc: 6
--

The transaction recorded in the ITL slot 2 is gone though:

SQL> select xidusn, xidslot, xidsqn, status, start_scn from v$transaction
  2  where xidusn = 9 and xidslot = 5 and xidsqn = 9589;

no rows selected

Did you notice the flag in the second ITL entry? That wasn’t there before. According to Oracle Core by Jonathan Lewis it indicates a fast commit.

Summary

Block cleanouts are simple with buffered I/O: dirty blocks can be rolled back and kept in the buffer cache, where they can be accessed without having to undergo block cleanout again.

Posted in Exadata, Linux | 1 Comment »

What happens in ASM if usable_file_mb is negative and you lose a failgroup

Posted by Martin Bach on February 16, 2015

Having read the excellent post “Demystifying ASM REQUIRED_MIRROR_FREE_MB and USABLE_FILE_MB” by Harald von Breederode again, I wanted to see what happens if you create a setup where usable_file_mb is negative and then actually have to rebalance after a fatal failgroup error. In case anyone is interested, I am using 12.1.0.2.0 on Oracle Linux 6.6/UEK3 in a KVM virtual machine, where I/O times aren’t stellar. It’s Oracle Restart, not clustered ASM.

Note: this post is only applicable if you are using ASM for data protection, i.e. normal or high redundancy. External redundancy is a different thing: if you lose a failgroup in a disk group defined with external redundancy, you are toast. The disk group will dismount on all ASM instances. No disk group = nothing to write to = crash of dependent databases.

Setup

Here is my setup. I created the VM with two volume groups, one for the operating system and another one for Oracle. After the installation of the database and Grid Infrastructure I have a disk group named DATA consisting of 4 disks, each in its own failgroup (which is both the default and intentional):

ASMCMD> lsdsk -k
Total_MB  Free_MB  OS_MB  Name       Failgroup  Failgroup_Type  Library  Label  UDID  Product  Redund   Path
    5119     4131   5119  DATA_0000  DATA_0000  REGULAR         System                         UNKNOWN  /dev/vdc1
    5119     4127   5119  DATA_0001  DATA_0001  REGULAR         System                         UNKNOWN  /dev/vdd1
    5119     4128   5119  DATA_0002  DATA_0002  REGULAR         System                         UNKNOWN  /dev/vde1
    5119     4139   5119  DATA_0003  DATA_0003  REGULAR         System                         UNKNOWN  /dev/vdf1
ASMCMD> lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N         512   4096  1048576     20476    16525             5119            5703              0             N  DATA/
ASMCMD>

My database uses the datafiles as shown here:

SQL> select name, bytes from v$datafile;

NAME                                                              BYTES
------------------------------------------------------------ ----------
+DATA/NCDB/DATAFILE/system.258.871538097                      817889280
+DATA/NCDB/DATAFILE/sysaux.257.871538033                      629145600
+DATA/NCDB/DATAFILE/undotbs1.260.871538187                     62914560
+DATA/NCDB/DATAFILE/users.259.871538183                         5242880

ASMCMD must take the file information from v$asm_file; the “space” column seems to take the normal redundancy into account:


SQL> select block_size, blocks, bytes, space, redundancy from v$asm_file where file_number = 259 and incarnation = 871538183;

BLOCK_SIZE     BLOCKS      BYTES      SPACE REDUND
---------- ---------- ---------- ---------- ------
      8192        641    5251072   12582912 MIRROR

As others have explained already, the “space” column also takes into account ASM metadata, which is of course mirrored too, so those numbers make sense to me. If you look back up at the asmcmd output you will notice that usable_file_mb is 5703 MB.
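
As a sanity check, the usable_file_mb figure can be derived from the lsdg output above. Assuming normal redundancy (everything stored twice), the calculation is roughly:

usable_file_mb = (free_mb - req_mir_free_mb) / 2
               = (16525 - 5119) / 2
               = 5703 MB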

This is where it goes wrong

Let’s cause trouble for the sake of this blog post and deliberately trigger a failure. The following is STRONGLY discouraged in your environment! I have put it here so you can see the effect of “negative free space” without having to endure it yourself.

I am deliberately oversubscribing space in the DATA disk group, which is definitely Not A Good Thing(tm). Please monitor usable_file_mb for each disk group carefully and add storage when needed (you might be unpleasantly surprised how long it can take to add storage, so request it early). You will see shortly why it should never be negative:

SQL> create tablespace iamtoolargetofit datafile size 6g;

Tablespace created.

[oracle@server1 ~]$ asmcmd lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N         512   4096  1048576     20476     4212             5119            -453              0             N  DATA/

The new tablespace is bigger than usable_file_mb and has pushed that number into the negative. If I were now to lose a failgroup (the term “disk” is a bit misleading here) I’d be in Big Trouble. Since this is my controlled lab environment and I wanted to see what happens, I carried on.

On the host I used virsh detach-disk to remove one of my ASM disks at random. Note that, as shown further up, the repair timer was 0 for all disks. In other words, there is no grace period to allow for transient failures; the disk will be dropped straight away.
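
That grace period is governed by the disk_repair_time disk group attribute. A quick way to check it, sketched here for my group number 1 (the value will obviously differ per environment):

SQL> select name, value from v$asm_attribute
  2  where group_number = 1 and name = 'disk_repair_time';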

# virsh detach-disk --domain server_12c_asm --target /var/lib/.../server1_asm_data02.qcow2 --current
Disk detached successfully

This removed disk /dev/vdd from the guest, and it has not gone unnoticed. From the ASM alert.log:

NOTE: Standard client NCDB:NCDB:ASM registered, osid 7732, mbr 0x1, asmb 7728 (reg:3524888596)
2015-02-15 06:01:56.858000 -05:00
NOTE: client NCDB:NCDB:ASM mounted group 1 (DATA)
2015-02-15 08:23:29.188000 -05:00
NOTE: process _user9772_+asm (9772) initiating offline of disk 1.3916023153 (DATA_0001) with mask 0x7e in group 1 (DATA) with client assisting
NOTE: checking PST: grp = 1
GMON checking disk modes for group 1 at 10 for pid 23, osid 9772
NOTE: checking PST for grp 1 done.
NOTE: initiating PST update: grp 1 (DATA), dsk = 1/0xe969c571, mask = 0x6a, op = clear
GMON updating disk modes for group 1 at 11 for pid 23, osid 9772
WARNING: Write Failed. group:1 disk:1 AU:1 offset:1044480 size:4096
path:/dev/vdd1
 incarnation:0xe969c571 synchronous result:'I/O error'
 subsys:System krq:0x7fb991c5e9e8 bufp:0x7fb991c5f000 osderr1:0x0 osderr2:0x0
 IO elapsed time: 0 usec Time waited on I/O: 0 usec
WARNING: found another non-responsive disk 1.3916023153 (DATA_0001) that will be offlined
NOTE: group DATA: updated PST location: disk 0000 (PST copy 0)
NOTE: group DATA: updated PST location: disk 0002 (PST copy 1)
NOTE: group DATA: updated PST location: disk 0003 (PST copy 2)
NOTE: PST update grp = 1 completed successfully

...

NOTE: process _b000_+asm (11355) initiating offline of disk 1.3916023153 (DATA_0001) with mask 0x7e in group 1 (DATA) without client assisting
NOTE: sending set offline flag message (1976690176) to 1 disk(s) in group 1
Errors in file /u01/app/oracle/diag/asm/+asm/+ASM/trace/+ASM_gmon_5836.trc:
ORA-27072: File I/O error
Linux-x86_64 Error: 5: Input/output error
Additional information: 4
Additional information: 4088
Additional information: 4294967295
Errors in file /u01/app/oracle/diag/asm/+asm/+ASM/trace/+ASM_gmon_5836.trc:
ORA-27072: File I/O error
Linux-x86_64 Error: 5: Input/output error
Additional information: 4
Additional information: 4088
Additional information: 4294967295
Errors in file /u01/app/oracle/diag/asm/+asm/+ASM/trace/+ASM_gmon_5836.trc:
ORA-27072: File I/O error
Linux-x86_64 Error: 5: Input/output error
Additional information: 4
Additional information: 4088
Additional information: 4294967295
Errors in file /u01/app/oracle/diag/asm/+asm/+ASM/trace/+ASM_gmon_5836.trc:

...


WARNING: Disk 1 (DATA_0001) in group 1 mode 0x15 is now being offlined
NOTE: initiating PST update: grp 1 (DATA), dsk = 1/0xe969c571, mask = 0x6a, op = clear
GMON updating disk modes for group 1 at 12 for pid 25, osid 11355

...

SUCCESS: PST-initiated drop disk in group 1(3645453725))
2015-02-15 08:24:04.910000 -05:00
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: starting rebalance of group 1/0xd949359d (DATA) at power 1
Starting background process ARB0
ARB0 started with pid=26, OS id=11364
NOTE: assigning ARB0 to group 1/0xd949359d (DATA) with 1 parallel I/O
NOTE: F1X0 on disk 3 (fmt 2) relocated at fcn 0.26087: AU 0 -> AU 4059
NOTE: 02/15/15 08:24:04 DATA.F1X0 copy 2 relocating from 1:10 to 2:10 at FCN 0.26087
NOTE: 02/15/15 08:24:04 DATA.F1X0 copy 3 relocating from 2:10 to 3:4059 at FCN 0.26087
NOTE: F1B1 fcn on disk 0 synced at fcn 0.26087
NOTE: F1B1 fcn on disk 2 synced at fcn 0.26087
2015-02-15 08:24:28.853000 -05:00
NOTE: restored redundancy of control and online logs in DATA
2015-02-15 08:24:33.219000 -05:00

...

And that went on for a little while. As with any rebalance you can see the progress in v$asm_operation:

SQL> r
  1* select * from v$asm_operation

GROUP_NUMBER OPERA PASS      STAT      POWER     ACTUAL      SOFAR   EST_WORK   EST_RATE EST_MINUTES ERROR_CODE                                       CON_ID
------------ ----- --------- ---- ---------- ---------- ---------- ---------- ---------- ----------- -------------------------------------------- ----------
           1 REBAL RESYNC    DONE         11         11          0          0          0           0                                                       0
           1 REBAL REBALANCE RUN          11         11        559       4529      12417           0                                                       0
           1 REBAL COMPACT   WAIT         11         11          0          0          0           0                                                       0

You can see detailed information about the disk dropping in asmcmd too:

ASMCMD> lsdsk -kpt
Total_MB  Free_MB  OS_MB  Name                Failgroup  Failgroup_Type  Library  Label  UDID  Product  Redund   Group_Num  Disk_Num      Incarn  Mount_Stat  Header_Stat  Mode_Stat  State    Create_Date  Mount_Date  Repair_Timer  Path
    5119     3237      0  _DROPPED_0001_DATA  DATA_0001  REGULAR         System                         UNKNOWN          1         1  3916023153  MISSING     UNKNOWN      OFFLINE    FORCING  15-FEB-15    15-FEB-15   0
    5119      323   5119  DATA_0000           DATA_0000  REGULAR         System                         UNKNOWN          1         0  3916023154  CACHED      MEMBER       ONLINE     NORMAL   15-FEB-15    15-FEB-15   0             /dev/vdc1
    5119      324   5119  DATA_0002           DATA_0002  REGULAR         System                         UNKNOWN          1         2  3916023152  CACHED      MEMBER       ONLINE     NORMAL   15-FEB-15    15-FEB-15   0             /dev/vde1
    5119      328   5119  DATA_0003           DATA_0003  REGULAR         System                         UNKNOWN          1         3  3916023156  CACHED      MEMBER       ONLINE     NORMAL   15-FEB-15    15-FEB-15   0             /dev/vdf1
ASMCMD>

Missing, forcing, offline: none of those sound too good… What was worse is this:

ASMCMD> lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  Y         512   4096  1048576     15357        0             5119           -2559              1             N  DATA/

And then the inevitable happened:

2015-02-15 08:31:55.491000 -05:00
ERROR: ORA-15041 thrown in ARB0 for group number 1
Errors in file /u01/app/oracle/diag/asm/+asm/+ASM/trace/+ASM_arb0_11547.trc:
ORA-15041: diskgroup "DATA" space exhausted
WARNING: Resync encountered ORA-15041; continuing rebalance
NOTE: stopping process ARB0
NOTE: requesting all-instance membership refresh for group=1
WARNING: rebalance not completed for group 1/0xd949359d (DATA)
GMON updating for reconfiguration, group 1 at 20 for pid 25, osid 11773
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATA

ORA-15041 is the dreaded “space exhausted” error. This was to be expected, but I had to see it with my own eyes.

The ASM rebalance is not complete:

SQL> r
  1* select * from v$asm_operation

GROUP_NUMBER OPERA PASS      STAT      POWER     ACTUAL      SOFAR   EST_WORK   EST_RATE EST_MINUTES ERROR_CODE                                       CON_ID
------------ ----- --------- ---- ---------- ---------- ---------- ---------- ---------- ----------- -------------------------------------------- ----------
           1 REBAL RESYNC    DONE         11                                                                                                               0
           1 REBAL REBALANCE ERRS         11                                                         ORA-15041                                             0
           1 REBAL COMPACT   WAIT         11                                                                                                               0

Interestingly the database remained up, but I couldn’t do much with it, as these examples demonstrate:

SQL> alter tablespace IAMTOOLARGETOFIT add datafile size 100m;
alter tablespace IAMTOOLARGETOFIT add datafile size 100m
*
ERROR at line 1:
ORA-01119: error in creating database file '+DATA'
ORA-17502: ksfdcre:4 Failed to create file +DATA
ORA-15041: diskgroup "DATA" space exhausted



SQL> create table t as select * from dba_objects;
create table t as select * from dba_objects
*
ERROR at line 1:
ORA-01652: unable to extend temp segment by 128 in tablespace USERS

Errors in file /u01/app/oracle/diag/rdbms/ncdb/NCDB/trace/NCDB_arc3_12194.trc:
ORA-19504: failed to create file "+DATA"
ORA-17502: ksfdcre:4 Failed to create file +DATA
ORA-15041: diskgroup "DATA" space exhausted
ARC3: Error 19504 Creating archive log file to '+DATA'
2015-02-15 08:47:31.410000 -05:00
Unable to create archive log file '+DATA'

You are now in a bit of a situation… So be careful when usable_file_mb turns negative: with normal redundancy the loss of a single failgroup does not cause the disk group to be dismounted, but you still end up in a very undesirable place.
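
A periodic check against v$asm_diskgroup, for example from your monitoring scripts, catches this condition early. A sketch:

SQL> select name, type, total_mb, free_mb,
  2         required_mirror_free_mb, usable_file_mb
  3    from v$asm_diskgroup
  4   where usable_file_mb < 0;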

Posted in 12c Release 1, Linux | Leave a Comment »

How to resolve the text behind v$views?

Posted by Martin Bach on January 21, 2015

This is a common problem I have and I never write it down (except now). For example, today I wanted to know what the valid parameters for _serial_direct_read were:

SQL> select * from v$parameter_valid_values where name ='_serial_direct_read';

no rows selected

OK, so if Oracle doesn’t tell me, maybe I can work it out myself? Getting the view text has worked in the past:

SQL> select view_name, text_vc from dba_views where view_name like '%PARAMETER_VALID_VALUES';

VIEW_NAME                          TEXT_VC
---------------------------------- ----------------------------------------------------------------------------------------------------
V_$PARAMETER_VALID_VALUES          select "NUM","NAME","ORDINAL","VALUE","ISDEFAULT","CON_ID"
                                   from v$parameter_valid_values
GV_$PARAMETER_VALID_VALUES         select "INST_ID","NUM","NAME","ORDINAL","VALUE","ISDEFAULT","CON_ID"
                                   from gv$parameter_valid_values

I’m sure I did the next step wrong, but I couldn’t find what the lower-case object referenced in the view text was.

SQL> desc "v$parameter_valid_values"
ERROR:
ORA-04043: object "v$parameter_valid_values" does not exist

SQL> sho user
USER is "SYS"
SQL> desc "gv$parameter_valid_values"
ERROR:
ORA-04043: object "gv$parameter_valid_values" does not exist

SQL> select * from dba_views where view_name = '"gv$parameter_valid_values"';

no rows selected

SQL> select * from dba_objects where object_name = '"gv$parameter_valid_values"';

no rows selected

Yes, I’m pretty sure I got something wrong along the way.

Solutions

One possibility is to use dbms_xplan.display_cursor() – easy!

SQL> select * from v$parameter_valid_values where name = '_serial_direct_read';

no rows selected

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
SQL_ID  9hkygnf02nd8y, child number 0
-------------------------------------
select * from v$parameter_valid_values where name =
'_serial_direct_read'

Plan hash value: 1012408093

-------------------------------------------------------------------------
| Id  | Operation        | Name            | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                 |       |       |     1 (100)|
|*  1 |  FIXED TABLE FULL| X$KSPVLD_VALUES |     1 |    49 |     0   (0)|
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(("NAME_KSPVLD_VALUES"='_serial_direct_read' AND
              TRANSLATE("NAME_KSPVLD_VALUES",'_','#') NOT LIKE '#%' AND
              "INST_ID"=USERENV('INSTANCE')))

Now if I translate this I can write a query that shows me what I need. It also demonstrates that, just like v$parameter, this view doesn’t show underscore parameters.

SQL> @desc X$KSPVLD_VALUES
           Name                            Null?    Type
           ------------------------------- -------- ----------------------------
    1      ADDR                                     RAW(8)
    2      INDX                                     NUMBER
    3      INST_ID                                  NUMBER
    4      CON_ID                                   NUMBER
    5      PARNO_KSPVLD_VALUES                      NUMBER
    6      NAME_KSPVLD_VALUES                       VARCHAR2(64)
    7      ORDINAL_KSPVLD_VALUES                    NUMBER
    8      VALUE_KSPVLD_VALUES                      VARCHAR2(255)
    9      ISDEFAULT_KSPVLD_VALUES                  VARCHAR2(64)

SQL> select PARNO_KSPVLD_VALUES, NAME_KSPVLD_VALUES, ORDINAL_KSPVLD_VALUES, VALUE_KSPVLD_VALUES
  2  from X$KSPVLD_VALUES where NAME_KSPVLD_VALUES ='_serial_direct_read';

PARNO_KSPVLD_VALUES NAME_KSPVLD_VALUES             ORDINAL_KSPVLD_VALUES VALUE_KSPVLD_VALUES
------------------- ------------------------------ --------------------- ------------------------------
               2873 _serial_direct_read                                1 ALWAYS
               2873 _serial_direct_read                                2 AUTO
               2873 _serial_direct_read                                3 NEVER
               2873 _serial_direct_read                                4 TRUE
               2873 _serial_direct_read                                5 FALSE

There you go!

Another way is to use the 12c functionality in DBMS_UTILITY.EXPAND_SQL_TEXT. Reusing the example by Tom Kyte:

SQL> var x clob

SQL> exec dbms_utility.expand_sql_text( -
  2   input_sql_text => 'select * from V$PARAMETER_VALID_VALUES', -
  3   output_sql_text => :x)

SQL> print :x

X
--------------------------------------------------------------------------------------------------------
SELECT "A1"."NUM" "NUM","A1"."NAME" "NAME","A1"."ORDINAL" "ORDINAL","A1"."VALUE" "VALUE",
"A1"."ISDEFAULT" "ISDEFAULT","A1"."CON_ID" "CON_ID" FROM  (SELECT "A2"."NUM" "NUM","A2"."NAME"
"NAME","A2"."ORDINAL" "ORDINAL","A2"."VALUE" "VALUE","A2"."ISDEFAULT" "ISDEFAULT","A2"."CON_ID" "CON_ID"
FROM  (SELECT "A3"."INST_ID" "INST_ID","A3"."PARNO_KSPVLD_VALUES" "NUM","A3"."NAME_KSPVLD_VALUES"
"NAME","A3"."ORDINAL_KSPVLD_VALUES" "ORDINAL","A3"."VALUE_KSPVLD_VALUES"
"VALUE","A3"."ISDEFAULT_KSPVLD_VALUES" "ISDEFAULT","A3"."CON_ID" "CON_ID" FROM SYS."X$KSPVLD_VALUES" "A3"
WHERE TRANSLATE("A3"."NAME_KSPVLD_VALUES",'_','#') NOT LIKE '#%') "A2" WHERE
"A2"."INST_ID"=USERENV('INSTANCE')) "A1"

This seems to have worked in earlier versions too; one example is on Jonathan Lewis’s blog.

Update: the most obvious solution was to use v$fixed_view_definition! The view must have dropped off the cold end of my brain’s LRU list. As others have pointed out (thanks everyone for your comments!), this is the way to query the object:

SQL> select VIEW_DEFINITION from V$FIXED_VIEW_DEFINITION where view_name like 'GV$PARAMETER_VALID_VALUES';

VIEW_DEFINITION
-------------------------------------------------------------------------------------------------------
SELECT INST_ID, PARNO_KSPVLD_VALUES, NAME_KSPVLD_VALUES, ORDINAL_KSPVLD_VALUES, VALUE_KSPVLD_VALUES,
 ISDEFAULT_KSPVLD_VALUES, CON_ID
 FROM X$KSPVLD_VALUES WHERE TRANSLATE(NAME_KSPVLD_VALUES,'_','#') NOT LIKE '#%'

Summary

It’s probably not what Oracle intended but DBMS_UTILITY.EXPAND_SQL_TEXT() worked really well. I came across the DBMS_XPLAN.DISPLAY_CURSOR() output by chance when I ran my diagnostic script at the wrong time but it, too, does the job.

Or I could have used Tanel Poder’s pvalid script, which I didn’t know about until now:


SQL> @pvalid _serial_direct_read
Display valid values for multioption parameters matching "_serial_direct_read"...

  PAR# PARAMETER                                                 ORD VALUE                          DEFAULT
------ -------------------------------------------------- ---------- ------------------------------ -------
  2873 _serial_direct_read                                         1 ALWAYS
       _serial_direct_read                                         2 AUTO
       _serial_direct_read                                         3 NEVER
       _serial_direct_read                                         4 TRUE
       _serial_direct_read                                         5 FALSE

Posted in 12c Release 1 | 5 Comments »

Adaptive plans and v$sql_plan and related views

Posted by Martin Bach on January 13, 2015

Adaptive plans are one of the coolest new optimiser features in Oracle 12c. If you haven't seen or heard about them in detail yet, it is well worth reading up on them first.

There is a caveat with this though: if your tuning script relies on pulling information from v$sql_plan and related views, you get more information than you might want. I found out about this while working on our 12c New Features training class. This, and a lot more, will be part of it. Stay tuned :)

Consider the following example. I will use the following query in this article:

SELECT
  /* statement002 */
  /*+ gather_plan_statistics monitor */
  TRUNC(order_date, 'mm'),
  COUNT(TRUNC(order_date, 'mm'))
FROM orders o,
  order_items oi
WHERE oi.ORDER_ID    = o.order_id
AND o.CUSTOMER_CLASS = 'Prime'
AND o.WAREHOUSE_ID   = 501
AND o.ORDER_DATE BETWEEN DATE '2012-01-01' AND DATE '2012-07-01'
GROUP BY TRUNC(order_date, 'mm')
ORDER BY 1;

These tables are part of the Swingbench SOE (order entry) schema. I have inflated the order_items table to twice its size for a total of 171,579,632 rows.

When executing this query on the x4-2 half rack in the lab I get this (sorry for the wide output!):

TRUNC(ORDER_DATE, COUNT(TRUNC(ORDER_DATE,'MM'))
----------------- -----------------------------
20120101 00:00:00                           472
20120201 00:00:00                           580
20120301 00:00:00                           614
20120401 00:00:00                           578

SQL> @x
Display execution plan for last statement for this session from library cache...

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  3wb68w2kj6gdd, child number 0
-------------------------------------
SELECT   /* statement002 */   /*+ gather_plan_statistics monitor */
TRUNC(order_date, 'mm'),   COUNT(TRUNC(order_date, 'mm')) FROM orders
o,   order_items oi WHERE oi.ORDER_ID    = o.order_id AND
o.CUSTOMER_CLASS = 'Prime' AND o.WAREHOUSE_ID   = 501 AND o.ORDER_DATE
BETWEEN DATE '2012-01-01' AND DATE '2012-07-01' GROUP BY
TRUNC(order_date, 'mm') ORDER BY 1

Plan hash value: 812470616

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                      | Name             | Starts | E-Rows |E-Bytes| Cost (%CPU)| Pstart| Pstop | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                               |                  |      1 |        |       | 29766 (100)|       |       |      4 |00:00:00.06 |   29354 |     17 |       |       |          |
|   1 |  SORT ORDER BY                                 |                  |      1 |    404 | 14948 | 29766   (1)|       |       |      4 |00:00:00.06 |   29354 |     17 |  2048 |  2048 | 2048  (0)|
|   2 |   HASH GROUP BY                                |                  |      1 |    404 | 14948 | 29766   (1)|       |       |      4 |00:00:00.06 |   29354 |     17 |  1394K|  1394K|  634K (0)|
|*  3 |    FILTER                                      |                  |      1 |        |       |            |       |       |   2244 |00:00:00.05 |   29354 |     17 |       |       |          |
|   4 |     NESTED LOOPS                               |                  |      1 |   2996 |   108K| 29764   (1)|       |       |   2244 |00:00:00.05 |   29354 |     17 |       |       |          |
|*  5 |      TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| ORDERS           |      1 |    434 | 13454 | 28462   (1)| ROWID | ROWID |    362 |00:00:00.03 |   28261 |     17 |       |       |          |
|*  6 |       INDEX RANGE SCAN                         | ORD_WAREHOUSE_IX |      1 |  28624 |       |    90   (0)|       |       |  28441 |00:00:00.02 |      90 |      0 |  1025K|  1025K|          |
|*  7 |      INDEX RANGE SCAN                          | ITEM_ORDER_IX    |    362 |      7 |    42 |     3   (0)|       |       |   2244 |00:00:00.01 |    1093 |      0 |  1025K|  1025K|          |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter(TIMESTAMP' 2012-07-01 00:00:00'>=TIMESTAMP' 2012-01-01 00:00:00')
   5 - filter(("O"."CUSTOMER_CLASS"='Prime' AND "O"."ORDER_DATE">=TIMESTAMP' 2012-01-01 00:00:00' AND "O"."ORDER_DATE"<=TIMESTAMP' 2012-07-01 00:00:00'))
   6 - access("O"."WAREHOUSE_ID"=501)
   7 - access("OI"."ORDER_ID"="O"."ORDER_ID")

   - dynamic statistics used: dynamic sampling (level=2)
   - this is an adaptive plan
   - 2 Sql Plan Directives used for this statement

@x is one of the many useful scripts written by Tanel Poder. I recommend you download tpt_public.zip and get familiar with what the scripts do; they are just great.

The Execution Plan

Please take note of the output of the execution plan above. This is an adaptive plan (see the Note section). The join between orders and order_items is performed using a nested loop (step 4 and the following ones). You can also use v$sql to show that this plan is adaptive:

SQL> select sql_id, child_number, executions, is_reoptimizable, is_resolved_adaptive_plan
  2  from v$sql where sql_id = '3wb68w2kj6gdd';

SQL_ID        CHILD_NUMBER EXECUTIONS I I
------------- ------------ ---------- - -
3wb68w2kj6gdd            0          1 Y Y

It is indeed already resolved. Using Jonathan Lewis’s recent notes on reoptimisation I checked what the optimiser worked out:

SQL> select sql_id, child_number, hint_id, hint_text, reparse
  2  from v$sql_reoptimization_hints where sql_id = '3wb68w2kj6gdd' and child_number = 0;

SQL_ID        CHILD_NUMBER    HINT_ID HINT_TEXT                                             REPARSE
------------- ------------ ---------- -------------------------------------------------- ----------
3wb68w2kj6gdd            0          1 OPT_ESTIMATE (@"SEL$1" GROUP_BY ROWS=4.000000 )             1
3wb68w2kj6gdd            0          2 OPT_ESTIMATE (@"SEL$1" JOIN ("OI"@"SEL$1" "O"@"SEL          1
                                      $1") ROWS=1.000000 )

Nice! That also provides some insights that could be useful. You can map the hint_text to v$sql_plan.object_alias; it is the query block name. I have actually executed slight variations of the query in preparation …

Back to my problem

In the past I have used v$sql_plan (indirectly, via scripts written by far more clever people than me) to work out each step in the plan. v$sql_plan has ID, PARENT_ID and DEPTH columns that make it easier to work out where in the plan you are. Using my old approach I got stuck; consider this:

SQL> r
  1  select
  2    lpad(' ',sp.depth*1,' ')
  3    || sp.operation AS operation,
  4    sp.OPTIONS,
  5    sp.object#,
  6    sp.object_name,
  7    sp.object_alias,
  8    sp.object_type
  9  FROM v$sql_plan sp
 10* where sql_id = '3wb68w2kj6gdd' and child_number = 0

OPERATION                                OPTIONS                                OBJECT# OBJECT_NAME               OBJECT_ALI OBJECT_TYP
---------------------------------------- ----------------------------------- ---------- ------------------------- ---------- ----------
SELECT STATEMENT
 SORT                                    ORDER BY
  HASH                                   GROUP BY
   FILTER
    HASH JOIN
     NESTED LOOPS
      STATISTICS COLLECTOR
       TABLE ACCESS                      BY GLOBAL INDEX ROWID BATCHED            28865 ORDERS                    O@SEL$1    TABLE
        INDEX                            RANGE SCAN                               29329 ORD_WAREHOUSE_IX          O@SEL$1    INDEX
      INDEX                              RANGE SCAN                               29302 ITEM_ORDER_IX             OI@SEL$1   INDEX
     INDEX                               FAST FULL SCAN                           29302 ITEM_ORDER_IX             OI@SEL$1   INDEX

11 rows selected.

If you compare this with the DBMS_XPLAN output from above, you will notice that there is a lot more information in the query against v$sql_plan. In fact, that's the same output you get when calling DBMS_XPLAN.DISPLAY_CURSOR() with format => '+ADAPTIVE':

SQL> select * from table(dbms_xplan.display_cursor('3wb68w2kj6gdd',0,'+ADAPTIVE'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  3wb68w2kj6gdd, child number 0
-------------------------------------
SELECT   /* statement002 */   /*+ gather_plan_statistics monitor */
TRUNC(order_date, 'mm'),   COUNT(TRUNC(order_date, 'mm')) FROM orders
o,   order_items oi WHERE oi.ORDER_ID    = o.order_id AND
o.CUSTOMER_CLASS = 'Prime' AND o.WAREHOUSE_ID   = 501 AND o.ORDER_DATE
BETWEEN DATE '2012-01-01' AND DATE '2012-07-01' GROUP BY
TRUNC(order_date, 'mm') ORDER BY 1

Plan hash value: 812470616

---------------------------------------------------------------------------------------------------------------------------------------
|   Id  | Operation                                        | Name             | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------------------------
|     0 | SELECT STATEMENT                                 |                  |       |       | 29766 (100)|          |       |       |
|     1 |  SORT ORDER BY                                   |                  |   404 | 14948 | 29766   (1)| 00:00:02 |       |       |
|     2 |   HASH GROUP BY                                  |                  |   404 | 14948 | 29766   (1)| 00:00:02 |       |       |
|  *  3 |    FILTER                                        |                  |       |       |            |          |       |       |
|- *  4 |     HASH JOIN                                    |                  |  2996 |   108K| 29764   (1)| 00:00:02 |       |       |
|     5 |      NESTED LOOPS                                |                  |  2996 |   108K| 29764   (1)| 00:00:02 |       |       |
|-    6 |       STATISTICS COLLECTOR                       |                  |       |       |            |          |       |       |
|  *  7 |        TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| ORDERS           |   434 | 13454 | 28462   (1)| 00:00:02 | ROWID | ROWID |
|  *  8 |         INDEX RANGE SCAN                         | ORD_WAREHOUSE_IX | 28624 |       |    90   (0)| 00:00:01 |       |       |
|  *  9 |       INDEX RANGE SCAN                           | ITEM_ORDER_IX    |     7 |    42 |     3   (0)| 00:00:01 |       |       |
|-   10 |      INDEX FAST FULL SCAN                        | ITEM_ORDER_IX    |     7 |    42 |     3   (0)| 00:00:01 |       |       |
---------------------------------------------------------------------------------------------------------------------------------------
...
Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - this is an adaptive plan (rows marked '-' are inactive)
   - 2 Sql Plan Directives used for this statement

Note the line "this is an adaptive plan (rows marked '-' are inactive)". But how does DBMS_XPLAN know that these lines are hidden? There doesn't seem to be a view v$sql_plan_hidden_lines. I tried a few things and eventually traced the call to DBMS_XPLAN.DISPLAY_CURSOR. In the trace I found the trick Oracle uses:

=====================
PARSING IN CURSOR #140127730386096 len=298 dep=2 uid=75 oct=3 lid=75 tim=767822822383 hv=580989905 ad='e9e6f7570' sqlid='fg4skgcja2cyj'
SELECT EXTRACTVALUE(VALUE(D), '/row/@op'), EXTRACTVALUE(VALUE(D), '/row/@dis'), EXTRACTVALUE(VALUE(D), '/row/@par'), EXTRACTVALUE(VALUE(D),
'/row/@prt'), EXTRACTVALUE(VALUE(D), '/row/@dep'), EXTRACTVALUE(VALUE(D), '/row/@skp') FROM TABLE(XMLSEQUENCE(EXTRACT(XMLTYPE(:B1 ), 
'/*/display_map/row'))) D
END OF STMT
...
STAT #140127732564416 id=1 cnt=11 pid=0 pos=1 obj=0 op='SORT ORDER BY (cr=0 pr=0 pw=0 time=691 us cost=1 size=8935 card=1)'
STAT #140127732564416 id=2 cnt=11 pid=1 pos=1 obj=0 op='VIEW  (cr=0 pr=0 pw=0 time=1223 us cost=0 size=8935 card=1)'
STAT #140127732564416 id=3 cnt=11 pid=2 pos=1 obj=0 op='NESTED LOOPS  (cr=0 pr=0 pw=0 time=1116 us cost=0 size=957 card=1)'
STAT #140127732564416 id=4 cnt=11 pid=3 pos=1 obj=0 op='FIXED TABLE FIXED INDEX X$KQLFXPL (ind:4) (cr=0 pr=0 pw=0 time=902 us cost=0 size=853 card=1)'
STAT #140127732564416 id=5 cnt=11 pid=3 pos=2 obj=0 op='FIXED TABLE FIXED INDEX X$KGLCURSOR_CHILD (ind:2) (cr=0 pr=0 pw=0 time=134 us cost=0 size=104 card=1)'

So that's it! The X$ views map to v$sql and v$sql_plan unless I am very mistaken. V$SQL_PLAN has only one column that contains XML: other_xml. Using this information I thought there had to be something in there … and indeed, there is:

SQL> select xmltype(other_xml)
  2  from v$sql_plan
  3  where sql_id = '3wb68w2kj6gdd' and child_number = 0
  4  and other_xml is not null;

XMLTYPE(OTHER_XML)
---------------------------------------------------------------------------------------------------------------------------
<other_xml>
  <info type="db_version">12.1.0.2</info>
  <info type="parse_schema"><![CDATA["SOE"]]></info>
  <info type="dynamic_sampling" note="y">2</info>
  <info type="plan_hash_full">166760258</info>
  <info type="plan_hash">812470616</info>
  <info type="plan_hash_2">3729130925</info>
  <info type="adaptive_plan" note="y">yes</info>
  <spd>
    <cv>0</cv>
    <cu>2</cu>
  </spd>
  <outline_data>
    <hint><![CDATA[IGNORE_OPTIM_EMBEDDED_HINTS]]></hint>
    <hint><![CDATA[OPTIMIZER_FEATURES_ENABLE('12.1.0.2')]]></hint>
    <hint><![CDATA[DB_VERSION('12.1.0.2')]]></hint>
    <hint><![CDATA[ALL_ROWS]]></hint>
    <hint><![CDATA[OUTLINE_LEAF(@"SEL$1")]]></hint>
    <hint><![CDATA[INDEX_RS_ASC(@"SEL$1" "O"@"SEL$1" ("ORDERS"."WAREHOUSE_ID" "ORDERS"."ORDER_STATUS"))]]></hint>
    <hint><![CDATA[BATCH_TABLE_ACCESS_BY_ROWID(@"SEL$1" "O"@"SEL$1")]]></hint>
    <hint><![CDATA[INDEX(@"SEL$1" "OI"@"SEL$1" ("ORDER_ITEMS"."ORDER_ID"))]]></hint>
    <hint><![CDATA[LEADING(@"SEL$1" "O"@"SEL$1" "OI"@"SEL$1")]]></hint>
    <hint><![CDATA[USE_NL(@"SEL$1" "OI"@"SEL$1")]]></hint>
    <hint><![CDATA[USE_HASH_AGGREGATION(@"SEL$1")]]></hint>
  </outline_data>
  <display_map>
    <row op="1" dis="1" par="0" prt="0" dep="1" skp="0"/>
    <row op="2" dis="2" par="1" prt="0" dep="2" skp="0"/>
    <row op="3" dis="3" par="2" prt="0" dep="3" skp="0"/>
    <row op="4" dis="3" par="3" prt="0" dep="3" skp="1"/>
    <row op="5" dis="4" par="3" prt="0" dep="4" skp="0"/>
    <row op="6" dis="4" par="4" prt="0" dep="4" skp="1"/>
    <row op="7" dis="5" par="4" prt="5" dep="5" skp="0"/>
    <row op="8" dis="6" par="5" prt="0" dep="6" skp="0"/>
    <row op="9" dis="7" par="4" prt="0" dep="5" skp="0"/>
    <row op="10" dis="7" par="3" prt="0" dep="3" skp="1"/>
  </display_map>
</other_xml>
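The display_map section is exactly what DBMS_XPLAN evaluates. Once other_xml has been fetched through any client, the same attributes can be pulled out with an ordinary XML parser too. Here is a minimal Python sketch (not part of the original post; the display_map string is copied verbatim from the output above) that flags the rows DBMS_XPLAN hides:

```python
import xml.etree.ElementTree as ET

# The display_map fragment from the other_xml output above.
display_map = """
<display_map>
  <row op="1"  dis="1" par="0" prt="0" dep="1" skp="0"/>
  <row op="2"  dis="2" par="1" prt="0" dep="2" skp="0"/>
  <row op="3"  dis="3" par="2" prt="0" dep="3" skp="0"/>
  <row op="4"  dis="3" par="3" prt="0" dep="3" skp="1"/>
  <row op="5"  dis="4" par="3" prt="0" dep="4" skp="0"/>
  <row op="6"  dis="4" par="4" prt="0" dep="4" skp="1"/>
  <row op="7"  dis="5" par="4" prt="5" dep="5" skp="0"/>
  <row op="8"  dis="6" par="5" prt="0" dep="6" skp="0"/>
  <row op="9"  dis="7" par="4" prt="0" dep="5" skp="0"/>
  <row op="10" dis="7" par="3" prt="0" dep="3" skp="1"/>
</display_map>
"""

# Each <row> maps a v$sql_plan id (op) to its display id (dis),
# parent (par), depth (dep) and the skip flag (skp).
rows = [{k: int(v) for k, v in row.attrib.items()}
        for row in ET.fromstring(display_map)]

# Rows flagged skp="1" are the inactive plan lines DBMS_XPLAN hides.
hidden = [r["op"] for r in rows if r["skp"] == 1]
print(hidden)  # → [4, 6, 10]
```

Plan ids 4, 6 and 10 are precisely the HASH JOIN, STATISTICS COLLECTOR and INDEX FAST FULL SCAN lines marked with "-" in the +ADAPTIVE output.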

The SQL statement from the trace was not much use to me; the following seemed better suited to work out what was happening. I added what I think the abbreviations stand for:

WITH display_map AS
  (SELECT X.*
  FROM v$sql_plan,
    XMLTABLE ( '/other_xml/display_map/row' passing XMLTYPE(other_xml ) COLUMNS 
      op  NUMBER PATH '@op',    -- operation
      dis NUMBER PATH '@dis',   -- display
      par NUMBER PATH '@par',   -- parent
      prt NUMBER PATH '@prt',   -- ?
      dep NUMBER PATH '@dep',   -- depth
      skp NUMBER PATH '@skp' )  -- skip
  AS X
  WHERE sql_id     = '&sql_id'
  AND child_number = &sql_child
  AND other_xml   IS NOT NULL
  )
SELECT * from display_map;

Enter value for sql_id: 3wb68w2kj6gdd
Enter value for sql_child: 0

        OP        DIS        PAR        PRT        DEP        SKP
---------- ---------- ---------- ---------- ---------- ----------
         1          1          0          0          1          0
         2          2          1          0          2          0
         3          3          2          0          3          0
         4          3          3          0          3          1
         5          4          3          0          4          0
         6          4          4          0          4          1
         7          5          4          5          5          0
         8          6          5          0          6          0
         9          7          4          0          5          0
        10          7          3          0          3          1

10 rows selected.

Well, that's a starting point. Now all I have to do is join the display map to v$sql_plan cleverly. After a little bit of fiddling with the query, this seems to work:

WITH display_map AS
  (SELECT X.*
  FROM v$sql_plan,
    XMLTABLE ( '/other_xml/display_map/row' passing XMLTYPE(other_xml ) COLUMNS 
      op  NUMBER PATH '@op',    -- operation
      dis NUMBER PATH '@dis',   -- display
      par NUMBER PATH '@par',   -- parent
      prt NUMBER PATH '@prt',   -- ?
      dep NUMBER PATH '@dep',   -- depth
      skp NUMBER PATH '@skp' )  -- skip
  AS X
  WHERE sql_id     = '&sql_id'
  AND child_number = &sql_child
  AND other_xml   IS NOT NULL
  )
SELECT 
  -- new ID, depth, parent etc from display_map
  NVL(m.dis, 0) AS new_id,
  m.par         AS new_parent,
  m.dep         AS new_depth,
  -- plan formatting, as usual
  lpad(' ',m.dep*1,' ')
  || sp.operation AS operation,
  sp.OPTIONS,
  sp.object#,
  sp.object_name,
  sp.object_alias,
  sp.object_type
FROM v$sql_plan sp
LEFT OUTER JOIN display_map m
ON (sp.id = m.op)
WHERE sp.sql_Id        = '&sql_id'
AND sp.child_number    = &sql_child
AND NVL(m.skp,0)      <> 1
ORDER BY NVL(dis,0);

    NEW_ID NEW_PARENT  NEW_DEPTH OPERATION                      OPTIONS                                OBJECT# OBJECT_NAME               OBJECT_ALI OBJECT_TYP
---------- ---------- ---------- ------------------------------ ----------------------------------- ---------- ------------------------- ---------- ----------
         0                       SELECT STATEMENT
         1          0          1  SORT                          ORDER BY
         2          1          2   HASH                         GROUP BY
         3          2          3    FILTER
         4          3          4     NESTED LOOPS
         5          4          5      TABLE ACCESS              BY GLOBAL INDEX ROWID BATCHED            28865 ORDERS                    O@SEL$1    TABLE
         6          5          6       INDEX                    RANGE SCAN                               29329 ORD_WAREHOUSE_IX          O@SEL$1    INDEX
         7          4          5      INDEX                     RANGE SCAN                               29302 ITEM_ORDER_IX             OI@SEL$1   INDEX

8 rows selected.
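The renumbering the join performs can also be expressed procedurally: drop the rows flagged skp=1 and adopt dis, par and dep as the new id, parent and depth. A small Python sketch (not from the original post; the tuples are transcribed from the display_map query output and the raw v$sql_plan listing above) illustrates the logic:

```python
# display_map rows as (op, dis, par, dep, skp), from the query output above.
dmap = [
    (1, 1, 0, 1, 0), (2, 2, 1, 2, 0), (3, 3, 2, 3, 0), (4, 3, 3, 3, 1),
    (5, 4, 3, 4, 0), (6, 4, 4, 4, 1), (7, 5, 4, 5, 0), (8, 6, 5, 6, 0),
    (9, 7, 4, 5, 0), (10, 7, 3, 3, 1),
]
# Operation names per v$sql_plan id, from the raw v$sql_plan query above.
ops = {
    1: "SORT ORDER BY", 2: "HASH GROUP BY", 3: "FILTER", 4: "HASH JOIN",
    5: "NESTED LOOPS", 6: "STATISTICS COLLECTOR",
    7: "TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED",
    8: "INDEX RANGE SCAN", 9: "INDEX RANGE SCAN",
    10: "INDEX FAST FULL SCAN",
}

# The join logic: skip inactive rows, take dis/par/dep as the new numbering.
resolved = [(dis, par, dep, ops[op])
            for op, dis, par, dep, skp in dmap if skp != 1]

for dis, par, dep, name in resolved:
    print(f"{dis:3d} {par:3d} {' ' * dep}{name}")
```

Running this reproduces the seven resolved steps (new ids 1 to 7) of the final plan: the HASH JOIN, STATISTICS COLLECTOR and INDEX FAST FULL SCAN rows disappear, just as in the SQL above.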

This seems to match what Oracle produces:

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  3wb68w2kj6gdd, child number 0
-------------------------------------
SELECT   /* statement002 */   /*+ gather_plan_statistics monitor */
TRUNC(order_date, 'mm'),   COUNT(TRUNC(order_date, 'mm')) FROM orders
o,   order_items oi WHERE oi.ORDER_ID    = o.order_id AND
o.CUSTOMER_CLASS = 'Prime' AND o.WAREHOUSE_ID   = 501 AND o.ORDER_DATE
BETWEEN DATE '2012-01-01' AND DATE '2012-07-01' GROUP BY
TRUNC(order_date, 'mm') ORDER BY 1

Plan hash value: 812470616

-----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                      | Name             | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                               |                  |       |       | 29766 (100)|          |       |       |
|   1 |  SORT ORDER BY                                 |                  |   404 | 14948 | 29766   (1)| 00:00:02 |       |       |
|   2 |   HASH GROUP BY                                |                  |   404 | 14948 | 29766   (1)| 00:00:02 |       |       |
|*  3 |    FILTER                                      |                  |       |       |            |          |       |       |
|   4 |     NESTED LOOPS                               |                  |  2996 |   108K| 29764   (1)| 00:00:02 |       |       |
|*  5 |      TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| ORDERS           |   434 | 13454 | 28462   (1)| 00:00:02 | ROWID | ROWID |
|*  6 |       INDEX RANGE SCAN                         | ORD_WAREHOUSE_IX | 28624 |       |    90   (0)| 00:00:01 |       |       |
|*  7 |      INDEX RANGE SCAN                          | ITEM_ORDER_IX    |     7 |    42 |     3   (0)| 00:00:01 |       |       |
-----------------------------------------------------------------------------------------------------------------------------------

With this done it should be possible to add other diagnostic information too: just join the additional views and add the relevant columns. Hope this helps! I haven't performed extensive testing on the approach, but wanted to put it out here for the more clever people to tell me where I'm wrong.

Posted in 12c Release 1 | Leave a Comment »
