
Making some more sense of direct path reads during primary key lookups

After publishing my first article of 2019 I received feedback I felt was worth including. With a nod to @fritshoogland, @ChrisAntognini and @FranckPachot.

In the previous post I showed you output of Tanel Poder’s ashtop.sql as proof that direct path reads can occur even if all you do is look up data by primary key. The script queries v$active_session_history, and I never tire of mentioning that you need to license the system in scope for Enterprise Edition plus the Diagnostics Pack to do so.

What I haven’t shown you in more detail in the previous article is what causes these direct path reads; explaining them is what I’ll try to do next. I started by repeating my test case, this time adding the gather_plan_statistics hint to the statement, which results in a different SQL ID. Yesterday I noticed that Safari doesn’t show the scroll bar at the bottom of this code snippet that other browsers show; I hope the output isn’t garbled. I’m using the universal excuse here: “it works on my laptop”.

SQL> @xia a66axyw2g6yjb
Display Execution plan in advanced format for sqlid a66axyw2g6yjb

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  a66axyw2g6yjb, child number 0
-------------------------------------
select /*+ gather_plan_statistics CLOB standalone */ *  from
martin.orders_clob  where order_id = :1

Plan hash value: 3284193906

----------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                |      1 |        |       |     3 (100)|          |      0 |00:00:00.01 |       3 |
|   1 |  TABLE ACCESS BY INDEX ROWID| ORDERS_CLOB    |      1 |      1 |   137 |     3   (0)| 00:00:01 |      0 |00:00:00.01 |       3 |
|*  2 |   INDEX UNIQUE SCAN         | PK_ORDERS_CLOB |      1 |      1 |       |     2   (0)| 00:00:01 |      0 |00:00:00.01 |       3 |
----------------------------------------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$1 / ORDERS_CLOB@SEL$1
   2 - SEL$1 / ORDERS_CLOB@SEL$1

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('18.1.0')
      DB_VERSION('18.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      INDEX_RS_ASC(@"SEL$1" "ORDERS_CLOB"@"SEL$1" ("ORDERS_CLOB"."ORDER_ID"))
      END_OUTLINE_DATA
  */

Peeked Binds (identified by position):
--------------------------------------

   1 - :1 (NUMBER): 519990

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ORDER_ID"=:1)

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - "ORDER_ID"[NUMBER,22], "ORDERS_CLOB"."ORDER_CLOB"[LOB,4000], "ORDERS_CLOB"."ORDER_DATE"[TIMESTAMP WITH LOCAL TIME
       ZONE,11], "ORDERS_CLOB"."ORDER_MODE"[VARCHAR2,8], "ORDERS_CLOB"."CUSTOMER_ID"[NUMBER,22],
       "ORDERS_CLOB"."ORDER_STATUS"[NUMBER,22], "ORDERS_CLOB"."ORDER_TOTAL"[NUMBER,22], "ORDERS_CLOB"."SALES_REP_ID"[NUMBER,22],
       "ORDERS_CLOB"."PROMOTION_ID"[NUMBER,22], "ORDERS_CLOB"."WAREHOUSE_ID"[NUMBER,22], "ORDERS_CLOB"."DELIVERY_TYPE"[VARCHAR2,15],
       "ORDERS_CLOB"."COST_OF_DELIVERY"[NUMBER,22], "ORDERS_CLOB"."WAIT_TILL_ALL_AVAILABLE"[VARCHAR2,15],
       "ORDERS_CLOB"."DELIVERY_ADDRESS_ID"[NUMBER,22], "ORDERS_CLOB"."CUSTOMER_CLASS"[VARCHAR2,30],
       "ORDERS_CLOB"."CARD_ID"[NUMBER,22], "ORDERS_CLOB"."INVOICE_ADDRESS_ID"[NUMBER,22]
   2 - "ORDERS_CLOB".ROWID[ROWID,10], "ORDER_ID"[NUMBER,22]


57 rows selected.
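If you don’t have Tanel’s xia script at hand, a broadly similar plan dump can be produced with the standard DBMS_XPLAN package. This is a sketch of a roughly equivalent call, not the actual script:

```sql
-- Pull the plan for the cursor straight from the cursor cache.
-- ADVANCED adds the outline/projection sections shown above; ALLSTATS LAST
-- adds the A-Rows/A-Time/Buffers columns populated by gather_plan_statistics.
SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY_CURSOR(
           sql_id          => 'a66axyw2g6yjb',
           cursor_child_no => 0,
           format          => 'ADVANCED ALLSTATS LAST'));
```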

After executing the statement repeatedly for a few minutes, direct path reads are clearly visible again:

SELECT
    COUNT(*),
    event,
    session_state
FROM
    v$active_session_history ash
WHERE
    sql_id = 'a66axyw2g6yjb'
    AND sample_time > SYSDATE - 15 / 1440
GROUP BY
    event,
    session_state
ORDER BY
    1 DESC;

  COUNT(*) EVENT                                    SESSION
---------- ---------------------------------------- -------
      1592                                          ON CPU
        24 direct path read                         WAITING
         6 cursor: pin S                            WAITING
         4 SQL*Net more data to client              WAITING

So far so good. Getting back to the original question: what’s causing these? ASH provides the answer to that question as well.

SELECT
    COUNT(*),
    event,
    session_state,
    sql_plan_operation
    || ' '
    || sql_plan_options AS what,
    CASE
        WHEN wait_class IN(
            'Application',
            'Cluster',
            'Concurrency',
            'User I/O'
        )THEN object_name
        ELSE 'undef'
    END AS obj
FROM
    v$active_session_history ash,
    dba_objects o
WHERE
    ash.current_obj# = o.object_id
    AND sql_id = 'a66axyw2g6yjb'
GROUP BY
    event,
    session_state,
    sql_plan_operation
    || ' '
    || sql_plan_options,
    CASE
        WHEN wait_class IN(
            'Application',
            'Cluster',
            'Concurrency',
            'User I/O'
        )THEN object_name
        ELSE 'undef'
    END
ORDER BY 1 desc;

  COUNT(*) EVENT                                    SESSION WHAT                           OBJ
---------- ---------------------------------------- ------- ------------------------------ ------------------------------
      3006                                          ON CPU  SELECT STATEMENT               undef
       223                                          ON CPU  INDEX UNIQUE SCAN              undef
       126                                          ON CPU  TABLE ACCESS BY INDEX ROWID    undef
        50 direct path read                         WAITING SELECT STATEMENT               SYS_LOB0000081921C00002$$
         8 cursor: pin S                            WAITING SELECT STATEMENT               SYS_LOB0000081921C00002$$
         8 SQL*Net more data to client              WAITING SELECT STATEMENT               undef
         1                                          ON CPU                                 undef

7 rows selected.

You might be getting the idea by now: the segment links back to DBA_LOBS, and it turns out to be the LOB column of ORDERS_CLOB.

SELECT
    table_name,
    column_name,
    segment_name,
    cache,
    securefile
FROM
    dba_lobs
WHERE
    segment_name = 'SYS_LOB0000081921C00002$$';

TABLE_NAME      COLUMN_NAME     SEGMENT_NAME                   CACHE      SEC
--------------- --------------- ------------------------------ ---------- ---
ORDERS_CLOB     ORDER_CLOB      SYS_LOB0000081921C00002$$      NO         YES

QED.
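This behaviour is by design: reads from a SecureFile LOB created with the NOCACHE option (the default) bypass the buffer cache and are performed as direct path reads. Purely as an illustration, and not something I tested as part of this article, caching could be enabled for the LOB if the direct path reads ever became a problem:

```sql
-- Hypothetical change, sketch only: cache LOB blocks on read.
-- Plain CACHE would cache both reads and writes instead.
ALTER TABLE martin.orders_clob MODIFY LOB (order_clob) (CACHE READS);
```

As always, such a change should be tested carefully; caching large LOBs can put pressure on the buffer cache.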

Another option is to query session statistics. I looked at one of the sessions using snapper4.sql, but the output was too wide to paste as text. I’ll resort to the traditional way of doing this:

SELECT
    name,
    value
FROM
    v$sesstat st
    NATURAL JOIN v$statname sn
WHERE
    st.sid = 128
    AND st.value <> 0
    AND REGEXP_LIKE(name, '(physical|securefile).*read.*')
ORDER BY
    sn.name;

NAME                                           VALUE
----------------------------------------- ----------
physical read IO requests                     106836
physical read bytes                        875200512
physical read total IO requests               106836
physical read total bytes                  875200512
physical reads                                106836
physical reads direct                         106836
physical reads direct (lob)                   106836
securefile direct read bytes               875200512
securefile direct read ops                    106836

9 rows selected.

I admit this was a rather manual way of extracting this piece of information; you might be able to get the same data from an ASH report (provided the number of I/O samples against that segment is significant enough for it to show up). In this context I’d also like to recommend planx.sql, which you can find in Carlos Sierra’s and Mauro Pagano’s GitHub repositories. It’s small, lightweight, and incredibly useful.
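For completeness, an ASH report can be generated with the ashrpt.sql script shipped with the database (again subject to Diagnostics Pack licensing); something along these lines:

```sql
-- Interactive: prompts for the report type (text/html) and the time range.
SQL> @?/rdbms/admin/ashrpt.sql
```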


Little things worth knowing: parallel Data Pump export in table mode

I haven’t used Data Pump in a little while, but recently needed to do a bit of work involving this very useful utility to export a single table. I know it is possible to export data in parallel using expdp, but I can never recall the syntax off the top of my head when I need it. This post describes a potential approach to exporting a table in parallel; in the next post I will demonstrate an interesting case where using parallelism didn’t help me speed up the export. All of this was tested on 12.2 and 18.4.0; the examples I am sharing originate from my 18.4.0 single-instance database (without ASM) running on Linux.

The setup

My lab environment is a bit limited when it comes to storage, so I’ll have to make do with small-ish tables. The basic principles should still apply to larger segments though. Please note that my tables aren’t partitioned, to keep the individual segment size as large as possible.

My data is once more based on Swingbench’s order entry schema. The table I’ll use in my first example is the well-known ORDER_ITEMS table:

$ sql soe

SQLcl: Release 17.3.0 Production on Wed Dec 12 18:12:00 2018

Copyright (c) 1982, 2018, Oracle. All rights reserved.

Password? (*******?) *********
Last Successful login time: Wed Dec 12 2018 18:12:01 +01:00

Connected to:
Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
Version 18.4.0.0.0

SQL> info soe.order_items
TABLE: ORDER_ITEMS
  LAST ANALYZED : 2018-12-11 22:19:40.0
  ROWS          : 68442528
  SAMPLE SIZE   : 68442528
  INMEMORY      : DISABLED
  COMMENTS      :

Columns
NAME                 DATA TYPE            NULL   DEFAULT   COMMENTS
*LINE_ITEM_ID        NUMBER(3,0)          Yes
*ORDER_ID            NUMBER(12,0)         Yes
 PRODUCT_ID          NUMBER(6,0)          Yes
 UNIT_PRICE          NUMBER(8,2)          Yes
 QUANTITY            NUMBER(8,0)          Yes
 DISPATCH_DATE       DATE                 Yes
 RETURN_DATE         DATE                 Yes
 GIFT_WRAP           VARCHAR2(20 BYTE)    Yes
 CONDITION           VARCHAR2(20 BYTE)    Yes
 SUPPLIER_ID         NUMBER(6,0)          Yes
 ESTIMATED_DELIVERY  DATE                 Yes

Indexes
INDEX_NAME           UNIQUENESS   STATUS   FUNCIDX_STATUS   COLUMNS
SOE.ITEM_ORDER_IX    NONUNIQUE    VALID                     ORDER_ID
SOE.ORDER_ITEMS_PK   UNIQUE       VALID                     ORDER_ID, LINE_ITEM_ID
SOE.ITEM_PRODUCT_IX  NONUNIQUE    VALID                     PRODUCT_ID

Running an export without any parallelism

I’ll use this example as the baseline, exporting the table without parallelism. The example is quite basic, and you will probably have to adjust it to suit your needs. I am assuming basic familiarity with the Data Pump toolset; if not, head over to the documentation and read up on the concepts. The non-parallel expdp call is shown here:

$ expdp martin directory=exptest logfile=exp_order_items_noparallel.log \
> dumpfile=exp_order_items_noparallel.dmp tables=soe.order_items

[...]

Export: Release 18.0.0.0.0 - Production on Wed Dec 12 19:00:38 2018
Version 18.4.0.0.0

Copyright (c) 1982, 2018, Oracle and/or its affiliates. All rights reserved.

Connected to: Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
Starting "MARTIN"."SYS_EXPORT_TABLE_01": martin
directory=exptest logfile=exp_order_items_noparallel.log
dumpfile=exp_order_items_noparallel.dmp tables=soe.order_items
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Processing object type TABLE_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
Processing object type TABLE_EXPORT/TABLE/STATISTICS/MARKER
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/INDEX/INDEX
Processing object type TABLE_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
. . exported "SOE"."ORDER_ITEMS" 3.601 GB 68442528 rows
Master table "MARTIN"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
***************************************************************************
Dump file set for MARTIN.SYS_EXPORT_TABLE_01 is:
/u01/data_pump/ORCL/exp_order_items_noparallel.dmp
Job "MARTIN"."SYS_EXPORT_TABLE_01" successfully completed at Wed Dec 12 19:01:22 2018 elapsed 0 00:00:43

So it takes about 43 seconds for expdp to create the file.

Adding parallelism

My next attempt introduced parallelism. For expdp to actually use parallelism, you need multiple dump files to write to. This is accomplished with the %U placeholder in the file name, as shown here:

$ expdp martin directory=exptest logfile=exp_order_items.log \
> dumpfile=exp_order_items_%U.dmp parallel=4 tables=soe.order_items

Export: Release 18.0.0.0.0 - Production on Wed Dec 12 19:56:42 2018
Version 18.4.0.0.0

Copyright (c) 1982, 2018, Oracle and/or its affiliates. All rights reserved.

Connected to: Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
Starting "MARTIN"."SYS_EXPORT_TABLE_01": martin directory=exptest
logfile=exp_order_items.log dumpfile=exp_order_items_%U.dmp parallel=4
tables=soe.order_items
Processing object type TABLE_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Processing object type TABLE_EXPORT/TABLE/INDEX/INDEX
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
Processing object type TABLE_EXPORT/TABLE/STATISTICS/MARKER
. . exported "SOE"."ORDER_ITEMS" 3.601 GB 68442528 rows
Master table "MARTIN"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded

Dump file set for MARTIN.SYS_EXPORT_TABLE_01 is:
/u01/data_pump/ORCL/exp_order_items_01.dmp
/u01/data_pump/ORCL/exp_order_items_02.dmp
/u01/data_pump/ORCL/exp_order_items_03.dmp
/u01/data_pump/ORCL/exp_order_items_04.dmp
/u01/data_pump/ORCL/exp_order_items_05.dmp
Job "MARTIN"."SYS_EXPORT_TABLE_01" successfully completed at Wed Dec 12 19:57:02 2018 elapsed 0 00:00:20

So this isn’t too bad: using parallel 4 I managed to cut the time it takes to export the table in half. If you read the Data Pump white paper you can see that the worker processes don’t just unload data; they also create metadata for the objects being exported. I suppose this is the reason why there are 5 dump files. As you can see, they aren’t all the same size:

$ ls -lh /u01/data_pump/ORCL/exp_order_items_0[12345].dmp
-rw-r----- 1 oracle oinstall 1.2G Dec 12 19:57 /u01/data_pump/ORCL/exp_order_items_01.dmp
-rw-r----- 1 oracle oinstall 16K Dec 12 19:56 /u01/data_pump/ORCL/exp_order_items_02.dmp
-rw-r----- 1 oracle oinstall 192K Dec 12 19:57 /u01/data_pump/ORCL/exp_order_items_03.dmp
-rw-r----- 1 oracle oinstall 1.3G Dec 12 19:57 /u01/data_pump/ORCL/exp_order_items_04.dmp
-rw-r----- 1 oracle oinstall 1.3G Dec 12 19:57 /u01/data_pump/ORCL/exp_order_items_05.dmp

What I really like about the Data Pump API is the ability to attach to a job and see what it’s doing. While the export was running, I had a quick look at it:

$ expdp martin attach=SYS_EXPORT_TABLE_01

Export: Release 18.0.0.0.0 - Production on Wed Dec 12 19:56:54 2018
Version 18.4.0.0.0

Copyright (c) 1982, 2018, Oracle and/or its affiliates. All rights reserved.
[...]
Connected to: Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production

Job: SYS_EXPORT_TABLE_01
  Owner: MARTIN
  Operation: EXPORT
  Creator Privs: TRUE
  GUID: 7CD567F0F8D057DDE0530164A8C0392F
  Start Time: Wednesday, 12 December, 2018 19:56:43
  Mode: TABLE
  Instance: ORCL
  Max Parallelism: 4
  Timezone: +02:00
  Timezone version: 31
  Endianness: LITTLE
  NLS character set: AL32UTF8
  NLS NCHAR character set: AL16UTF16
  EXPORT Job Parameters:
  Parameter Name      Parameter Value:
     CLIENT_COMMAND        martin directory=exptest logfile=exp_order_items.log
                           dumpfile=exp_order_items_%U.dmp parallel=4
                           tables=soe.order_items
     TRACE                 0
  State: EXECUTING
  Bytes Processed: 0
  Current Parallelism: 4
  Job Error Count: 0
  Job heartbeat: 5
  Dump File: /u01/data_pump/ORCL/exp_order_items_%u.dmp
  Dump File: /u01/data_pump/ORCL/exp_order_items_01.dmp
    bytes written: 8,192
  Dump File: /u01/data_pump/ORCL/exp_order_items_02.dmp
    bytes written: 16,384
  Dump File: /u01/data_pump/ORCL/exp_order_items_03.dmp
    bytes written: 16,384
  Dump File: /u01/data_pump/ORCL/exp_order_items_04.dmp
    bytes written: 4,096
  Dump File: /u01/data_pump/ORCL/exp_order_items_05.dmp
    bytes written: 4,096

Worker 1 Status:
  Instance ID: 1
  Instance name: ORCL
  Host name: server1
  Object start time: Wednesday, 12 December, 2018 19:56:46
  Object status at: Wednesday, 12 December, 2018 19:56:47
  Process Name: DW00
  State: WORK WAITING

Worker 2 Status:
  Instance ID: 1
  Instance name: ORCL
  Host name: server1
  Object start time: Wednesday, 00 Sat, 0000 0:00:00
  Object status at: Wednesday, 12 December, 2018 15
  Process Name: DW01
  State: WORK WAITING

Worker 3 Status:
  Instance ID: 1
  Instance name: ORCL
  Host name: server1
  Object start time: Wednesday, 12 December, 2018 19:56:47
  Object status at: Wednesday, 12 December, 2018 19:56:48
  Process Name: DW02
  State: EXECUTING
  Object Schema: SOE
  Object Name: ORDER_ITEMS
  Object Type: TABLE_EXPORT/TABLE/TABLE_DATA
  Completed Objects: 1
  Total Objects: 1
  Completed Rows: 33,417,535
  Worker Parallelism: 3

Worker 4 Status:
  Instance ID: 1
  Instance name: ORCL
  Host name: server1
  Access method: direct_path
  Object start time: Wednesday, 12 December, 2018 19:56:45
  Object status at: Wednesday, 12 December, 2018 19:56:49
  Process Name: DW03
  State: WORK WAITING

The output is quite interesting. It confirms that the job runs with parallel 4, and it also shows that only worker 3 is actually unloading table data, doing so with a worker parallelism of 3. If you recall the ls command earlier, there were 3 files of roughly 1.3 GB each; these must have been written by the 3rd worker process.
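Attaching isn’t the only way to check on a job: the dictionary views DBA_DATAPUMP_JOBS and DBA_DATAPUMP_SESSIONS expose similar information. A minimal sketch:

```sql
-- Which Data Pump jobs are currently known to the database,
-- and at which degree of parallelism are they running?
SELECT owner_name,
       job_name,
       operation,
       job_mode,
       state,
       degree,
       attached_sessions
FROM   dba_datapump_jobs
ORDER BY owner_name, job_name;
```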

Summary

I have been able to export a table in parallel using Data Pump, and by doing so cut the export time in half. This is a great feature which I am certainly going to use regularly.