Martins Blog

Trying to explain complex things in simple terms

Mysterious new Oracle compression type

Posted by Martin Bach on May 13, 2014

As part of our research for our joint presentation at OGH.nl and E4 my colleague Frits Hoogland made an interesting discovery. He verified the compression format for updated rows previously compressed using Hybrid Columnar Compression (HCC). In that particular example we researched HCC data on Exadata storage. As you may know, reading HCC data in its compressed form is limited to Exadata, Pillar Axiom, and the ZFS Storage Appliance (ZFSSA).

Background

So far I took it for granted that Oracle HCC compressed data can’t be updated in place. As has been shown in many posts and books (for example by Kerry Osborne) an update on a row that has been compressed using HCC will trigger the move of that particular row to a new block. The new block is flagged for OLTP compression. OLTP compression is a variation of the BASIC compression or de-duplication that Jonathan Lewis expertly covered in a four part series hosted by All Things Oracle. The link to the first part of it can be found here. Even if you think you don’t need to bother with BASIC compression you are mistaken: it is still very relevant, especially in the Exadata/HCC context. Before reading on about block changes etc. it might be a good idea to refer to Kerry’s blog post mentioned before. If you want to really dive into the matter I suggest two more posts as reference:

This almost starts to feel like a scientific article where you can’t see the text for all the footnotes but credit is due!

An interesting discovery

So as it happened Frits discovered a new compression type when updating rows in a HCC compressed table in 11.2.0.3.22. My examples that follow are on 12.1.0.1.0. Consider this table:

  
SQL> CREATE TABLE t1_wp
  2  enable row movement
  3  column store
  4  compress for query low
  5  AS
  6  WITH v1 AS
  7  (SELECT rownum n FROM dual CONNECT BY level <= 10000)
  8  SELECT  rownum id,
  9    rpad(rownum,500) t_pad,
 10    TRUNC(sysdate) - 180 + TRUNC((rownum-1)/3)/86400 date_created,
 11    TRUNC(sysdate) - 180 + TRUNC((rownum-1)/3)/86400 + dbms_random.value(1800, 86400)/86400 date_completed,
 12    CASE
 13	 WHEN mod(rownum,100000) = 0
 14	 THEN CAST('RARE' AS VARCHAR2(12))
 15	 WHEN mod(rownum,10000) = 0
 16	 THEN CAST('FAIRLY RARE' AS VARCHAR2(12))
 17	 WHEN mod(rownum,1000) = 0
 18	 THEN CAST('NOT RARE' AS VARCHAR2(12))
 19	 WHEN mod(rownum,100) = 0
 20	 THEN CAST('COMMON' AS	 VARCHAR2(12))
 21	 ELSE CAST('THE REST' AS VARCHAR2(12))
 22    END state,
 23    TO_CHAR(mod(rownum, 4)) spcol
 24  FROM v1,
 25    v1
 26  WHERE rownum <= 1e6;

It basically created 1 million rows with HCC compress for query low. I chose Query Low because it’s the easiest to work with as its compression unit size is roughly 32k. The compressed table is of small size.

SQL> select bytes/power(1024,2) m, blocks from user_segments
  2  where segment_name = 'T1_WP';

         M     BLOCKS
---------- ----------
        16       2048

Now what I am after is an update on a specific CU. I want to know the minimum and maximum ID per block:

select min(id),max(id),blockn from (
 select id,DBMS_ROWID.ROWID_RELATIVE_FNO(rowid) , 
   DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid) as blockn 
   from martin.t1_wp
) group by blockn order by blockn;

   MIN(ID)    MAX(ID)	  BLOCKN
---------- ---------- ----------
         1       1964     262147
      1965       4194     262150
      4195       6424     262154
      6425       7148     262158
      7149       9375     262161
      9376      11598     262165
     11599      13795     262169
     13796      14970     262173
     14971      17127     262177
     17128      19270     262181
...

And so forth for a total 501 rows.

What happens now if I update IDs 1 to 10? But before I answer the question I’d like to show you the block dump of block 262147:

SQL> select rowid, dbms_compression.get_compression_type(user, 'T1_WP', rowid) compressionType 
  2  from t1_wp where id between 1 and 10;

ROWID              COMPRESSIONTYPE
------------------ ---------------
AAAO7sAALAABAADAAA               8
AAAO7sAALAABAADAAB               8
AAAO7sAALAABAADAAC               8
AAAO7sAALAABAADAAD               8
AAAO7sAALAABAADAAE               8
AAAO7sAALAABAADAAF               8
AAAO7sAALAABAADAAG               8
AAAO7sAALAABAADAAH               8
AAAO7sAALAABAADAAI               8
AAAO7sAALAABAADAAJ               8

10 rows selected.

As you can see the compression type is 8 or query low. Let’s look at the dump:

Block header dump:  0x02c40003
 Object id on Block? Y
 seg/obj: 0xeeec  csc: 0x99a.535752ba  itc: 3  flg: E  typ: 1 - DATA
     brn: 0  bdba: 0x2c40000 ver: 0x01 opc: 0
     inc: 0  exflg: 0

 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0xffff.000.00000000  0x00000000.0000.00  C---    0  scn 0x099a.535752ba
0x02   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
0x03   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
bdba: 0x02c40003
data_block_dump,data header at 0x7f50921c447c
===============
tsiz: 0x1f80
hsiz: 0x1c
pbl: 0x7f50921c447c
     76543210
flag=-0------
ntab=1
nrow=1
frre=-1
fsbo=0x1c
fseo=0x30
avsp=0x14
tosp=0x14
        r0_9ir2=0x0
        mec_kdbh9ir2=0x0
                      76543210
        shcf_kdbh9ir2=----------
                  76543210
        flag_9ir2=--R-----      Archive compression: Y
                fcls_9ir2[0]={ }
0x16:pti[0]     nrow=1  offs=0
0x1a:pri[0]     offs=0x30
block_row_dump:
tab 0, row 0, @0x30
tl: 8016 fb: --H-F--N lb: 0x0  cc: 1
nrid:  0x02c40004.0
col  0: [8004]
Compression level: 01 (Query Low)
 Length of CU row: 8004
kdzhrh: ------PC- CBLK: 3 Start Slot: 00
 NUMP: 03
 PNUM: 00 POFF: 7964 PRID: 0x02c40004.0
 PNUM: 01 POFF: 15980 PRID: 0x02c40005.0
 PNUM: 02 POFF: 23996 PRID: 0x02c40006.0
*---------
CU header:
CU version: 0   CU magic number: 0x4b445a30
CU checksum: 0x147384ec
CU total length: 28107
CU flags: NC-U-CRD-OP
ncols: 6
nrows: 1964
algo: 0
CU decomp length: 27815   len/value length: 1048622
row pieces per row: 1
num deleted rows: 0
START_CU:
 00 00 1f 44 0f 03 00 00 00 03 00 00 1f 1c 02 c4 00 04 00 00 00 00 3e 6c 02
 c4 00 05 00 00 00 00 5d bc 02 c4 00 06 00 00 00 4b 44 5a 30 ec 84 73 14 00
 00 6d cb eb 06 00 06 07 ac 00 10 00 2e 01 00 00 00 00 00 00 00 00 00 00 00
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
...

And now for the update, which is a simple “update t1_wp set spcol = ‘UPDATED’ where id between 1 and 10′” affecting exactly 10 rows.

Checking the ROWIDs again:

SQL> select rowid, dbms_compression.get_compression_type(user, 'T1_WP', rowid) compressionType
  2  from t1_wp where id between 1 and 10;

ROWID		   COMPRESSIONTYPE
------------------ ---------------
AAAO7sAALAABAfeAAA               1
AAAO7sAALAABAfeAAB               1
AAAO7sAALAABAfeAAC               1
AAAO7sAALAABAfeAAD               1
AAAO7sAALAABAfeAAE               1
AAAO7sAALAABAfeAAF               1
AAAO7sAALAABAfeAAG               1
AAAO7sAALAABAfeAAH               1
AAAO7sAALAABAfeAAI               1
AAAO7sAALAABAfeAAJ               1

10 rows selected.

So that’s interesting-the rowids have changed, and so has the compression algorithm. What if I commit?


SQL> commit;

Commit complete

SQL> select rowid, dbms_compression.get_compression_type(user, 'T1_WP', rowid) compressionType
  2  from t1_wp where id between 1 and 10;

ROWID              COMPRESSIONTYPE
------------------ ---------------
AAAO7sAALAABAfeAAA              64
AAAO7sAALAABAfeAAB              64
AAAO7sAALAABAfeAAC              64
AAAO7sAALAABAfeAAD              64
AAAO7sAALAABAfeAAE              64
AAAO7sAALAABAfeAAF              64
AAAO7sAALAABAfeAAG              64
AAAO7sAALAABAfeAAH              64
AAAO7sAALAABAfeAAI              64
AAAO7sAALAABAfeAAJ              64

10 rows selected.

There it is, the new mysterious compression type 64. To my knowledge it is not documented in 11.2, but it is in 12.1, as “DBMS_COMPRESSION.COMP_BLOCK”. What does it look like? Picking ROWID for id = 10 (AAAO7sAALAABAfeAAJ)

SQL> select id from t1_wp where rowid = 'AAAO7sAALAABAfeAAJ';

        ID
----------
        10

SQL> alter system dump datafile 7 block 46401502;

Block header dump:  0x02c407de
 Object id on Block? Y
 seg/obj: 0xeeec  csc: 0x99a.53575da9  itc: 2  flg: E  typ: 1 - DATA
     brn: 0  bdba: 0x2c40781 ver: 0x01 opc: 0
     inc: 0  exflg: 0

 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x000a.008.00038f01  0x00000f5d.720e.0a  --U-   15  fsc 0x0000.53576117
0x02   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
bdba: 0x02c407de
data_block_dump,data header at 0x7fbc909ea264
===============
tsiz: 0x1f98
hsiz: 0x4c
pbl: 0x7fbc909ea264
     76543210
flag=-0----X-
ntab=2
nrow=20
frre=-1
fsbo=0x4c
fseo=0xbe
avsp=0x72
tosp=0x72
        r0_9ir2=0x1
        mec_kdbh9ir2=0x1
                      76543210
        shcf_kdbh9ir2=----------
                  76543210
        flag_9ir2=--R-LNOC      Archive compression: N
                fcls_9ir2[0]={ }
                perm_9ir2[6]={ 2 3 4 5 0 1 }
0x1c:pti[0]     nrow=5  offs=0
0x20:pti[1]     nrow=15 offs=5
0x24:pri[0]     offs=0x1f5c
0x26:pri[1]     offs=0x1f70
...
block_row_dump:
tab 0, row 0, @0x1f5c
tl: 20 fb: --H-FL-- lb: 0x0  cc: 2
col  0: [ 8]  54 48 45 20 52 45 53 54
col  1: [ 7]  55 50 44 41 54 45 44
bindmp: 00 0d 02 d0 54 48 45 20 52 45 53 54 cf 55 50 44 41 54 45 44
tab 0, row 1, @0x1f70
tl: 10 fb: --H-FL-- lb: 0x0  cc: 1
col  0: [ 7]  78 71 0b 0d 01 01 04
bindmp: 00 03 cf 78 71 0b 0d 01 01 04
...
tab 0, row 4, @0x1f8e
tl: 10 fb: --H-FL-- lb: 0x0  cc: 1
col  0: [ 7]  78 71 0b 0d 01 01 01
bindmp: 00 03 cf 78 71 0b 0d 01 01 01
tab 1, row 0, @0x1d55
tl: 519 fb: --H-FL-- lb: 0x1  cc: 6
col  0: [ 8]  54 48 45 20 52 45 53 54
col  1: [ 7]  55 50 44 41 54 45 44
col  2: [ 2]  c1 02
col  3: [500]
 31 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
...
col  4: [ 7]  78 71 0b 0d 01 01 01
col  5: [ 7]  78 71 0b 0d 0b 09 01
bindmp: 2c 01 05 00 ca c1 02 fa 01 f4 31 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 04 cf 78
 71 0b 0d 0b 09 01
tab 1, row 1, @0x1b4e
....

I have not come to a conclusion about this yet. There are similarities to the BASIC compression block dump in that there are 2 tables (ntab=2), and the first one looks a bit like a symbol table to me. It also uses the familiar perm_9ir2 field to let us know it changed the column order. The flag is new though, or at least to me. The only reference I could find to flag=-0—-X- was in the bug database, interestingly related to a OLTP-compressed block. The tl of the first row in t1 does not compute either, something I have only seen with BASIC/OLTP compressed data blocks.

The problem is that I can’t use the interpretation of BASIC compression in the bindump. Take the bindmp of tab 1, row 0:

bindmp: 2c 01 05 00 ca c1 02 fa 01 f4 31 20 20 20 20 20 20 20 20 20 20 20 20…

Going by the rules I know about the first byte is the flag byte (2c = –H-FL–), followed by the lock byte (01), followed by the number of columns stored at this location (05) and then the way of explaining the bindmp for OLTP/BASIC compressed data falls apart. What is interesting though that there seems to be a pattern:

> for line in $(grep bindmp orcl_ora_30181.trc); do  echo ${line:0:64}; done
bindmp: 00 0d 02 d0 54 48 45 20 52 45 53 54 cf 55 50 44 41 54 45
bindmp: 00 03 cf 78 71 0b 0d 01 01 04
bindmp: 00 03 cf 78 71 0b 0d 01 01 03
bindmp: 00 03 cf 78 71 0b 0d 01 01 02
bindmp: 00 03 cf 78 71 0b 0d 01 01 01
bindmp: 2c 01 05 00 ca c1 02 fa 01 f4 31 20 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 03 fa 01 f4 32 20 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 04 fa 01 f4 33 20 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 05 fa 01 f4 34 20 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 06 fa 01 f4 35 20 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 07 fa 01 f4 36 20 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 08 fa 01 f4 37 20 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 09 fa 01 f4 38 20 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 0a fa 01 f4 39 20 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 0b fa 01 f4 31 30 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 0c fa 01 f4 31 31 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 0d fa 01 f4 31 32 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 0e fa 01 f4 31 33 20 20 20 20 20 20 20
bindmp: 2c 01 06 d0 54 48 45 20 52 45 53 54 cf 55 50 44 41 54 45
bindmp: 2c 01 06 d0 54 48 45 20 52 45 53 54 cf 55 50 44 41 54 45

In another part of this article I will try to work out what exactly is represented by the bindmp.

About these ads

5 Responses to “Mysterious new Oracle compression type”

  1. Hi Martin,

    Compression type 64 is actually mentioned in the Oracle documentation nowadays:

    http://docs.oracle.com/cd/E11882_01/appdev.112/e40758/d_compress.htm#ARPLS72966

    Oracle calls simply “Compression”, not that it is of any help :-)

    Klaas-jan

  2. Martin,

    I thought I’d published some details on how to read the OLTP-compressed row, but like the CU notes they were incomplete.

    2c 01 05 00 ca c1 02 fa 01 f4 31 20 20 20 20 20 20 … 20 20 20 20 20 20 20 20 04 cf 78 71 0b 0d 0b 09 01

    From the 4th byte onwards

    0 — use token zero (any value up to 250 will be a token id)
    ca — next column is a real column of length 2 (length are 200 + actual length)
    c1 02 — value = 1
    fa 0a f4 — next column is a real column of length 500 (lengths over 249 (?) use 3 bytes, starting with fa)
    31 20 … 20 — column value
    04 — use token 4
    cf — next column is a real column of length 7 (0xcf = 207)
    78 71 0b 0d 0b 09 01 — value

    • Martin Bach said

      Thanks Jonathan!

      We thought it was some kind of OLTP compressed block. I wonder where the difference is between this compression type and the “BASIC/OLTP” compression type. They are reported differently in dbms_compression.get_compression_type. Maybe this new type has been introduced to better indicate that the rows in it were initially stored in a CU? There are some flags worth looking at (flag=-0—-X-, r0_9ir2=0×1, flag_9ir2) that might give us a clue. I will update the post should I find something.

      And again thanks for clarifying on how to read the bindmp.

      Martin

      • flashdba said

        Hey Martin

        I can’t offer a technical explanation but I do have a theory which is more based on the commercials.

        Oracle Storage users are allowed to use Hybrid Columnar Compression. This is not a separate license, but rather a (generic) feature for which permission is granted if you use Exadata, SuperCluster, ZFS Storage Appliance, etc. The important fact is that you do not need to purchase the Advanced Compression Option license at $11,500 list price per licensable core.

        I think it’s a potential issue for Oracle that any update on an HCC row results in some of the table being left in a state of OLTP compression, which is actually a technology that requires the ACO license. Until now, Oracle had no way of knowing if a customer was using OLTP without the license or simply using HCC for which no license is required.

        It therefore seems a sensible hypothesis that the new compression type of 64 is actually a differentiator so that Oracle can correctly audit customers use of compression…?

        -flash

    • Martin,
      A correction to my note – I suggested 249 as the greatest length before Oracle switches to 3 bytes for length; I forgot to allow for the 200 which is the starting point anyway, so the switch comes with 50 bytes (49 bytes => f9, 50 bytes => fa 00 32).
      Looking forward to any further details you uncover.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
Follow

Get every new post delivered to your Inbox.

Join 2,417 other followers

%d bloggers like this: