As part of our research for our joint presentation at OGH.nl and E4, my colleague Frits Hoogland made an interesting discovery: he verified the compression format of rows that are updated after having been compressed with Hybrid Columnar Compression (HCC). In that particular example we looked at HCC data on Exadata storage. As you may know, reading HCC data in its compressed form is limited to Exadata, Pillar Axiom, and the ZFS Storage Appliance (ZFSSA).
Background
So far I had taken it for granted that Oracle HCC-compressed data can't be updated in place. As has been shown in many posts and books (for example by Kerry Osborne), an update of a row that has been compressed using HCC triggers the move of that particular row to a new block. The new block is flagged for OLTP compression. OLTP compression is a variation of the BASIC compression, or de-duplication, that Jonathan Lewis expertly covered in a four-part series hosted by All Things Oracle. The link to the first part of it can be found here. Even if you think you don't need to bother with BASIC compression you are mistaken: it is still very relevant, especially in the Exadata/HCC context. Before reading on about block changes etc. it might be a good idea to refer to Kerry's blog post mentioned before. If you want to really dive into the matter I suggest two more posts as reference:
- An introduction to Compression Units by Jonathan Lewis
- An update to the above post (thanks for that by the way)
This almost starts to feel like a scientific article where you can't see the text for all the footnotes, but credit where credit is due!
An interesting discovery
So, as it happened, Frits discovered a new compression type when updating rows in an HCC-compressed table on 11.2.0.3.22. My examples that follow are on 12.1.0.1.0. Consider this table:
SQL> CREATE TABLE t1_wp
     enable row movement
     column store
     compress for query low
     AS
     WITH v1 AS
       (SELECT rownum n FROM dual CONNECT BY level <= 10000)
     SELECT rownum id,
            rpad(rownum,500) t_pad,
            TRUNC(sysdate) - 180 + TRUNC((rownum-1)/3)/86400 date_created,
            TRUNC(sysdate) - 180 + TRUNC((rownum-1)/3)/86400
              + dbms_random.value(1800, 86400)/86400 date_completed,
            CASE
              WHEN mod(rownum,100000) = 0 THEN CAST('RARE' AS VARCHAR2(12))
              WHEN mod(rownum,10000)  = 0 THEN CAST('FAIRLY RARE' AS VARCHAR2(12))
              WHEN mod(rownum,1000)   = 0 THEN CAST('NOT RARE' AS VARCHAR2(12))
              WHEN mod(rownum,100)    = 0 THEN CAST('COMMON' AS VARCHAR2(12))
              ELSE CAST('THE REST' AS VARCHAR2(12))
            END state,
            TO_CHAR(mod(rownum, 4)) spcol
     FROM v1,
          v1
     WHERE rownum <= 1e6;
It creates 1 million rows compressed with HCC Query Low. I chose Query Low because it is the easiest to work with: its compression unit size is roughly 32k. The compressed table is quite small:
SQL> select bytes/power(1024,2) m, blocks from user_segments
     where segment_name = 'T1_WP';

         M     BLOCKS
---------- ----------
        16       2048
Now what I am after is an update on a specific CU. I want to know the minimum and maximum ID per block:
select min(id), max(id), blockn
from (
  select id,
         DBMS_ROWID.ROWID_RELATIVE_FNO(rowid),
         DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid) as blockn
    from martin.t1_wp
)
group by blockn
order by blockn;

   MIN(ID)    MAX(ID)     BLOCKN
---------- ---------- ----------
         1       1964     262147
      1965       4194     262150
      4195       6424     262154
      6425       7148     262158
      7149       9375     262161
      9376      11598     262165
     11599      13795     262169
     13796      14970     262173
     14971      17127     262177
     17128      19270     262181
...
And so forth, for a total of 501 rows.
What happens now if I update IDs 1 to 10? Before I answer that question I'd like to show you the block dump of block 262147:
SQL> select rowid, dbms_compression.get_compression_type(user, 'T1_WP', rowid) compressionType
     from t1_wp where id between 1 and 10;

ROWID              COMPRESSIONTYPE
------------------ ---------------
AAAO7sAALAABAADAAA               8
AAAO7sAALAABAADAAB               8
AAAO7sAALAABAADAAC               8
AAAO7sAALAABAADAAD               8
AAAO7sAALAABAADAAE               8
AAAO7sAALAABAADAAF               8
AAAO7sAALAABAADAAG               8
AAAO7sAALAABAADAAH               8
AAAO7sAALAABAADAAI               8
AAAO7sAALAABAADAAJ               8

10 rows selected.
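As a quick aside, these extended ROWIDs can be decoded by hand: Oracle encodes the data object number (6 characters), relative file number (3), block number (6) and row number (3) as base-64 numbers over the alphabet A-Z, a-z, 0-9, + and /. A minimal sketch in Python, no Oracle client needed:

```python
# Decode an Oracle extended ROWID: 6 characters of data object number,
# 3 of relative file number, 6 of block number and 3 of row number,
# each a base-64 number over the alphabet A-Z, a-z, 0-9, +, /.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)

def _b64(chars):
    """Interpret a run of ROWID characters as a base-64 number."""
    value = 0
    for c in chars:
        value = value * 64 + ALPHABET.index(c)
    return value

def decode_rowid(rowid):
    """Split an 18-character extended ROWID into its four components."""
    assert len(rowid) == 18
    return {
        "data_object_id": _b64(rowid[0:6]),
        "relative_fno":   _b64(rowid[6:9]),
        "block":          _b64(rowid[9:15]),
        "row":            _b64(rowid[15:18]),
    }

# The first ROWID from the query above:
print(decode_rowid("AAAO7sAALAABAADAAA"))
# {'data_object_id': 61164, 'relative_fno': 11, 'block': 262147, 'row': 0}
```

Reassuringly, 61164 is 0xEEEC, the seg/obj value reported in the block dumps, and 262147 is the first BLOCKN returned by the DBMS_ROWID query earlier.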
As you can see the compression type is 8, or Query Low. Let's look at the dump:
Block header dump:  0x02c40003
 Object id on Block? Y
 seg/obj: 0xeeec  csc: 0x99a.535752ba  itc: 3  flg: E  typ: 1 - DATA
     brn: 0  bdba: 0x2c40000 ver: 0x01 opc: 0
     inc: 0  exflg: 0

 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0xffff.000.00000000  0x00000000.0000.00  C---    0  scn 0x099a.535752ba
0x02   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
0x03   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
bdba: 0x02c40003
data_block_dump,data header at 0x7f50921c447c
===============
tsiz: 0x1f80
hsiz: 0x1c
pbl: 0x7f50921c447c
     76543210
flag=-0------
ntab=1
nrow=1
frre=-1
fsbo=0x1c
fseo=0x30
avsp=0x14
tosp=0x14
        r0_9ir2=0x0  mec_kdbh9ir2=0x0
                      76543210
        shcf_kdbh9ir2=----------
                  76543210
        flag_9ir2=--R-----
        Archive compression: Y
                fcls_9ir2[0]={ }
0x16:pti[0] nrow=1 offs=0
0x1a:pri[0] offs=0x30
block_row_dump:
tab 0, row 0, @0x30
tl: 8016 fb: --H-F--N lb: 0x0  cc: 1
nrid:  0x02c40004.0
col  0: [8004]
Compression level: 01 (Query Low)
 Length of CU row: 8004
kdzhrh: ------PC- CBLK: 3 Start Slot: 00
 NUMP: 03
 PNUM: 00 POFF: 7964 PRID: 0x02c40004.0
 PNUM: 01 POFF: 15980 PRID: 0x02c40005.0
 PNUM: 02 POFF: 23996 PRID: 0x02c40006.0
*---------
CU header:
CU version: 0   CU magic number: 0x4b445a30
CU checksum: 0x147384ec
CU total length: 28107
CU flags: NC-U-CRD-OP
ncols: 6
nrows: 1964
algo: 0
CU decomp length: 27815 len/value length: 1048622
row pieces per row: 1
num deleted rows: 0
START_CU:
 00 00 1f 44 0f 03 00 00 00 03 00 00 1f 1c 02 c4 00 04 00 00
 00 00 3e 6c 02 c4 00 05 00 00 00 00 5d bc 02 c4 00 06 00 00
 00 4b 44 5a 30 ec 84 73 14 00 00 6d cb eb 06 00 06 07 ac 00
 10 00 2e 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ...
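The head piece's metadata ties the whole CU together: NUMP says there are three chained row pieces, and each POFF is the byte offset within the CU at which that piece starts. The numbers from this particular dump can be sanity-checked with a few lines of Python (a sketch using only the values printed above):

```python
# Sanity-check the row-piece bookkeeping of the CU in the dump above.
cu_total_length = 28107           # "CU total length" from the CU header
head_cu_bytes   = 7964            # POFF of PNUM 00: CU bytes held by the head piece
poffs = [7964, 15980, 23996]      # POFFs of the three chained pieces (NUMP: 03)

# Each chained piece holds the CU bytes between consecutive offsets;
# the last piece holds whatever remains of the CU.
piece_sizes = [b - a for a, b in zip(poffs, poffs[1:])]
piece_sizes.append(cu_total_length - poffs[-1])

print(piece_sizes)                                          # [8016, 8016, 4111]
print(head_cu_bytes + sum(piece_sizes) == cu_total_length)  # True
```

The 8016 bytes held by each full chained piece lines up with the tl: 8016 of the head row piece, which is presumably no coincidence: the head piece carries the first 7964 CU bytes plus the piece metadata shown above.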
And now for the update, which is a simple "update t1_wp set spcol = 'UPDATED' where id between 1 and 10" affecting exactly 10 rows.
Checking the ROWIDs again:
SQL> select rowid, dbms_compression.get_compression_type(user, 'T1_WP', rowid) compressionType
     from t1_wp where id between 1 and 10;

ROWID              COMPRESSIONTYPE
------------------ ---------------
AAAO7sAALAABAfeAAA               1
AAAO7sAALAABAfeAAB               1
AAAO7sAALAABAfeAAC               1
AAAO7sAALAABAfeAAD               1
AAAO7sAALAABAfeAAE               1
AAAO7sAALAABAfeAAF               1
AAAO7sAALAABAfeAAG               1
AAAO7sAALAABAfeAAH               1
AAAO7sAALAABAfeAAI               1
AAAO7sAALAABAfeAAJ               1

10 rows selected.
So that's interesting: the ROWIDs have changed, and so has the compression algorithm. What if I commit?
SQL> commit;

Commit complete.

SQL> select rowid, dbms_compression.get_compression_type(user, 'T1_WP', rowid) compressionType
     from t1_wp where id between 1 and 10;

ROWID              COMPRESSIONTYPE
------------------ ---------------
AAAO7sAALAABAfeAAA              64
AAAO7sAALAABAfeAAB              64
AAAO7sAALAABAfeAAC              64
AAAO7sAALAABAfeAAD              64
AAAO7sAALAABAfeAAE              64
AAAO7sAALAABAfeAAF              64
AAAO7sAALAABAfeAAG              64
AAAO7sAALAABAfeAAH              64
AAAO7sAALAABAfeAAI              64
AAAO7sAALAABAfeAAJ              64

10 rows selected.
There it is: the new, mysterious compression type 64. To my knowledge it is not documented in 11.2, but it is in 12.1, as "DBMS_COMPRESSION.COMP_BLOCK". What does it look like? Let's pick the ROWID for id = 10 (AAAO7sAALAABAfeAAJ):
SQL> select id from t1_wp where rowid = 'AAAO7sAALAABAfeAAJ';

        ID
----------
        10

SQL> alter system dump datafile 7 block 46401502;

Block header dump:  0x02c407de
 Object id on Block? Y
 seg/obj: 0xeeec  csc: 0x99a.53575da9  itc: 2  flg: E  typ: 1 - DATA
     brn: 0  bdba: 0x2c40781 ver: 0x01 opc: 0
     inc: 0  exflg: 0

 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x000a.008.00038f01  0x00000f5d.720e.0a  --U-   15  fsc 0x0000.53576117
0x02   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
bdba: 0x02c407de
data_block_dump,data header at 0x7fbc909ea264
===============
tsiz: 0x1f98
hsiz: 0x4c
pbl: 0x7fbc909ea264
     76543210
flag=-0----X-
ntab=2
nrow=20
frre=-1
fsbo=0x4c
fseo=0xbe
avsp=0x72
tosp=0x72
        r0_9ir2=0x1  mec_kdbh9ir2=0x1
                      76543210
        shcf_kdbh9ir2=----------
                  76543210
        flag_9ir2=--R-LNOC
        Archive compression: N
                fcls_9ir2[0]={ }
        perm_9ir2[6]={ 2 3 4 5 0 1 }
0x1c:pti[0] nrow=5 offs=0
0x20:pti[1] nrow=15 offs=5
0x24:pri[0] offs=0x1f5c
0x26:pri[1] offs=0x1f70
...
block_row_dump:
tab 0, row 0, @0x1f5c
tl: 20 fb: --H-FL-- lb: 0x0  cc: 2
col  0: [ 8]  54 48 45 20 52 45 53 54
col  1: [ 7]  55 50 44 41 54 45 44
bindmp: 00 0d 02 d0 54 48 45 20 52 45 53 54 cf 55 50 44 41 54 45 44
tab 0, row 1, @0x1f70
tl: 10 fb: --H-FL-- lb: 0x0  cc: 1
col  0: [ 7]  78 71 0b 0d 01 01 04
bindmp: 00 03 cf 78 71 0b 0d 01 01 04
...
tab 0, row 4, @0x1f8e
tl: 10 fb: --H-FL-- lb: 0x0  cc: 1
col  0: [ 7]  78 71 0b 0d 01 01 01
bindmp: 00 03 cf 78 71 0b 0d 01 01 01
tab 1, row 0, @0x1d55
tl: 519 fb: --H-FL-- lb: 0x1  cc: 6
col  0: [ 8]  54 48 45 20 52 45 53 54
col  1: [ 7]  55 50 44 41 54 45 44
col  2: [ 2]  c1 02
col  3: [500]
 31 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
 ...
col  4: [ 7]  78 71 0b 0d 01 01 01
col  5: [ 7]  78 71 0b 0d 0b 09 01
bindmp: 2c 01 05 00 ca c1 02 fa 01 f4 31 20 20 20 ... 20 20 20 04 cf 78 71 0b 0d 0b 09 01
tab 1, row 1, @0x1b4e
....
I have not come to a conclusion about this yet. There are similarities to the BASIC compression block dump in that there are two tables (ntab=2), and the first one looks a bit like a symbol table to me. It also uses the familiar perm_9ir2 field to let us know it changed the column order. The flag is new though, or at least to me: the only reference I could find to flag=-0----X- was in the bug database, interestingly related to an OLTP-compressed block. The tl of the first row in tab 1 does not compute either, something I have only seen with BASIC/OLTP compressed data blocks.
The problem is that I can't apply the interpretation of BASIC compression to the bindmp. Take the bindmp of tab 1, row 0:
bindmp: 2c 01 05 00 ca c1 02 fa 01 f4 31 20 20 20 20 20 20 20 20 20 20 20 20…
Going by the rules I know, the first byte is the flag byte (2c = --H-FL--), followed by the lock byte (01) and then the number of columns stored at this location (05); after that, the way of interpreting the bindmp for OLTP/BASIC compressed data falls apart. What is interesting, though, is that there seems to be a pattern:
> grep bindmp orcl_ora_30181.trc | cut -c 1-64
bindmp: 00 0d 02 d0 54 48 45 20 52 45 53 54 cf 55 50 44 41 54 45
bindmp: 00 03 cf 78 71 0b 0d 01 01 04
bindmp: 00 03 cf 78 71 0b 0d 01 01 03
bindmp: 00 03 cf 78 71 0b 0d 01 01 02
bindmp: 00 03 cf 78 71 0b 0d 01 01 01
bindmp: 2c 01 05 00 ca c1 02 fa 01 f4 31 20 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 03 fa 01 f4 32 20 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 04 fa 01 f4 33 20 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 05 fa 01 f4 34 20 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 06 fa 01 f4 35 20 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 07 fa 01 f4 36 20 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 08 fa 01 f4 37 20 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 09 fa 01 f4 38 20 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 0a fa 01 f4 39 20 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 0b fa 01 f4 31 30 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 0c fa 01 f4 31 31 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 0d fa 01 f4 31 32 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 0e fa 01 f4 31 33 20 20 20 20 20 20 20
bindmp: 2c 01 06 d0 54 48 45 20 52 45 53 54 cf 55 50 44 41 54 45
bindmp: 2c 01 06 d0 54 48 45 20 52 45 53 54 cf 55 50 44 41 54 45
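One part of the pattern does decode in the familiar way: the first byte of each tab 1 bindmp is the row flag byte, where each bit maps to one position of the fb string (K, C, H, D, F, L, P, N from the high bit down), so 0x2c sets H, F and L. A small sketch:

```python
# Decode an Oracle row flag byte into the "fb:" string seen in block
# dumps. Bits, high to low: K (cluster key), C (cluster table member),
# H (head of row piece), D (deleted), F (first piece), L (last piece),
# P (first column continues from previous piece), N (last column
# continues in next piece).
FLAGS = "KCHDFLPN"

def decode_fb(flag_byte):
    return "".join(
        f if flag_byte & (0x80 >> i) else "-"
        for i, f in enumerate(FLAGS)
    )

print(decode_fb(0x2c))  # --H-FL--, the flag of the tab 1 rows above
print(decode_fb(0x29))  # --H-F--N, the chained HCC head piece earlier
```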
In another part of this article I will try to work out what exactly is represented by the bindmp.
Responses
Hi Martin,
Compression type 64 is actually mentioned in the Oracle documentation nowadays:
http://docs.oracle.com/cd/E11882_01/appdev.112/e40758/d_compress.htm#ARPLS72966
Oracle simply calls it "Compression", not that that is of any help :-)
Klaas-jan
Martin,
I thought I’d published some details on how to read the OLTP-compressed row, but like the CU notes they were incomplete.
2c 01 05 00 ca c1 02 fa 01 f4 31 20 20 20 20 20 20 … 20 20 20 20 20 20 20 20 04 cf 78 71 0b 0d 0b 09 01
From the 4th byte onwards
0 — use token zero (any value up to 250 will be a token id)
ca — next column is a real column of length 2 (lengths are 200 + the actual length)
c1 02 — value = 1
fa 01 f4 — next column is a real column of length 500 (lengths over 249 (?) use 3 bytes, starting with fa)
31 20 … 20 — column value
04 — use token 4
cf — next column is a real column of length 7 (0xcf = 207)
78 71 0b 0d 0b 09 01 — value
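Jonathan's rules lend themselves to a mechanical walk over the column list. The sketch below follows them literally; the tie-breaking (treating bytes 0xc8 to 0xf9 as single-byte lengths, 0xfa as the three-byte length marker, and anything smaller as a token id) is my assumption, since a byte like 0xca could in principle also be a token id. The 500-byte t_pad value is shortened to three bytes here to keep the example readable.

```python
# Walk the column-list portion of a bindmp following the rules in the
# comment above. Hypothetical simplification: 0xc8-0xf9 = single-byte
# length (200 + n), 0xfa = two-byte big-endian length follows, and any
# smaller byte is a token reference into the symbol table (tab 0).
def walk_columns(data, ncols):
    """Return ('token', id) or ('value', bytes) for ncols columns."""
    out, pos = [], 0
    for _ in range(ncols):
        b = data[pos]
        if b == 0xfa:                      # three-byte length marker
            length = int.from_bytes(data[pos + 1:pos + 3], "big")
            pos += 3
            out.append(("value", data[pos:pos + length]))
            pos += length
        elif 0xc8 <= b <= 0xf9:            # single-byte length, 200 + n
            length = b - 200
            pos += 1
            out.append(("value", data[pos:pos + length]))
            pos += length
        else:                              # token reference
            out.append(("token", b))
            pos += 1
    return out

# The bindmp of tab 1, row 0 from the 4th byte onwards, with the
# 500-byte value replaced by a pretend 3-byte value (fa 00 03):
row = bytes([0x00,
             0xca, 0xc1, 0x02,
             0xfa, 0x00, 0x03, 0x31, 0x20, 0x20,
             0x04,
             0xcf, 0x78, 0x71, 0x0b, 0x0d, 0x0b, 0x09, 0x01])
for kind, payload in walk_columns(row, 5):
    print(kind, payload)
```

The five decoded entries (token, value, value, token, value) match the column count of 05 in the row header.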
Thanks Jonathan!
We thought it was some kind of OLTP-compressed block. I wonder where the difference lies between this compression type and the "BASIC/OLTP" compression type, since they are reported differently by dbms_compression.get_compression_type. Maybe this new type has been introduced to better indicate that the rows in it were initially stored in a CU? There are some flags worth looking at (flag=-0----X-, r0_9ir2=0x1, flag_9ir2) that might give us a clue. I will update the post should I find something.
And again thanks for clarifying on how to read the bindmp.
Martin
Hey Martin
I can’t offer a technical explanation but I do have a theory which is more based on the commercials.
Oracle Storage users are allowed to use Hybrid Columnar Compression. This is not a separate license, but rather a (generic) feature for which permission is granted if you use Exadata, SuperCluster, ZFS Storage Appliance, etc. The important fact is that you do not need to purchase the Advanced Compression Option license at $11,500 list price per licensable core.
I think it’s a potential issue for Oracle that any update on an HCC row results in some of the table being left in a state of OLTP compression, which is actually a technology that requires the ACO license. Until now, Oracle had no way of knowing if a customer was using OLTP without the license or simply using HCC for which no license is required.
It therefore seems a sensible hypothesis that the new compression type of 64 is actually a differentiator so that Oracle can correctly audit customers' use of compression...?
-flash
Martin,
A correction to my note: I suggested 249 as the greatest length before Oracle switches to 3 bytes for the length; I forgot to allow for the 200 which is the starting point anyway, so the switch comes at 50 bytes (49 bytes => f9, 50 bytes => fa 00 32).
Looking forward to any further details you uncover.
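Taking Jonathan's correction into account, the length encoding works out as: actual lengths 0 to 49 are stored in a single byte of 200 + length (0xc8 to 0xf9), and anything from 50 bytes up uses 0xfa followed by the length in two big-endian bytes. A quick sketch:

```python
# Encode a column length following the corrected rule: lengths 0-49
# are stored as a single byte of 200 + length (0xc8-0xf9); lengths of
# 50 and above use 0xfa followed by two big-endian length bytes.
def encode_length(n):
    if n < 50:
        return bytes([200 + n])
    return bytes([0xfa]) + n.to_bytes(2, "big")

print(encode_length(7).hex())    # cf     -- the date columns in the dump
print(encode_length(49).hex())   # f9     -- last single-byte length
print(encode_length(50).hex())   # fa0032 -- first three-byte length
print(encode_length(500).hex())  # fa01f4 -- the 500-byte t_pad column
```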