Some time ago Oracle announced that RDBMS 12.1.0.2 has built-in support for JSON processing. A little later it was also mentioned that the Exadata storage servers support JSON for offloading. This is probably a lot more exciting to users of JSON than it is to me, as I'm not a developer. However, whenever an announcement such as this is made I like to see for myself how much of it is actually implemented in software. Like I said, I'm not a developer, so apologies for a silly example: what I'm showing you here can probably be done differently and is not the best use of an Exadata. But all I really wanted to test is whether JSON support actually exists. I am using cellsrv 12.1.2.1.0 and RDBMS 12.1.0.2.2 for this test.
JSON
I have to say I struggled a little to understand the use case for JSON, so I did what probably everyone does and consulted the official documentation and oracle-base.com for Tim's views on JSON. Here's a summary of links I found useful to get started:
- http://oracle-base.com/articles/12c/json-support-in-oracle-database-12cr1.php
- http://www.w3schools.com/json/json_intro.asp
- http://docs.oracle.com/database/121/NEWFT/chapter12102.htm#BGBGADCC
The Test
OK, so that was enough to get me started. I needed data, and a table to store it in. It appeared to me that an Apache log could be a useful source of JSON records, so I converted my webserver's log file to JSON using libee0 on OpenSuSE (yes I know, but it's a great virtualisation platform). The converted file was named httpd_access_log.json and had records such as these:
{"host": "192.168.100.1", "identity": "", "user": "", "date": "05/Feb/2015:12:13:05 +0100", "request": "HEAD /ol70 HTTP/1.1", "status": "404", "size": "", "f1": "", "useragent": "Python-urllib/2.7"}
{"host": "192.168.100.1", "identity": "", "user": "", "date": "05/Feb/2015:12:13:25 +0100", "request": "GET / HTTP/1.1", "status": "403", "size": "989", "f1": "", "useragent": "Mozilla/5.0 (X11; Linux x86_64; rv:35.0) Gecko/20100101 Firefox/35.0"}
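If you don't have libee0 at hand, a few lines of Python can produce records in the same shape. This is only a rough sketch of what the conversion does: the regular expression and the field names are my own approximation modelled on the sample records above, not libee0's actual implementation.

```python
import json
import re

# crude combined-log-format parser; the regex and field names approximate
# the sample records above, they are not libee0's actual implementation
LOG_RE = re.compile(
    r'(?P<host>\S+) (?P<identity>\S+) (?P<user>\S+) \[(?P<date>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d+) (?P<size>\S+) "[^"]*" "(?P<useragent>[^"]*)"'
)

def to_json_record(line):
    m = LOG_RE.match(line)
    if m is None:
        return None
    # mirror the sample output: "-" becomes an empty string
    return json.dumps({k: ('' if v == '-' else v)
                       for k, v in m.groupdict().items()})

line = ('192.168.100.1 - - [05/Feb/2015:12:13:25 +0100] '
        '"GET / HTTP/1.1" 403 989 "-" "Mozilla/5.0"')
print(to_json_record(line))
```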
Sorry for the wide output; it's an Apache log…
I then created the table to store the data. JSON appears to be pretty unstructured, so this will do:
SQL> create table jsontest (id number,
  2  jdata clob,
  3  constraint jsontest check (jdata is json)
  4  )
  5  lob (jdata) store as securefile (
  6  enable storage in row
  7  );

SQL> create sequence s_jsontest;

Sequence created.
If you look closely you'll see that the JSON data is stored in an inline CLOB; that's one of the prerequisites for offloading LOBs in 12c.
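"Inline" means the LOB data lives inside the table row rather than in a separate LOB segment, which only works for small values; roughly 4,000 bytes is the in-row limit for a SecureFile LOB (the exact figure includes some LOB overhead, so treat it as a rough bound). A quick sanity check over the converted file confirms the log records are comfortably below that:

```python
# sanity check that the JSON records are small enough to stay in-row;
# ~4000 bytes is the approximate inline limit for a SecureFile LOB
INLINE_LIMIT = 4000

records = [
    '{"host": "192.168.100.1", "identity": "", "user": "", '
    '"date": "05/Feb/2015:12:13:05 +0100", "request": "HEAD /ol70 HTTP/1.1", '
    '"status": "404", "size": "", "f1": "", "useragent": "Python-urllib/2.7"}',
]

oversized = [r for r in records if len(r.encode('utf-8')) > INLINE_LIMIT]
print(f"{len(oversized)} of {len(records)} records would be stored out of line")
```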
Loading JSON
Now I needed a way to get the data into the table. I think I could have used SQL*Loader, but since I have rusty Perl scripting skills I gave DBD::Oracle on 12.1.0.2 a go. The following script inserts records slow-by-slow, or row-by-row, into the table and is probably not the most efficient way to do this. But one of the reasons I blog is so that I don't have to remember everything. If you ever wondered how to write a DBI/DBD::Oracle script, here's a starting point. Note the emphasis on "starting point": the script has been trimmed for readability and essential error checking is not shown. Whenever you work with data, make sure that your error handling is top-notch!
#!/usr/bin/perl

use strict;
use warnings;

use DBI;
use DBD::Oracle;
use Getopt::Long;

# these will be set by GetOpt::Long
my $service;    # has to be in tnsnames.ora
my $username;
my $jsonfile;

GetOptions (
    "service=s"  => \$service,
    "user=s"     => \$username,
    "jsonfile=s" => \$jsonfile
);

die "usage: load_json.pl --service <servicename> --jsonfile <filename> [--user username]"
    if (!defined($service) || !defined($jsonfile));

die "$jsonfile is not a file" unless ( -f $jsonfile );

print "connecting to service $service as user $username to load file $jsonfile\n";

# about to start...
my $dbh = DBI->connect("dbi:Oracle:$service", "$username", "someCleverPasswordOrCatName")
    or die ("Cannot connect to service $service: $DBI::errstr!");

print "connection to the database established, trying to load data...\n";

# prepare a cursor to loop over all entries in the file
my $sth = $dbh->prepare(q{
    insert into jsontest (id, jdata) values (s_jsontest.nextval, :json)
});

if (! open JSON, "$jsonfile") {
    print "cannot open $jsonfile: $!\n";
    $dbh->disconnect();
    die "Cannot continue\n";
}

while (<JSON>) {
    chomp;
    $sth->bind_param(":json", $_);
    $sth->execute();
}

$dbh->disconnect();
close JSON;

print "done\n";
This script reads the file and inserts all the data into the table. Again, essential error checking must be added; the script is far from complete. You also need to point the Perl environment variables at the Perl installation in $ORACLE_HOME for the script to find the DBI and DBD::Oracle drivers.
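Setting up that environment looks roughly like this. The exact paths are an assumption based on a typical single-instance 12c installation; adjust ORACLE_HOME to match your system.

```shell
# point the shell at the Perl interpreter and modules shipped with the
# database home; ORACLE_HOME below is an example path, adjust to your install
export ORACLE_HOME=/u01/app/oracle/product/12.1.0.2/dbhome_1
export PATH=$ORACLE_HOME/perl/bin:$PATH
export PERL5LIB=$ORACLE_HOME/perl/lib/site_perl:$ORACLE_HOME/perl/lib
echo "PERL5LIB=$PERL5LIB"
```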
Offloading or not?
It turned out that the data I inserted was of course not enough to trigger a direct path read that could turn into a Smart Scan. A little inflation of the table was needed. Once that was done I started to get my feet wet with JSON queries:
SQL> select jdata from jsontest where rownum < 6;

JDATA
--------------------------------------------------------------------------------
{"host": "192.168.100.1", "identity": "", "user": "", "date": "05/Feb/2015:12:26
{"host": "192.168.100.156", "identity": "", "user": "", "date": "05/Feb/2015:12:
{"host": "192.168.100.156", "identity": "", "user": "", "date": "05/Feb/2015:12:
{"host": "192.168.100.156", "identity": "", "user": "", "date": "05/Feb/2015:12:
{"host": "192.168.100.156", "identity": "", "user": "", "date": "05/Feb/2015:12:
Interesting. Here are a few more examples with my data set. Again, refer to oracle-base.com and the official documentation for more information about JSON and querying it in the database. It's by no means an Exadata-only feature.
SQL> select count(*) from jsontest where json_exists(jsontest.jdata, '$.host' false on error);

  COUNT(*)
----------
   2195968

SQL> select count(*) from jsontest where not json_exists(jsontest.jdata, '$.host' false on error);

  COUNT(*)
----------
         0
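In other words, every record in the table has a top-level "host" key. For readers new to JSON path expressions, the semantics of JSON_EXISTS with the FALSE ON ERROR clause can be sketched in plain Python; this is my own illustrative analogue, not how the database implements it:

```python
import json

def json_exists_host(doc):
    """Rough analogue of JSON_EXISTS(doc, '$.host' FALSE ON ERROR):
    True if the document parses and has a top-level "host" key,
    False if it cannot be parsed (the FALSE ON ERROR part)."""
    try:
        return 'host' in json.loads(doc)
    except ValueError:
        return False

rows = [
    '{"host": "192.168.100.1", "status": "404"}',
    '{"status": "403"}',
    'not json at all',
]
print(sum(json_exists_host(r) for r in rows))  # → 1
```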
And finally, here is proof that you can offload JSON processing in Exadata; at least some of the operations should be offloadable, judging by the information in v$sqlfn_metadata:
SQL> select name,offloadable from v$sqlfn_metadata where name like '%JSON%'
  2  order by offloadable,name;

NAME                                               OFF
-------------------------------------------------- ---
JSON_ARRAY                                         NO
JSON_ARRAYAGG                                      NO
JSON_EQUAL                                         NO
JSON_OBJECT                                        NO
JSON_OBJECTAGG                                     NO
JSON_QUERY                                         NO
JSON_SERIALIZE                                     NO
JSON_TEXTCONTAINS2                                 NO
JSON_VALUE                                         NO
JSON                                               YES
JSON                                               YES
JSON_EXISTS                                        YES
JSON_QUERY                                         YES
JSON_VALUE                                         YES

14 rows selected.
The two entries named “JSON” are most likely “is JSON” and “is not JSON”.
And now with real data volumes on a real system using JSON_EXISTS:
SQL> select /*+ monitor am_I_offloaded */ count(*)
  2  from jsontest where json_exists(jsontest.jdata, '$.host' false on error);

  COUNT(*)
-----------
   2195968

Elapsed: 00:00:04.96

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------
SQL_ID  6j73xcww7hmcw, child number 0
-------------------------------------
select /*+ monitor am_I_offloaded */ count(*) from jsontest where
json_exists(jsontest.jdata, '$.host' false on error)

Plan hash value: 568818393

---------------------------------------------------------------------------------------
| Id  | Operation                  | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |          |       |       | 91078 (100)|          |
|   1 |  SORT AGGREGATE            |          |     1 |   610 |            |          |
|*  2 |   TABLE ACCESS STORAGE FULL| JSONTEST | 21960 |    12M| 91078   (1)| 00:00:04 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - storage(JSON_EXISTS2("JSONTEST"."JDATA" FORMAT JSON , '$.host' FALSE ON ERROR)=1)
       filter(JSON_EXISTS2("JSONTEST"."JDATA" FORMAT JSON , '$.host' FALSE ON ERROR)=1)
So the execution plan looks promising: I can see "TABLE ACCESS STORAGE FULL" and a storage() predicate. Looking at V$SQL I get:
SQL> select sql_id, child_number,
  2    case when io_cell_offload_eligible_bytes = 0 then 'NO' else 'YES' end offloaded,
  3    io_cell_offload_eligible_bytes/power(1024,2) offload_eligible_mb,
  4    io_interconnect_bytes/power(1024,2) interconnect_mb,
  5    io_cell_offload_returned_bytes/power(1024,2) returned_mb,
  6    io_cell_offload_returned_bytes/io_cell_offload_eligible_bytes*100 offload_pct
  7  from v$sql where sql_id = '6j73xcww7hmcw';

SQL_ID        CHILD_NUMBER OFF OFFLOAD_ELIGIBLE_MB INTERCONNECT_MB RETURNED_MB OFFLOAD_PCT
------------- ------------ --- ------------------- --------------- ----------- -----------
6j73xcww7hmcw            0 YES         2606.695313     1191.731941 1191.724129 45.71781455
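To make the OFFLOAD_PCT column concrete, here is the same arithmetic the query performs, using the megabyte figures reported above. The cells shipped back only about 46% of the eligible data; the rest was filtered out on the storage servers, which is exactly what offloading is supposed to buy you.

```python
# reproduce the OFFLOAD_PCT arithmetic from the V$SQL query above,
# using the figures reported for sql_id 6j73xcww7hmcw
eligible_mb = 2606.695313  # io_cell_offload_eligible_bytes / 1024^2
returned_mb = 1191.724129  # io_cell_offload_returned_bytes / 1024^2

# percentage of eligible bytes actually returned over the interconnect
offload_pct = returned_mb / eligible_mb * 100
print(f"{offload_pct:.2f}% of eligible bytes returned")  # → 45.72%
```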
And to avoid any doubt, I have the SQL Trace as well:
PARSING IN CURSOR #140370400430072 len=120 dep=0 uid=65 oct=3 lid=65 tim=1784977054418 hv=1750582781 ad='5bfcebed8' sqlid='bfwd4t5n5gjgx'
select /*+ monitor am_I_offloaded */ count(*) from jsontest where json_exists(jsontest.jdata, '$.host' false on error)
END OF STMT
PARSE #140370400430072:c=103984,e=272239,p=909,cr=968,cu=0,mis=1,r=0,dep=0,og=1,plh=568818393,tim=1784977054417
EXEC #140370400430072:c=0,e=105,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=568818393,tim=1784977054587
WAIT #140370400430072: nam='SQL*Net message to client' ela= 4 driver id=1650815232 #bytes=1 p3=0 obj#=96305 tim=1784977054666
WAIT #140370400430072: nam='reliable message' ela= 826 channel context=27059892880 channel handle=27196561216 broadcast message=26855409216 obj#=96305 tim=1784977055727
WAIT #140370400430072: nam='enq: KO - fast object checkpoint' ela= 159 name|mode=1263468550 2=65629 0=1 obj#=96305 tim=1784977055942
WAIT #140370400430072: nam='enq: KO - fast object checkpoint' ela= 229 name|mode=1263468545 2=65629 0=2 obj#=96305 tim=1784977056265
WAIT #140370400430072: nam='cell smart table scan' ela= 196 cellhash#=3249924569 p2=0 p3=0 obj#=96298 tim=1784977057370
WAIT #140370400430072: nam='cell smart table scan' ela= 171 cellhash#=822451848 p2=0 p3=0 obj#=96298 tim=1784977057884
WAIT #140370400430072: nam='cell smart table scan' ela= 188 cellhash#=674246789 p2=0 p3=0 obj#=96298 tim=1784977058461
WAIT #140370400430072: nam='cell smart table scan' ela= 321 cellhash#=3249924569 p2=0 p3=0 obj#=96298 tim=1784977061623
WAIT #140370400430072: nam='cell smart table scan' ela= 224 cellhash#=822451848 p2=0 p3=0 obj#=96298 tim=1784977062053
WAIT #140370400430072: nam='cell smart table scan' ela= 254 cellhash#=674246789 p2=0 p3=0 obj#=96298 tim=1784977062487
WAIT #140370400430072: nam='cell smart table scan' ela= 7 cellhash#=3249924569 p2=0 p3=0 obj#=96298 tim=1784977062969
WAIT #140370400430072: nam='cell smart table scan' ela= 25 cellhash#=822451848 p2=0 p3=0 obj#=96298 tim=1784977063016
WAIT #140370400430072: nam='cell smart table scan' ela= 81 cellhash#=674246789 p2=0 p3=0 obj#=96298 tim=1784977063115
WAIT #140370400430072: nam='cell smart table scan' ela= 1134 cellhash#=674246789 p2=0 p3=0 obj#=96298 tim=1784977065442
WAIT #140370400430072: nam='cell smart table scan' ela= 6 cellhash#=3249924569 p2=0 p3=0 obj#=96298 tim=1784977065883
WAIT #140370400430072: nam='cell smart table scan' ela= 14 cellhash#=822451848 p2=0 p3=0 obj#=96298 tim=1784977065917
WAIT #140370400430072: nam='cell smart table scan' ela= 105 cellhash#=674246789 p2=0 p3=0 obj#=96298 tim=1784977066037
WAIT #140370400430072: nam='cell smart table scan' ela= 12 cellhash#=674246789 p2=0 p3=0 obj#=96298 tim=1784977066207
WAIT #140370400430072: nam='cell smart table scan' ela= 6605 cellhash#=3249924569 p2=0 p3=0 obj#=96298 tim=1784977072866
WAIT #140370400430072: nam='cell smart table scan' ela= 27 cellhash#=674246789 p2=0 p3=0 obj#=96298 tim=1784977073877
WAIT #140370400430072: nam='cell smart table scan' ela= 29 cellhash#=3249924569 p2=0 p3=0 obj#=96298 tim=1784977074903
WAIT #140370400430072: nam='cell smart table scan' ela= 907 cellhash#=674246789 p2=0 p3=0 obj#=96298 tim=1784977077783
WAIT #140370400430072: nam='cell smart table scan' ela= 28 cellhash#=674246789 p2=0 p3=0 obj#=96298 tim=1784977078753
WAIT #140370400430072: nam='cell smart table scan' ela= 24 cellhash#=3249924569 p2=0 p3=0 obj#=96298 tim=1784977080860
WAIT #140370400430072: nam='cell smart table scan' ela= 1077 cellhash#=674246789 p2=0 p3=0 obj#=96298 tim=1784977082935
...
Summary
So yes, it would appear that JSON processing is indeed offloaded.