Martin's Blog

Trying to explain complex things in simple terms

UKOUG Conference-session review part 3

Posted by Martin Bach on December 4, 2009

The final installment of this series deals with Tom Kyte’s top 10 features in 11.2 and Wolfgang Breitling’s “seeding statistics”.

Tom Kyte – Top 10 new 11.2 features

1) dbms_parallel_execute:

I was a little late so didn’t get the essence of this. It seems to automate what Tom described as “poor man’s parallelism” in his Effective Oracle by Design book: take a huge table, instruct Oracle to subdivide it into non-overlapping rowid ranges, and then give it a command to work on these ranges in parallel. All of this was possible before, but required manual scripting; now it should happen without that hassle. Just bear in mind that this turns a single atomic update into a number of smaller transactions that commit independently. Need to roll the whole thing back? Restore or flashback database.
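A minimal sketch of what the calls look like, assuming a hypothetical table BIG_T and update statement (the chunking routine and the :start_id/:end_id binds are part of the documented API):

begin
  dbms_parallel_execute.create_task(task_name => 'UPDATE_BIG_T');

  -- carve the table into non-overlapping rowid ranges
  -- (chunk_size is in blocks when by_row is false)
  dbms_parallel_execute.create_chunks_by_rowid(
    task_name   => 'UPDATE_BIG_T',
    table_owner => user,
    table_name  => 'BIG_T',
    by_row      => false,
    chunk_size  => 10000);

  -- each chunk is processed as its own transaction
  dbms_parallel_execute.run_task(
    task_name      => 'UPDATE_BIG_T',
    sql_stmt       => 'update big_t set amount = amount * 1.1 ' ||
                      'where rowid between :start_id and :end_id',
    language_flag  => dbms_sql.native,
    parallel_level => 4);

  dbms_parallel_execute.drop_task(task_name => 'UPDATE_BIG_T');
end;
/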

2) More analytics:

listagg: create a list with whatever separator you like
nth_value: return the n-th value of a window, using the same over (partition by …) syntax as existing analytics such as max(cola); see the sketch below
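Two quick illustrations of my own against the classic SCOTT.EMP demo table:

select deptno,
       listagg(ename, ', ') within group (order by ename) as dept_members
  from emp
 group by deptno;

select ename, sal,
       nth_value(sal, 2) over (
         partition by deptno order by sal desc
         rows between unbounded preceding and unbounded following
       ) as second_highest_sal
  from emp;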

3) Execute privilege on directory

Allows us to execute code (with the external table preprocessor) in a directory.
It wasn’t too clear to me whether we need to get the output from the preprocessor to standard out.
This leaves a number of questions, most importantly: what about security? The preprocessor executes anything with the privileges of the oracle user. What about rm -rf $ORACLE_HOME if some malicious code replaces your own script?
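For illustration, here is roughly what the documented preprocessor clause looks like; the directory objects, the table and the gunzip script are hypothetical names of mine, and the script has to write its result to standard out for the access driver to pick it up:

-- the script lives in a directory on which the DBA granted execute
create directory exec_dir as '/u01/app/oracle/preproc';
grant read, execute on directory exec_dir to app_owner;

create table ext_sales (
  sale_id number,
  amount  number
)
organization external (
  type oracle_loader
  default directory data_dir
  access parameters (
    records delimited by newline
    preprocessor exec_dir:'gunzip_to_stdout.sh'
    fields terminated by ','
  )
  location ('sales.csv.gz')
)
reject limit unlimited;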

4) Recursive subquery factoring

As Tom said: recursive subquery factoring is easier to understand if you don’t know connect by yet. Otherwise the whole syntax is confusing at first, which I have to admit, and it doesn’t look any more elegant either. I personally like connect by more, even though I haven’t been using it for all of the 22 years it has apparently been in the database (since version 2 of the kernel). The syntax of the new recursive subquery factoring can be found in the SQL Language Reference as part of the SELECT statement.

With the new syntax, you start with the first (root) element, called the anchor member. Once you have it, you add the recursive members, as in the sketch below. Main use: it’s ANSI compatible, so you don’t need to change the syntax for compatibility with DB2 and Sybase. I don’t know why this is hyped, but never mind.
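A minimal example of the new syntax against the classic EMP table, assuming MGR is null for the root of the hierarchy:

with org (empno, ename, mgr, lvl) as (
  -- anchor member: the root row(s)
  select empno, ename, mgr, 1
    from emp
   where mgr is null
  union all
  -- recursive member: join each employee to the previous level
  select e.empno, e.ename, e.mgr, o.lvl + 1
    from emp e
    join org o on o.empno = e.mgr
)
select lpad(' ', 2 * (lvl - 1)) || ename as hierarchy
  from org;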

5) Improved time travel – flashback query

A retrospective view on flashback query, which uses undo to get read-consistent views on data. Reflecting on this, the 11.1 Flashback Data Archive didn’t really make practical sense, since FDA wouldn’t allow you to modify the table with operations that don’t generate undo, such as adding a column. I’d still like to find out how much overhead it generates; the background process is fbda. And what about query performance?

Anyway, in 11.2 this feature has been beefed up and allows a lot more management operations on a table in a flashback data archive. We’ll see if that is any good; I’m sure that one critical operation we need won’t be supported.
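For reference, the basic moving parts look like this (the tablespace and table names are made up; the as-of query syntax itself predates FDA):

create flashback archive fda_one
  tablespace fda_ts
  retention 1 year;

alter table orders flashback archive fda_one;

-- historical queries are served from the archive once the undo has aged out
select order_id, status
  from orders as of timestamp systimestamp - interval '7' day;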

6) File watchers

An event will be generated if something happens to a directory (such as a new file being stored). You can make use of file watchers even outside the oracle account (which is a _very_ good idea!). The whole concept works with the help of the scheduler and looks quite nice, although I remember somebody on the oracle-l list having frustrating problems with the feature.
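A sketch of the scheduler plumbing involved; the credential, directory, watcher and the job’s stored procedure are all hypothetical names:

begin
  -- OS credential the watcher uses to inspect the directory
  dbms_scheduler.create_credential(
    credential_name => 'watch_cred',
    username        => 'loader',
    password        => 'secret');

  dbms_scheduler.create_file_watcher(
    file_watcher_name => 'incoming_watcher',
    directory_path    => '/u01/incoming',
    file_name         => '*.csv',
    credential_name   => 'watch_cred');

  -- event-based job: runs whenever the watcher raises an event
  dbms_scheduler.create_job(
    job_name   => 'load_incoming',
    job_type   => 'stored_procedure',
    job_action => 'load_pkg.process_new_file',
    queue_spec => 'incoming_watcher',
    enabled    => true);
end;
/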

7) Deferred Segment Creation

I remember Nuno Souto complaining about all the Peoplesoft and SAP systems out there that create tens of thousands of objects in the database without ever populating them. Even though storage is cheap, 5000 initial extents of > 64k do waste space.

Now the creation of the initial extent can be deferred until the segment is actually used. Finally SAP and Peoplesoft’s 100000000 (slightly exaggerated) tables won’t waste my space, good! And don’t forget that indexes defined on the table would each also have been allocated an initial extent! How does it work? It’s controlled by the deferred_segment_creation parameter (which, as it turns out, defaults to true in 11.2), and you can switch it on at session or system level:

alter system set deferred_segment_creation = true; -- done

This feature also makes it easy to find out if tables are actually used: a segment with 0 extents won’t contain data. Once an extent has been allocated to a table, it will always keep at least one extent, even if it’s been emptied (0 rows) via truncate. The “alter table move” command also won’t get rid of that minimum extent.
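A quick demonstration of the behaviour; the table name is mine, the segment creation deferred clause is the documented per-table switch:

create table demo_deferred (id number)
segment creation deferred;

-- no segment exists yet
select count(*) from user_segments
 where segment_name = 'DEMO_DEFERRED';   -- returns 0

insert into demo_deferred values (1);
commit;

-- the first insert allocated the initial extent
select count(*) from user_segments
 where segment_name = 'DEMO_DEFERRED';   -- returns 1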

8) Flash Cache

Now that is really cool! The feature is available for Oracle Enterprise Linux and Solaris only. What a surprise! Think of it as an extended buffer cache: blocks aged out of the buffer cache are written to flash disk (SSD), which one would hope to be fast; it’s sort of a level-2 cache. The “true” copy of the block will still be on magnetic disk only: during crash recovery, only the block on disk will be used. Flash cache can speed reads up considerably by getting “warm” blocks from the SSD rather than magnetic disk, and even if that’s not as fast as getting them from physical memory on the server, it could potentially be really fast. As with any memory structure, if you shut the database down, you lose it all. Another reason never to shut a database down :)
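Configuration boils down to two initialisation parameters; the device path and size here are made up:

alter system set db_flash_cache_file = '/dev/flash/dbcache' scope=spfile;
alter system set db_flash_cache_size = 32G scope=spfile;
-- a restart is needed before the flash cache comes into effect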

9) Parallel improvements

The usefulness of parallel query decreases as you add more users to the system. The parallel query feature assumes that very few users (one, actually) make use of all the resources on a machine. This dates back to the time when a server usually had very few CPUs and very little memory compared to today: think of a database size around 100G, 2 CPUs and 128M of memory (early nineties). Now in 11.2 Oracle decided they need to play well with others. So in 11.2, no parallel plan will be generated if the estimated serial execution time stays below parallel_min_time_threshold, a value the DBA can set. Otherwise the optimiser can come up with a more appropriate DOP.

Parallel statement queuing: prevents users issuing statements with a large DOP from flooding the server. Once the available parallel servers are in use, further parallel queries will be queued.

In-memory parallel execution: remember that up to 11.1, parallel query meant checkpointing the table to disk and using direct IO, bypassing the buffer cache completely. 11.2 can try to fit a small enough table in memory and do it all in memory. If the table is very large there are two options: the old 7.1.6 way (checkpoint to disk, then read), or Oracle might try to partition the table’s blocks across the buffer caches of the individual RAC nodes.
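These behaviours hang off a handful of initialisation parameters; the values below are purely illustrative:

-- automatic DOP, statement queuing and in-memory PX are tied to this setting
alter system set parallel_degree_policy = auto;

-- statements estimated to finish quicker than this (seconds) stay serial
alter system set parallel_min_time_threshold = 10;

-- queuing starts once this many parallel servers are busy
alter system set parallel_servers_target = 64;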

10) Edition based redefinition: in SE and EE!

This feature has probably been discussed to death, but I didn’t know it in detail so I was curious. Tom didn’t explain how it works with tables and indexes, so I assume I’ll have to wait until 12.1 for this to be useful. Or maybe we need to use dbms_redefinition a lot more and not do the switchover to the changed table until we activate the new version of the code. But that requires a lot of testing and doesn’t look uniform. This happened a lot in the past; remember OMF in 9i? It only made sense in conjunction with ASM in 10.1. It also reminds me a bit of the Flashback Data Archive feature, which is a lot more useful now in 11.2.

There are lots of problems with online changes of PL/SQL code: create or replace requires everyone to finish executing the application. Application upgrades require downtime, even today! “Grant on prod has the same effect as create or replace” for a procedure: I couldn’t reproduce this.

From 11.2 on, every database will have a version (edition) called ora$base as baseline, and you can’t avoid it. I couldn’t see any reference to editions in 10046 traces, but my testing was quite limited. Editions are created by the DBAs, and you set the current edition on the session level while testing. In a nutshell, everyone continues to use the production edition while the new edition (version) of the PL/SQL code is deployed. Testers can access the new code without interrupting production use, and once approved, the DBA sets the edition to be active database-wide.
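The lifecycle in a nutshell; the edition, user and procedure names are made up:

-- the DBA creates a new edition as a child of the baseline
create edition release_2 as child of ora$base;
grant use on edition release_2 to app_owner;

-- deploy the new code version; production sessions keep running the old one
alter session set edition = release_2;
create or replace procedure calc_bonus as
begin
  null;  -- new implementation goes here
end;
/

-- once testing is signed off, make it the database-wide default
alter database default edition = release_2;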

Wolfgang Breitling “Seeding statistics”

Last presentation of the day for me, and only a small attendance in hall 9. Wolfgang ranks on the same level as Jonathan Lewis when it comes to the optimiser, so I was quite keen to see him. Interestingly, even after years and years of living in Canada he hasn’t lost his accent, and some of the idioms he used made more sense to me than those of David, who’s a native speaker. Simply apply a literal translation from German to English and you’ll see :)

Wolfgang explained in a number of test cases how to “hack” the statistics of a table to make the optimiser choose the optimal plan. The reason the users don’t simply gather stats on the table(s) involved in the query is simple: it’s a packaged application with a lot of red tape around it.
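Wolfgang’s actual examples aren’t reproduced here, but the general mechanism is the set_*_stats family in dbms_stats; a minimal sketch against a hypothetical APP.ORDERS table:

begin
  -- make the table look bigger (or smaller) to the optimiser
  dbms_stats.set_table_stats(
    ownname => 'APP',
    tabname => 'ORDERS',
    numrows => 1000000,
    numblks => 20000);

  -- seed column-level statistics to steer cardinality estimates
  dbms_stats.set_column_stats(
    ownname => 'APP',
    tabname => 'ORDERS',
    colname => 'STATUS',
    distcnt => 4,
    density => 1/4);
end;
/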

As much as I admired his talk, I have to say that modifying statistics this way needs to be properly documented and shared, otherwise other team members will have a hard time understanding what’s going on.

At this point I unfortunately had to make a run for the train station to catch the 18:30 train to Euston on my 3.5-hour journey back to Brighton. I would have loved to stay for drinks but that wasn’t possible this year. It’s firmly on the calendar for next year!

2 Responses to “UKOUG Conference-session review part 3”

  1. Wolfgang Breitling said

    “I have to say that modifying statistics this way needs to be properly documented and shared, otherwise other team members will have a hard time understanding what’s going on.”

    I fully agree with your comment and I’ll make it a point to stress that in future presentations. Not only do the changes, the reasons for them and, if possible, the reason why (we think) they have the desired effect have to be documented, but the seeding also has to be revisited after upgrades. Not only major upgrades but also point releases and even patches. Case in point: the custom frequency histogram I presented in the 2nd example fails to produce the expected cardinality estimate for the values outside the histogram in 11g or even after an upgrade to 10.2.0.4. The “fix” does work retroactively in 9i and all 10g releases as well.

    • Martin said

      Wolfgang, thanks for passing by! That comment of mine immediately sprang to mind when thinking about a potential implementation in my environment. Not all the staff know the importance of statistics to the optimiser; it’s by no means criticism of the presentation, which was very interesting.
