Martins Blog

Trying to explain complex things in simple terms

Archive for May, 2013

I’m speaking at Enkitec Extreme Exadata Expo E4

Posted by Martin Bach on May 22, 2013

Just a short notice to those interested that I’m very proud to be in the lineup for Enkitec’s Extreme Exadata Expo. The event takes place August 5-6, 2013 and is held in the Four Seasons Resort & Spa, Irving, Texas. There is plenty of time for you to register.

I was really sorry I missed out last year but this time I’m glad to participate and attend!

official site

The list of great speakers includes too many to name here-you should see for yourself about who is coming to Dallas this August and why this event is unmissable.

I’m hoping to see you there!

Posted in Exadata, Public Appearances | Tagged: , , | Leave a Comment »

Linux large pages and non-uniform memory distribution

Posted by Martin Bach on May 17, 2013

In my last post about large pages in 11.2.0.3 I promised a little more background information on how large pages and NUMA are related.

Background and some history about processor architecture

For quite some time now the CPUs you get from AMD and Intel both are NUMA, or better: cache coherent NUMA CPUs. They all have their own “local” memory directly attached to them, in other words the memory distribution is not uniform across all CPUs. This isn’t really new, Sequent has pioneered this concept on x86 a long time ago but that’s in a different context. You really should read Scaling Oracle 8i by James Morle which has a lot of excellent content related to NUMA in it, with contributions from Kevin Closson. It doesn’t matter that it reads “8i” most of it is as relevant today as it was then.

So what is the big deal about NUMA architecture anyway? To explain NUMA and why it is important to all of us a little more background information is on order.

Some time ago processor designers and architects of industry standard hardware could no longer ignore the fact that a front side bus (FSB) proved to be a bottleneck. There were two reasons for this: it was a) too slow and b) too much data had to go over it. As one direct consequence DRAM memory has been directly attached to the CPUs. AMD has done this first with it’s Opteron processors in its AMD64 micro architecture, followed by Intel’s Nehalem micro architecture. By removing the requirement of data retrieved from DRAM to travel across a slow bus latencies could be removed.

Now imagine that every processor has a number of memory channels to which DDR3 (DDR4 could arrive soon!) SDRAM is attached to. In a dual socket system, each socket is responsible for half the memory of the system. To allow the other socket to access the corresponding other half of memory some kind of interconnect between processors is needed. Intel has opted for the Quick Path Interconnect, AMD (and IBM for p-Series) use Hyper Transport. This is (comparatively) simple when you have few sockets, up to 4 each socket can directly connect to every other without any tricks. For 8 sockets it becomes more difficult. If every socket can directly communicate with its peers the system is said to be glue-less which is beneficial. The last production glue-less system Intel released was based on the Westmere architecture. Sandy Bridge (current until approximately Q3/2013) didn’t have an eight-way glue-less variant, and this is exactly why you get Westmere-EX in the X3-8, and not Sandy Bridge as in the X3-2.

Anyway, your system will have local and remote memory. For most of us, we are not going to notice this at all since there is little point in enabling NUMA on systems with two sockets. Oracle still recommends that you only enable NUMA on 8 way systems, and this is probably the reason the oracle-validated and preinstall RPMs add “numa=off” to the kernel command line in your GRUB boot loader.

Read the rest of this entry »

Posted in Linux, VLDB | 6 Comments »

More on use_large_pages in Linux and 11.2.0.3

Posted by Martin Bach on May 13, 2013

Large Pages in Linux are a really interesting topic for me as I really like Linux and trying to understand how it works. Large pages can be very beneficial for systems with large SGAs and even more so for those with large SGA and lots of user sessions connected.

I have previously written about the benefits and usage of large pages in Linux here:

So now as you may know there is a change to the init.ora parameter “use_large_pages” in 11.2.0.3. The parameter can take these values:

SQL> select value,isdefault
  2  from V$PARAMETER_VALID_VALUES
  3* where name = 'use_large_pages'

VALUE		     ISDEFAULT
-------------------- --------------------
TRUE		     TRUE
AUTO		     FALSE
ONLY		     FALSE
FALSE		     FALSE

There is a new value named “auto” that didn’t exist prior to 11.2.0.3. The intention is to create large pages at instance startup if possible, even if /etc/sysctl.conf doesn’t have an entry for vm.nr_hugepages at all. The risk though is that-as with dynamic creation of large pages by echoing values into /proc/sys/vm/nr_hugepages-is that you get fewer than you expect. Maybe even 0. Read the rest of this entry »

Posted in Linux | 8 Comments »