Martins Blog

Trying to explain complex things in simple terms

Archive for the ‘Oracle’ Category

Oracle (Database and Middleware) related posts

Linux large pages and non-uniform memory distribution

Posted by Martin Bach on May 17, 2013

In my last post about large pages in 11.2.0.3 I promised a little more background information on how large pages and NUMA are related.

Background and some history about processor architecture

For quite some time now the CPUs you get from AMD and Intel both are NUMA, or better: cache coherent NUMA CPUs. They all have their own “local” memory directly attached to them, in other words the memory distribution is not uniform across all CPUs. This isn’t really new, Sequent has pioneered this concept on x86 a long time ago but that’s in a different context. You really should read Scaling Oracle 8i by James Morle which has a lot of excellent content related to NUMA in it, with contributions from Kevin Closson. It doesn’t matter that it reads “8i” most of it is as relevant today as it was then.

So what is the big deal about NUMA architecture anyway? To explain NUMA and why it is important to all of us a little more background information is on order.

Some time ago processor designers and architects of industry standard hardware could no longer ignore the fact that a front side bus (FSB) proved to be a bottleneck. There were two reasons for this: it was a) too slow and b) too much data had to go over it. As one direct consequence DRAM memory has been directly attached to the CPUs. AMD has done this first with it’s Opteron processors in its AMD64 micro architecture, followed by Intel’s Nehalem micro architecture. By removing the requirement of data retrieved from DRAM to travel across a slow bus latencies could be removed.

Now imagine that every processor has a number of memory channels to which DDR3 (DDR4 could arrive soon!) SDRAM is attached to. In a dual socket system, each socket is responsible for half the memory of the system. To allow the other socket to access the corresponding other half of memory some kind of interconnect between processors is needed. Intel has opted for the Quick Path Interconnect, AMD (and IBM for p-Series) use Hyper Transport. This is (comparatively) simple when you have few sockets, up to 4 each socket can directly connect to every other without any tricks. For 8 sockets it becomes more difficult. If every socket can directly communicate with its peers the system is said to be glue-less which is beneficial. The last production glue-less system Intel released was based on the Westmere architecture. Sandy Bridge (current until approximately Q3/2013) didn’t have an eight-way glue-less variant, and this is exactly why you get Westmere-EX in the X3-8, and not Sandy Bridge as in the X3-2.

Anyway, your system will have local and remote memory. For most of us, we are not going to notice this at all since there is little point in enabling NUMA on systems with two sockets. Oracle still recommends that you only enable NUMA on 8 way systems, and this is probably the reason the oracle-validated and preinstall RPMs add “numa=off” to the kernel command line in your GRUB boot loader.

Read the rest of this entry »

Posted in Linux, VLDB | 3 Comments »

4k sector size and Grid Infrastructure 11.2 installation gotcha

Posted by Martin Bach on April 29, 2013

Some days are just too good to be true :) I ran into an interesting problem trying to install 11.2.0.3.0 Grid Infrastructure for a two node cluster. The storage was presented via iSCSI which turned out to be a blessing and inspiration for this blog post. So far I haven’t found out yet how to create “shareable” LUNs in KVM the same way I did successfully with Xen. I wouldn’t recommend general purpose iSCSI for anything besides lab setups though. If you want network based storage, go and use 10GBit/s Ethernet and either use FCoE or (direct) NFS.

Here is my setup. Storage is presented in 3 targets using tgtd on the host:

  1. Target 1 contains 3×2 GB LUNs for OCR and voting disks in normal redundancy.
  2. Target 2 contains 3×10 GB LUNs for +DATA
  3. Target 2 contains 3×10 GB LUNs for +RECO

iSCSI initiators are Oracle Linux 6.4 on KVM with the host running OpenSuSE 12.3 providing the iSCSI targets. Yes, I know I’m probably the only Oracle DBA running SuSE, but to my defence I have a similar system with Oracle Linux 6.4 throughout and both work.

So besides the weird host OS there is nothing special. Since I’m lazy sometimes and don’t particularly like udev I decided to use ASMLib for device name persistence on the iSCSI LUNs. This turned out to be crucial, otherwise I’d never had written this post.

Read the rest of this entry »

Posted in 11g Release 2, KVM | Tagged: , | 3 Comments »

Grid Infrastructure And Database High Availability Deep Dive Seminars 2013

Posted by Martin Bach on April 24, 2013

So this is a little bit of a plug for myself and Enkitec but I’m running my Grid Infrastructure And Database High Availability Deep Dive Seminars again for Oracle University. This time these events are online, so no need to come to a classroom at all.

Here is the short description of the course:

Providing a highly available database architecture fit for today’s fast changing requirements can be a complex task. Many technologies are available to provide resilience, each with its own advantages and possible disadvantages. This seminar begins with an overview of available HA technologies (hard and soft partitioning of servers, cold failover clusters, RAC and RAC One Node) and complementary tools and techniques to provide recovery from site failure (Data Guard or storage replication).

In the second part of the seminar, we look at Grid Infrastructure in great detail. Oracle Grid Infrastructure is the latest incarnation of the Clusterware HA framework which successfully powers every single 10g and 11g RAC installation. Despite its widespread implementation, many of its features are still not well understood by its users. We focus on Grid Infrastructure, what it is, what it does and how it can be put to best use, including the creation of an active/passive cold failover cluster for web and database resources.

If you are interested I would like to invite you to head over to the Oracle University website here which has a more extensive synopsis and all the detail you need:

http://education.oracle.com/pls/web_prod-plq-dad/db_pages.getCourseDesc?dc=D81641_2043034

UPDATE: I received several emails and comments that the above link does not work. I couldn’t reproduce this until today. It appears to be an issue with the country selection. If you have USA selected in the top right corner the link won’t work, switching to United Kingdom (my preference) will fetch the course detail. I don’t quite understand as to why that is the case since the class is virtual and not depending on a country…

I hope to hear from you during the course!

Posted in 11g Release 2, Linux, Public Appearances | 7 Comments »

Limiting the Degree of Parallelism via Resource Manager and a gotcha

Posted by Martin Bach on April 23, 2013

This might be something very obvious for the reader but I had an interesting revelation recently when implementing parallel_degree_limit_p1 in a resource consumer group. My aim was to prevent users mapped to a resource consumer group from executing any query in parallel. The environment is fictional, but let’s assume that it is possible that maintenance operations for example leave indexes and tables decorated with a parallel x attribute. Another common case is the restriction of PQ resource to users to prevent them from using all the machine’s resources.

This can happen when you perform an index rebuild for example in parallel to speed the operation up. However the DOP will stay the same with the index after the maintenance operation, and you have to explicitly set it back:

SQL> alter index I_T1$ID1 rebuild parallel 4;

Index altered.

SQL> select index_name,degree from ind where index_name = 'I_T1$ID1';

INDEX_NAME		       DEGREE
------------------------------ ----------------------------------------
I_T1$ID1		       4

SQL>

Read the rest of this entry »

Posted in 11g Release 2, Performance | Tagged: , , , , | 4 Comments »

Get a feel for enterprise block level replication using drbd

Posted by Martin Bach on July 2, 2012

I didn’t really have a lot of exposure to block-level replication on the storage level before an engagement in the banking industry. I’m an Oracle DBA, and I always thought: why would I want to use anything but Oracle technology for replicating my data from one data centre to another? I need to be in control! I want to see what’s happening. Why would I prefer storage replication over Data Guard?

For a great many sites Data Guard is indeed all you need. Especially if you don’t have a storage array with a replication option. But many large enterprises have historically used large storage area networks with many enterprise features, including block level replication from array to array. They all come under their own name, and all major storage vendors have them. With the risk of speaking too generally, all of the block level replication allows you to somehow copy data from array A in data centre A to array B in data centre B. The data centres are usually geographically dispersed so as to avoid the impact of catastrophes. The storage replication happens without any DBA intervention or even visibility, harking back to the 90s mantra of “storage administrator does storage, system administrator does the OS and the database administrator works on the database”. I have written about this in the context of Exadata before. Read the rest of this entry »

Posted in Linux, Oracle | Tagged: , | 4 Comments »

How to set up data guard broker for RAC

Posted by Martin Bach on June 29, 2012

This is pretty much a note to myself on how to set up Data Guard broker for RAC 11.2.0.2+. The tests have been performed on Oracle Linux 5.5 with the Red Hat Kernel. Oracle was 11.2.0.2. Sadly my lab server didn’t support more than 2 RAC nodes, so everything has been done on the same cluster. It shouldn’t make a difference though. If it does, please let me know).

WARNING: there are some rather deep changes to the cluster here, be sure to have proper change control around making such amendments as it can cause outages! Nuff said.

Unfortunately I didn’t take notes of the configuration as it was before, so the post is going to be a lot shorter and less dramatic, but it’s useful as a reference (I hope) nevertheless. Now what’s the situation? Imagine you have a two node RAC cluster with separation of duties in place-”grid” owns the GRID_HOME, while “oracle” owns the RDBMS binaries. Imagine further you have two RAC database, ORCL and STDBY. STDBY has only just been duplicated for standby, so there is nothing in place which links the two together.

Read the rest of this entry »

Posted in 11g Release 2, Data Guard, Linux | 2 Comments »

Little things worth knowing-static and dynamic listener registration

Posted by Martin Bach on June 20, 2012

As part of  a recent project to remove a vulnerability in relation to CVE-2012-1675 it became apparent that there are certain misconceptions around dynamic and static listener registration which are hard to get rid of. The below is applicable for single instance Oracle only!

Now let’s start with a quiz: what does the following output imply:

$ lsnrctl status

LSNRCTL for Linux: Version 11.2.0.2.0 - Production on 20-JUN-2012 11:22:01

Copyright (c) 1991, 2010, Oracle.  All rights reserved.

Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=server1)(PORT=1571))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 11.2.0.2.0 - Production
Start Date                15-JUN-2012 11:14:27
Uptime                    5 days 0 hr. 7 min. 34 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/oracle/product/11.2/dbhome_1/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/server1/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=server1)(PORT=1571)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1571)))
Services Summary...
Service "ORA11202" has 1 instance(s).
  Instance "ORA11202", status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully

Read the rest of this entry »

Posted in Oracle | 3 Comments »

How to use vi-style editing in SQL*Plus

Posted by Martin Bach on June 14, 2012

This post is nothing new, and I created it after a little discussion on twitter about how to use readline support in SQL*Plus. The idea is not new, and I have compiled and used rlwrap for quite some time.

At the time, Frits Hoogland asked me why I didn’t use the EPEL package-and I had to admit to myself that I didn’t know the Extra Package for Enterprise Linux repository at all. But there is more to rlwrap and Linux I didn’t know, but first things first.

Installing rlwrap from EPEL

This is really simple-you can either add the EPEL repository to your /etc/yum.repos.d/ directory or simply download the rlwrap package and install it via RPM. A simple wget on your host does the trick. You can set environment variables when you’d like to use a proxy as shown here:

$ export http_proxy=http://your.proxy.server:proxyPort/
$ export https_proxy=https://your.proxy.server:proxyPort/

Depending your release of Enterprise Linux, you can find the rlwrap package here:

Then wget should download the file for you, at the time  of writing 0.37 was current.

Read the rest of this entry »

Posted in Linux, Oracle | Tagged: , | 5 Comments »

Kernel UEK 2 on Oracle Linux 6.2 fixed lab server memory loss

Posted by Martin Bach on June 13, 2012

A few days ago I wrote about my new lab server and the misfortune with kernel UEK (aka 2.6.32 + backports). It simply wouldn’t recognise the memory in the server:

# free -m
             total       used       free     shared    buffers     cached
Mem:          3385        426       2958          0          9        233
-/+ buffers/cache:        184       3200
Swap:          511          0        511

Ouch. Today I gave it another go, especially since my new M4 SSD has arrived. My first idea was to upgrade to UEK2. And indeed, following the instructions on Wim Coekaerts’s blog (see references), it worked:

[root@ol62 ~]# uname -a
Linux ol62.localdomain 2.6.39-100.7.1.el6uek.x86_64 #1 SMP Wed May 16 04:04:37 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
[root@ol62 ~]# free -m
             total       used       free     shared    buffers     cached
Mem:         32221        495      31725          0          5         34
-/+ buffers/cache:        456      31764
Swap:          511          0        511

Note the 2.6.39-100.7.1! It’s actually past that and version 3.x, but to preserve compatibility with a lot of software parsing the kernel revision number in 3 tuples Oracle decided to stick with 2.6.39. But then the big distributions don’t really follow the mainstream kernel numbers anyway.

Now if anyone could tell me if UEK2 is out of beta? I know it’s not supported for the database yet, but it’s a cool kernel release and I can finally play around with the “perf” utility Kevin Closson and Frits Hoogland have mentioned so much about recently. Read the rest of this entry »

Posted in 11g Release 2, Linux | Tagged: | 4 Comments »

Performance testing with Virident PCIe SCM

Posted by Martin Bach on May 25, 2012

Thanks to the kind introduction from Kevin Closson I was given the opportunity to benchmark the Virident PCIe flash cards. I have written a little review of the testing conducted, mainly using SLOB. To my great surprise Virident gave me access to a Westmere-EP system running a top of the line 2s12c24t system with lots of memory.

In summary the testing shows that the “flash revolution” is happening, and that there are lots of vendors out there building solutions for HPC and Oracle database workloads alike. Have a look at the attached PDF for the full story if you are interested. When looking at the numbers please bear in mind it was a two socket system! I’m confident the server could not max out the cards.

Full article:

Virident testing martin bach consulting

Posted in 11g Release 2, Automatic Storage Management, Linux | 1 Comment »

 
Follow

Get every new post delivered to your Inbox.

Join 1,154 other followers