I had the great pleasure of spending the better part of last week at the Norwegian Oracle User Group’s spring conference. Martin Nash and I helped promote the Real Application Cluster platform on the attendees’ laptops in a program called RAC Attack. RAC Attack has its home on the wikibooks website http://racattack.org where the whole program is documented and available for self-study. The purpose of the hands-on labs, which Jeremy Schneider started a few years ago, is to give users practical experience installing Oracle Linux, Grid Infrastructure and the RDBMS binaries before creating a two-node database. Once the database has been created, a practical session follows which explains certain HA concepts with RAC such as session failover. We are planning on greatly enhancing the lab session as we go along. If you have any suggestions about what you would like to see covered by us then please let us know!
Some of you may have seen on Twitter that I was working on understanding collectl. So why did I start with this? First of all, I was after a tool that records a lot of information on a Linux box. It can also play information back, but this is beyond the scope of this introduction.
In the past I have used nmon to do similar things, and I still love it for what it does. Especially in conjunction with the nmon-analyzer, an Excel plug-in, it can create very impressive reports. How does collectl compare?
Getting collectl is quite easy: get it from sourceforge: http://sourceforge.net/projects/collectl/
The project website including very good documentation is available from sourceforge as well, but uses a slightly different URL: http://collectl.sourceforge.net/
I suggest you get the architecture-independent RPM and install it on your system. This is all you need to get started! The impatient could type “collectl” at the command prompt now to get some information. Let’s have a look at the output:
$ collectl
waiting for 1 second sample...
#<--------CPU--------><----------Disks-----------><----------Network---------->
#cpu sys inter ctxsw KBRead Reads KBWrit Writes KBIn PktIn KBOut PktOut
1 0 1163 10496 113 14 18 4 8 55 5 19
0 0 1046 10544 0 0 2 3 164 195 30 60
0 0 1279 10603 144 9 746 148 20 67 11 19
3 0 1168 10615 144 9 414 69 14 69 5 20
1 0 1121 10416 362 28 225 19 11 71 8 35
Ouch!
The “ouch” has been caused by my CTRL-c to stop the execution.
Collectl is organised by subsystems. By default it prints the CPU, disk and network subsystems in aggregated form.
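To look at something other than the default, collectl takes the subsystem letters via -s. The following is a minimal sketch rather than anything from the collectl distribution: the wrapper name sample_subsystems is made up for illustration, and it assumes the -s, -i and -c switches behave as described in the collectl documentation.

```shell
# Hypothetical helper: sample the chosen subsystems a fixed number of times.
#   -s  subsystems to report (c = CPU, d = disk, n = network, m = memory)
#   -i  sampling interval in seconds
#   -c  number of samples to take before exiting
sample_subsystems() {
    subsys=${1:-cdn}      # cdn is what you get without any options
    interval=${2:-1}
    count=${3:-5}
    if ! command -v collectl >/dev/null 2>&1; then
        echo "collectl not found in PATH" >&2
        return 127
    fi
    collectl -s"$subsys" -i "$interval" -c "$count"
}
```

Calling "sample_subsystems m 2 3", for example, would ask for three memory samples two seconds apart, and there is no need for CTRL-c because of the sample count.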
I have already written about the renamedg command, but have since fallen in love with ASMLib. The use of ASMLib introduces a few caveats you should be aware of.
This document presents research I performed with ASM in a lab environment. It should be applicable to any environment, but you should NOT use this for production: the renamedg command is still buggy, and you should not mess with ASM disk headers in an important system such as production or staging/UAT. You set the importance here! The recommended setup for cloning disk groups is to use a Data Guard physical standby database on a different storage array to create a real-time copy of your production database on that array. Again, do not use your production array for this!
This post may not be relevant to the majority of people out there, but I didn’t find a lot of useful documentation regarding RAC 11.2 on OCFS2 installations. And I am also in desperate need of my own test environment! Oracle Cluster File System version 2 is a promising cluster file system on Linux, and I stand corrected about the state of development (see also the comment at the bottom of the post!): OCFS2 1.4.4-1 dates from 2009.09.25.
The reason for evaluating this combination is that I am constrained on disk space and memory, so saving a little bit of RAM by not having to run a set of ASM instances is good for me (I hope). I’d also like to try having the RDBMS binaries set up as a shared home, a configuration I haven’t used before.
So here are the key facts about my environment:
- dom0: OpenSuSE 11.2 kernel 184.108.40.206-0.1-xen x86-64
- domU: Oracle Enterprise Linux 5 update 4 x86-64
I created the domUs using virt-manager, an openSuSE supplied tool. Unfortunately it can’t set the “shareable” attribute on the shared storage (libvirt’s equivalent to the exclamation mark in the disk= directive) so I have to do this manually. “virsh edit <domain>” is a blessing: it is no longer necessary to do the four-step
- virsh dumpxml <domain> > /tmp/domain.xml
- virsh undefine <domain>
- vi /tmp/domain.xml
- virsh define /tmp/domain.xml
How nice is that!
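In libvirt’s domain XML the attribute shows up as an empty <shareable/> element inside the <disk> definition of the shared device. After editing, a quick check along these lines can confirm it made it into the definition; this is only a sketch, and the helper name shareable_disks is made up for illustration.

```shell
# Hypothetical helper: count the <shareable/> elements in a domain definition.
# $1 is the libvirt domain name; prints 0 if no disk is marked shareable.
shareable_disks() {
    virsh dumpxml "$1" | grep -c '<shareable/>'
}
```

The number printed should match the number of disks you intend to share between the cluster nodes.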
A quick note on how to get around this problem. Background: many shops use individual operating system accounts for DBAs and keep the oracle password secret. Once connected, the user would sudo to oracle: “sudo su - oracle”, which is explicitly allowed. The auditors can then trace who did what and when; otherwise the logins to oracle would be almost completely anonymous.
Here’s a sample session output to demonstrate the problem:
login as: mbh
mbh@prodbox's password:
Last login: Thu Jan 14 12:11:12 2010 from desktop001
RHN kickstart on 2009-07-27
[mbh@prodbox ~]$ sudo su - oracle
[oracle@prodbox ~]$ screen
Cannot open your terminal '/dev/pts/4' - please check.
This is slightly frustrating: starting the screen session with your account works fine, but then no one can follow up and connect to your session. The quick but insecure solution is as follows: after logging in as yourself, find out which tty you use:
[oracle@prodbox ~]$ w | grep mbh
mbh pts/4 desktop001 12:14 0.00s 0.05s 0.07s sshd: mbh
Then grant permission to your tty to the world:
[mbh@prodbox ~]$ chmod a+rw /dev/pts/4
Alternatively, add the oracle user to group tty, which owns all the ttys.
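The lookup and the chmod can be rolled into one step by asking the tty command for the terminal name instead of grepping the output of w. A minimal sketch, to be run as yourself before the sudo; the function name open_my_tty is made up for illustration, and it is just as insecure as the manual chmod.

```shell
# Hypothetical helper: make the terminal you are logged in on world
# readable and writable, so that screen can open it after "sudo su - oracle".
open_my_tty() {
    # tty prints the device name of stdin and fails if stdin is not a terminal
    my_tty=$(tty) || { echo "not attached to a terminal" >&2; return 1; }
    chmod a+rw "$my_tty"
}
```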
Now sudo to oracle and start your screen session:
[mbh@prodbox ~]$ sudo su - oracle
[oracle@prodbox ~]$ screen
[screen is terminating]
Also check the comment by Ariel for another solution. Anyway, check with your security team which method is most appropriate in your situation.
I recently upgraded my laptop’s openSUSE 11.1 installation to 11.2, mainly because it has updated Xen to version 3.4, which makes it one of the most modern distributions with Xen support available. I did some research first about which Linux distribution would best suit my needs. When I came across a post which said that Fedora 12 had no (official) kernel support for use as dom0 the decision was made. I know I could have used a Debian clone (Ubuntu 9.10 seemed quite attractive), but for personal reasons I preferred an RPM-based system. I was very pleasantly surprised that the Intel GMA 4500MHD graphics chipset finally got hardware acceleration, making it so much more enjoyable to browse the web.
This has proven to be a worthy adversary: “PRVF-4664 : Found inconsistent name resolution entries for SCAN name”. The error was thrown at the end of a 220.127.116.11 Grid Infrastructure installation, during the cluster verification utility run. Some bloggers I checked recommended workarounds, but I wanted the solution.
Some facts first:
- Oracle Grid Infrastructure 11.2
- RHEL 5.4 32bit
- named 9.3.2
- GPnP (grid plug and play) enabled
The error message from the log was as follows:
INFO: Checking name resolution setup for "rac-scan.rac.the-playground.de"...
INFO: ERROR:
INFO: PRVF-4664 : Found inconsistent name resolution entries for SCAN name
INFO: "rac-scan.rac.the-playground.de"
INFO: Verification of SCAN VIP and Listener setup failed
I scratched my head, browsed the Internet and came across a Metalink note (887471.1) which didn’t help (I didn’t try to resolve the SCAN through /etc/hosts).
So something must have been wrong with the subdomain delegation for DNS. And in fact, once that had been rectified using a combination of rndc, dig and nslookup, the error went away. In short, if your /etc/resolv.conf points to your nameserver (not the GNS server), and you can resolve hostnames such as the SCAN as part of the subdomain, the error will go away.
I previously blogged about this; you can check the setup here:
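A quick way to check the delegation is to ask every nameserver listed in /etc/resolv.conf for the SCAN and compare the answers. The following is only a sketch of that idea, not the exact commands I ran; the function names nameservers and check_scan are made up for illustration.

```shell
# Hypothetical helper: print the nameservers a resolv.conf points to.
# $1 is the file to read, defaulting to /etc/resolv.conf.
nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Hypothetical helper: ask each of those nameservers for a name.
# $1 is the name to resolve (the SCAN), $2 an optional resolv.conf path.
# The answers are sorted so that the blocks can be compared by eye.
check_scan() {
    for ns in $(nameservers "$2"); do
        echo "== $ns"
        dig @"$ns" "$1" +short | sort
    done
}
```

With working delegation, "check_scan rac-scan.rac.the-playground.de" should print the same set of addresses under every nameserver; an empty or differing block points at the server that cannot follow the delegation.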