Martin's Blog

Another Oracle DBA blog

  • Copyright

    All content is © Martin Bach and "Martin's Blog", 2009-2011. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Martin Bach and "Martin's Blog" with appropriate and specific direction to the original content.

RAC One Node on Oracle Enterprise Manager 12c

Posted by Martin Bach on January 17, 2012

One of the promises from Oracle for OEM 12c was improved support for Oracle RAC One Node. I have spent quite a bit of time researching RON, and wrote a little article in 2 parts about it which you can find here:

One of my complaints with it was the limited support in OEM 11.1. At the time I was on a major consolidation project, which would have used OEM for management of the database.

OEM 11.1

Unfortunately OEM 11.1 didn’t have support for RAC One Node. Why? RON is a cluster database running on just one node. The interesting bit is that the ORACLE_SID is your normal ORACLE_SID with an underscore and a number appended. Under normal circumstances that suffix is _1, giving RON_1. But as soon as you relocate the database using srvctl relocate database -d, a second instance, RON_2, is started, and both instances run until all sessions have failed over.

OEM obviously doesn’t know about RON_2: it was never discovered. Furthermore, the strict mapping of instance name to host is no longer true (the same applies for policy managed databases by the way!). A few weeks and a few switchover operations later you could be running RON_2 on racnode1.

As a consequence, the poor on-call DBA is paged about a database that has gone down when it hasn’t: it’s up and running. As a DBA, I wouldn’t want that. After discussions, Oracle promised to fix that problem, but the fix didn’t make it into 11.1, hence this blog post about 12c.
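Because the instance name is no longer tied to a host, it can help to ask the node itself which instance it is running. The following is a minimal sketch, not anything OEM does: it assumes the usual convention that every database instance has a background process called ora_pmon_<ORACLE_SID>, and the function name running_instances is made up for this illustration.

```shell
# Read a process listing on stdin and print the ORACLE_SID of every
# database instance found in it. ASM instances (asm_pmon_...) are ignored.
running_instances() {
    sed -n 's/^ora_pmon_\([^ ]*\).*$/\1/p'
}

# Usage on a cluster node:
#   ps -eo args | running_instances
```

Run on racnode1 after a few relocations, this could just as well print RON_2 as RON_1, which is exactly the mapping a monitoring tool must not hard-code.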

Read the rest of this entry »

Posted in 11g Release 2, Cloud Control, Linux | Tagged: , | Leave a Comment »

Beware of ACFS when upgrading to 11.2.0.3

Posted by Martin Bach on January 10, 2012

This post is about a potential pitfall when migrating from 11.2.0.x to the next point release. I stumbled over this problem on a two-node cluster.

The operating system is Oracle Linux 5.5 running 11.2.0.2.3 and I wanted to go to 11.2.0.3.0. As you know, Grid Infrastructure upgrades are out-of-place, in other words require a separate Oracle home. This is also one of the reasons I wouldn’t want less than 20G on a non-lab like environment for the Grid Infrastructure mount points …

Now when you are upgrading from 11.2.0.x to 11.2.0.3 you need to apply a one-off patch, but the correct one! Search for patch number 12539000 (11203:ASM UPGRADE FAILED ON FIRST NODE WITH ORA-03113) and apply the one that matches your version, paying attention to the PSU level you are on! As usual, the required OPatch update has to be performed before applying it.

So much for the prerequisites. Oracle 11.2.0.3 is available as patch 10404530, and part 3 is for Grid Infrastructure, which has to be upgraded first. This post only covers the GI upgrade; the database part is usually quite uneventful in comparison…

Upgrading Grid Infrastructure

After unzipping the third patch file you start runInstaller. But not before having carefully unset all pointers to the current 11.2.0.2 GRID_HOME (ORACLE_HOME, ORACLE_SID, LD_LIBRARY_PATH, ORA_CRS_HOME, etc)!
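A small sketch of that clean-up step, assuming a Bourne-compatible shell and an old Grid home under /u01/crs/11.2.0.2 (the path that appears in the logs further down). The function names are invented for this example, and the variable list is just the one given above, so extend it for your environment. The functions have to be defined in the session that will launch runInstaller, otherwise the unset has no effect there.

```shell
# Clear the variables that may still point at the old Grid home.
clear_oracle_env() {
    unset ORACLE_HOME ORACLE_SID LD_LIBRARY_PATH ORA_CRS_HOME
}

# Print every remaining environment entry that still mentions the old home.
# $1: path of the old home, e.g. /u01/crs/11.2.0.2
# Prints nothing when the environment is clean.
leftover_refs() {
    env | grep -F -- "$1" || true
}

# Usage:
#   clear_oracle_env
#   leftover_refs /u01/crs/11.2.0.2
```

If leftover_refs still prints something, PATH is a likely culprit, since it is not part of the list above.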

Clicking through OUI is mostly a matter of “next”, “next”, “next”; the action starts with the rootupgrade.sh script. Here’s the output from node1:

[root@node1 ~]# /u01/crs/11.2.0.3/rootupgrade.sh
Performing root user operation for Oracle 11g

The following environment variables are set as:
ORACLE_OWNER= oracle
ORACLE_HOME=  /u01/crs/11.2.0.3

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The file "oraenv" already exists in /usr/local/bin.  Overwrite it? (y/n)
[n]: y
Copying oraenv to /usr/local/bin ...
The file "coraenv" already exists in /usr/local/bin.  Overwrite it? (y/n)
[n]: y
Copying coraenv to /usr/local/bin ...

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/crs/11.2.0.3/crs/install/crsconfig_params
Creating trace directory
User ignored Prerequisites during installation

ASM upgrade has started on first node.

CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'node1'
...
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'node1' has completed
CRS-2673: Attempting to stop 'ora.gpnpd' on 'node1'
CRS-2677: Stop of 'ora.gpnpd' on 'node1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'node1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
OLR initialization - successful
Replacing Clusterware entries in inittab
clscfg: EXISTING configuration version 5 detected.
clscfg: version 5 is 11g Release 2.
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
PRCA-1056 : Unable to upgrade ACFS from version 11.2.0.2.0 to version 11.2.0.3.0
PRCT-1011 : Failed to run "advmutil". Detailed error: advmutil:
CLSU-00100: Operating System function: open64 failed with error data: 2advmutil: CLSU-00101: Operating System error message: No such file or directory|advmutil: CLSU-00103: error location: OOF_1|advmutil: CLSU-00104: additional error information: open64 (/dev/asm/orahomevol-315)|advmutil: ADVM-09006: Error opening volume /dev/asm/orahomevol-315
srvctl upgrade model -first ... failed
Failed to perform first node tasks for cluster modeling upgrade at /u01/crs/11.2.0.3/crs/install/crsconfig_lib.pm line 9088.
/u01/crs/11.2.0.3/perl/bin/perl -I/u01/crs/11.2.0.3/perl/lib -I/u01/crs/11.2.0.3/crs/install /u01/crs/11.2.0.3/crs/install/rootcrs.pl execution failed

So that was not too great indeed: my upgrade failed halfway through. Two facts make this bearable:

  1. rootupgrade.sh (and root.sh for that matter) are restartable since 11.2.0.2 at least
  2. A great deal of logging is available in $GRID_HOME/cfgtoollogs/crsconfig/rootcrs_hostname.log

Now advmutil was correct: there were no volumes in /dev/asm/*

An analysis of the rootcrs_node1.log file showed that the command that failed was this one:

2012-01-06 10:09:10: Executing cmd: /u01/crs/11.2.0.3/bin/srvctl upgrade model  -s 11.2.0.2.0 -d 11.2.0.3.0 -p first
2012-01-06 10:09:12: Command output:
>  PRCA-1056 : Unable to upgrade ACFS from version 11.2.0.2.0 to version 11.2.0.3.0
>  PRCT-1011 : Failed to run "advmutil". Detailed error: advmutil: CLSU-00100: Operating System function: open64 failed with error data: 2|advmutil: CLSU-00101: Operating System error message: No such file or directory|advmutil: CLSU-00103: error location: OOF_1|advmutil: CLSU-00104: additional error information: open64 (/dev/asm/orahomevol-315)|advmutil: ADVM-09006: Error opening volume /dev/asm/orahomevol-315
>End Command output
2012-01-06 10:09:12:   "/u01/crs/11.2.0.3/bin/srvctl upgrade model  -s 11.2.0.2.0 -d 11.2.0.3.0 -p first" failed with status 1.
2012-01-06 10:09:12: srvctl upgrade model -first ... failed
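The log is long, so a quick way to get to lines like the ones above is to search for the phrase the script writes when a command returns a non-zero status. This is only a sketch based on the wording visible in this particular log; the function name failed_commands is made up, and other releases may phrase the message differently.

```shell
# Print the commands that the root script recorded as failed.
# $1: path to a rootcrs_<hostname>.log file
failed_commands() {
    grep 'failed with status' "$1"
}

# Usage, with the file name from this cluster's first node:
#   failed_commands "$GRID_HOME/cfgtoollogs/crsconfig/rootcrs_node1.log"
```

The "Command output" block just above each hit then carries the actual error messages.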

Thinking Clearly

Thinking Clearly is an idea I thought I had adopted from Cary Millsap, but sadly I didn’t apply it here! Lesson learned: don’t assume, check!

I, however, assumed that because the clusterware stack had been shut down there wasn’t any Oracle software running on the node, hence there wouldn’t be an ADVM volume BY DEFINITION. Cluster down, ADVM down too.

Upon checking the log file again, I realised how wrong I was. Most of the lower stack Clusterware daemons were actually running by the time the srvctl command failed to upgrade ACFS to 11.2.0.3. So the reason for this failure had to be a different one. It quickly turned out that ALL the ACFS volumes were disabled. A quick check with asmcmd verified this:

$ asmcmd volinfo -a

Volume Name: ORAHOMEVOL
Volume Device: /dev/asm/orahomevol-315
State: DISABLED
Size (MB): 15120
Resize Unit (MB): 256
Redundancy: UNPROT
Stripe Columns: 4
Stripe Width (K): 128
Usage: ACFS
Mountpath: /u01/app/oracle/product/11.2.0.2

OK, that explains it all: disabled volumes are obviously NOT presented in /dev/asm/. A call to “asmcmd volenable -a” sorted that problem.
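To make this check part of a pre-upgrade routine, the volinfo output can be filtered for disabled volumes. The sketch below assumes the "Name: value" layout shown above (leading whitespace is tolerated); the function name disabled_volumes is invented for this example.

```shell
# Read "asmcmd volinfo -a" output on stdin and print the device path of
# every volume whose State is DISABLED.
disabled_volumes() {
    awk -F': ' '
        { sub(/^[ \t]+/, "") }                       # drop indentation
        $1 == "Volume Device" { dev = $2 }           # remember current device
        $1 == "State" && $2 ~ /^DISABLED/ { print dev }
    '
}

# Usage, as the Grid Infrastructure owner with the ASM environment set:
#   asmcmd volinfo -a | disabled_volumes
```

Any path printed here will be missing from /dev/asm/ until the volume is enabled again.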

Back to point 1: rootupgrade.sh is restartable. I then switched back to the root session and started another attempt at running the script and (drums please) it worked. Now all that was left to do was to run rootupgrade.sh on the second (and last) node. This completed successfully as well. The required patch for the ASM rolling upgrade, by the way, is needed there and then, as these lines from the rootcrs_lastnode.log file show:

2012-01-10 09:44:10: Command output:
>  Started to upgrade the Oracle Clusterware. This operation may take a few minutes.
>  Started to upgrade the CSS.
>  Started to upgrade the CRS.
>  The CRS was successfully upgraded.
>  Oracle Clusterware operating version was successfully set to 11.2.0.3.0
>End Command output
2012-01-10 09:44:10: /u01/crs/11.2.0.3/bin/crsctl set crs activeversion ... passed
2012-01-10 09:45:10: Rolling upgrade is set to 1
2012-01-10 09:45:10: End ASM rolling upgrade
2012-01-10 09:45:10: Executing as oracle: /u01/crs/11.2.0.3/bin/asmca -silent -upgradeLocalASM -lastNode /u01/crs/11.2.0.2
2012-01-10 09:45:10: Running as user oracle: /u01/crs/11.2.0.3/bin/asmca -silent -upgradeLocalASM -lastNode /u01/crs/11.2.0.2
2012-01-10 09:45:10:   Invoking "/u01/crs/11.2.0.3/bin/asmca -silent -upgradeLocalASM -lastNode /u01/crs/11.2.0.2" as user "oracle"
2012-01-10 09:45:10: Executing /bin/su oracle -c "/u01/crs/11.2.0.3/bin/asmca -silent -upgradeLocalASM -lastNode /u01/crs/11.2.0.2"
2012-01-10 09:45:10: Executing cmd: /bin/su oracle -c "/u01/crs/11.2.0.3/bin/asmca -silent -upgradeLocalASM -lastNode /u01/crs/11.2.0.2"
2012-01-10 09:45:51: Command output:
>
>  ASM upgrade has finished on last node.
>
>End Command output
2012-01-10 09:45:51: end rolling ASM upgrade in last

Note the ROLLING UPGRADE!

Summary

If your rootupgrade.sh script bails out with an advmutil error, check whether your ACFS volumes are enabled; they most likely are not.

Posted in 11g Release 2, Linux | Tagged: , , , | 3 Comments »

Provision Oracle RDBMS software via RPM

Posted by Martin Bach on December 13, 2011

I have always asked myself why Oracle doesn’t package their software as an RPM. Surely such a large organisation has the resources to do so!

Well, the short answer is that they don’t give you an RPM, except for the XE version of the database, which prompted me to do it myself. The big problem anyone faces with RPM is that the format doesn’t seem to support files larger than 2GB. Everybody knows that the Oracle database installation is larger than 2GB, which requires a little trick on our side. And the trick is not even obscure, as I remembered: some time ago I read an interesting article written by Frits Hoogland about cloning Oracle homes. It’s still very relevant and can be found here:

http://fritshoogland.wordpress.com/2010/07/03/cloning-your-oracle-database-software-installation/

Now that gave me the idea:

  1. You install the oracle binaries on a reference host
  2. Apply any patches and PSUs you need
  3. Wrap the oracle home up in a tar-ball just the way Frits describes by descending into $ORACLE_HOME and creating a tar archive of all files, excluding those ending in “*.log”, network config files in $ORACLE_HOME/network/admin and anything in $ORACLE_HOME/dbs. We don’t want to proliferate our database initialisation files …
  4. You make that tarball available on a central repository and export that with CIFS/NFS or whatever other mechanism you like
  5. Mount this exported file system in /media, so that /media/db11.2.0.3/ has the database.tar.gz file available
  6. Install the RPM
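Step 3 could look roughly like the sketch below. It is an approximation of the idea rather than the exact command from Frits’ article, and it assumes GNU tar. The function name pack_oracle_home is made up, and the target file must live outside the Oracle home so that tar does not try to archive its own output.

```shell
# Create a gzip-compressed tarball of an Oracle home, leaving out log files,
# network configuration and database initialisation files.
# $1: Oracle home to archive
# $2: tarball to create (absolute path outside $1)
pack_oracle_home() {
    ( cd "$1" && tar -czf "$2" \
        --exclude='*.log' \
        --exclude='./network/admin/*' \
        --exclude='./dbs/*' \
        . )
}

# Usage on the reference host:
#   pack_oracle_home "$ORACLE_HOME" /tmp/database.tar.gz
```

The directories network/admin and dbs themselves stay in the archive, only their contents are skipped, so the unpacked home has the layout Oracle expects.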

Simple! Piet de Visser would be proud. Read the rest of this entry »

Posted in Linux | Tagged: , | Leave a Comment »

Getting started with FusionIO

Posted by Martin Bach on December 12, 2011

I have been lucky enough to do some work with Fusion IO cards in a blade server, soon to be followed by another set of tests on a full rack mounted server. I didn’t know exactly which model I was given, but powered my server down in eager anticipation of the events to come.

After the engineer plugged the card in and powered the server up, I logged in as root to find out more about the pre-Christmas present. I knew it was a PCI card, so surely lspci would tell me more. Read the rest of this entry »

Posted in Linux | Tagged: | Leave a Comment »

The case for and against the Exadata simulator

Posted by Martin Bach on December 8, 2011

I am on my way back from the best UKOUG conference I ever attended, unfortunately a lot earlier than planned. Before I start forgetting all these great moments it is time to write them up. To make use of James Morle’s words: if you weren’t there, you lose. I couldn’t agree more!

The Oak Table Network organised “Oak Table Sunday”, a hugely successful event on Sunday afternoon. This event featured some of the brightest Oracle minds, and thanks to a very relaxed atmosphere made it all a truly exceptional experience. I have to say that the audience was quite illustrious too: I didn’t recognise Paul Vallee from Pythian with his Movember moustache at first, and to my great joy I finally met Piet de Visser again. After exchanging a few words with them I ran into so many people it was just great!

Unfortunately I couldn’t make it before the HA panel session, where Alex Gorbachev, Dave Ensor, James Morle, Greg Rahn, Dan Norris, Graham Wood, Jonathan Lewis, Mogens Norgaard and all the others I just forgot to mention answered questions from the audience.

Read the rest of this entry »

Posted in Exadata | 11 Comments »

Getting started with Xen virtualisation on Ubuntu 11.10

Posted by Martin Bach on November 30, 2011

After a long time and lots of problems I decided to abandon openSuSE 11.4 and its xen implementation in favour of the PVOPS kernel and a different distribution.

It’s been difficult to choose the correct one for me; for now I’m working with Ubuntu 11.10. One reason is that it’s said to be user friendly and highly customisable. It comes with all the right ingredients for running different hypervisors, including my favourite: xen.

Important update! See “Security” below.

Read the rest of this entry »

Posted in Linux, Xen | 5 Comments »

Installing OEM 12c agents in RPM format

Posted by Martin Bach on November 22, 2011

One of the questions I have always asked myself revolved around: “why doesn’t Oracle package certain software as an RPM on Linux?” Well, this question has recently been answered in the form of the Oracle 12c agent. It IS possible to use an RPM based installation, although it doesn’t make 100% use of RPM. I have written this post to give you an idea of what happens.

The procedure is described in the OEM 12 Cloud Control Advanced Installation and Configuration Guide, chapter 6. The process is very similar to the non-RPM based agent deployment. Let’s have a look at it in detail.

Read the rest of this entry »

Posted in Cloud Control, Uncategorized | 2 Comments »

Simplified GNS setup for RAC 11.2.0.2 and newer

Posted by Martin Bach on November 17, 2011

One of the main problems I have seen with GNS (Grid Naming Service) installations was that you couldn’t really see if your DNS and DHCP configuration was correct until it was too late. This has been addressed, but it’s little known. There are a number of checks you can run before starting Oracle Universal Installer, and this post is about them.

What is the Grid Naming Service?

I was drawn towards GNS when it was first released with 11.2.0.1. It is aimed at environments where the Oracle DBAs take on (yet another) piece of work, namely the DNS administration. By virtue of “subdomain delegation”, the master DNS server responsible for “example.com” hands off requests for a subdomain (rac.example.com in this example) to an Oracle managed process. This was quite poorly documented initially, prompting me to figure it out myself in an earlier post: http://martincarstenbach.wordpress.com/2009/10/02/build-your-own-11-2-rac-system-part-ii-dns-dhcp-for-gns/

The problem with GNS in 11.2.0.1 was that you couldn’t really test if the DNS setup was sufficient for Oracle Installer to work, and I had a few attempts at the installation (the discussion here takes into account that I might not have been able to perform sufficient checking!)

Read the rest of this entry »

Posted in Uncategorized | 1 Comment »

Installing FreeNX on OpenSuSE 11.4

Posted by Martin Bach on November 8, 2011

After reading an article in one of my favourite computer magazines about FreeNX and NoMachine’s NX I was very interested to get this to work. Also, Google are using NX for some developers, and if a technology is good enough for Google then it can only be good enough for me as well.

Unfortunately there wasn’t an awful lot of documentation around for openSuSE 11.4 but by making best use of search engines I finally got it to work. Again, this is looking rather trivial in the post but was a lot of work finding out! Now here’s what I did.

Installing FreeNX

I found the RPMs for FreeNX in the standard SuSE repositories, but there are probably newer builds to be found here:

http://download.opensuse.org/pub/opensuse/repositories/X11:/RemoteDesktop/openSUSE_11.4/

It seems to be sufficient to install these packages on the server:

#  rpm -qa|grep -i nx
NX-3.4.0-21.1.x86_64
FreeNX-0.7.2-29.1.x86_64

Read the rest of this entry »

Posted in Linux | Leave a Comment »

An interesting problem with ext4 on Oracle Linux 5.5

Posted by Martin Bach on November 4, 2011

I have run into an interesting problem with my Red Hat 5.5 installation. Naively I assumed that because ext4 has been around for a long time it would be stable. For a test I performed for a friend, I created my database files on a file system formatted with ext4 and mounted it the same way I would have mounted an ext3 file system:

$ mount | grep ext4
/dev/mapper/mpath43p1 on /u02/oradata type ext4 (rw)

Now when I tried to create a data file within a tablespace of a certain size, I got block corruption which I found very interesting. My first thought was: you must have a corruption of the file system. So I shut down all processes accessing /u02/oradata and gave the file system a thorough checking. Read the rest of this entry »

Posted in 11g Release 2, Linux, War Stories | 1 Comment »

 