Adding a node to a 12c RAC cluster

Posted by Martin Bach on August 15, 2013

This post is out of chronological order; I probably should have written something about installing RAC 12c first. I didn’t want to write the tenth installation guide for Clusterware, so I’m focusing on extending my two-node cluster to three nodes to test the new Flex ASM feature. If you care about installing RAC 12c, head over to RAC Attack for instructions, or to Tim Hall’s site. The RAC Attack instructions are currently being updated for 12c; you can follow or participate in the work on this free mailing list.

The cluster I installed is based on KVM on my lab server. I have used Oracle Linux 6.4 with UEK2 for the host OS. It is a standard cluster, i.e. not a Flex Cluster, but with Flex ASM configured. My network configuration is as follows:

  • public network on 192.168.100.0/24
  • private network on 192.168.101.0/24
  • ASM network on 192.168.1.0/24

As you can see here:

[grid@rac12node1 ~]$ oifcfg getif
eth0  192.168.100.0  global  public
eth1  192.168.101.0  global  cluster_interconnect
eth2  192.168.1.0    global  asm

The software owners are grid and oracle respectively, and I am making use of the new separation of duties on the RDBMS layer.

Adding the node

The generic steps to follow when adding the new node to the cluster are:

  • Install Operating System
  • Install required software
  • Add/modify users and groups required for the installation
  • Configure network
  • Configure kernel parameters
  • Configure services required such as NTP
  • Configure storage (multipathing, zoning, storage discovery, ASMLib?)

These steps are the same as for the initial installation, with one noteworthy exception: creating the users and groups requires a little more attention. It is slightly more labour-intensive in my case because I wanted to see if the new separation of duties works. In such a case you need to get the numeric group IDs for all the groups in question. Check /etc/group on the existing nodes for all groups defined; on an Oracle server you will normally find the Oracle-related groups towards the end. I have written about the changes in users and groups in a previous blog post. When creating the groups on the new node, specify the same group IDs as found on the other cluster nodes, and when creating the grid and oracle accounts make sure they use the same numeric user IDs as well (using the -u flag to useradd). A sketch of this follows below.
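Purely as an illustration, creating the accounts on the new node might look like the commands below. The numeric IDs and the exact group list are examples only; copy the real values from /etc/group and /etc/passwd on an existing node, and add any further role groups you use for separation of duties.

[root@rac12node3 ~]# groupadd -g 54421 oinstall
[root@rac12node3 ~]# groupadd -g 54422 dba
[root@rac12node3 ~]# groupadd -g 54427 asmdba
[root@rac12node3 ~]# groupadd -g 54428 asmoper
[root@rac12node3 ~]# groupadd -g 54429 asmadmin
[root@rac12node3 ~]# useradd -u 54322 -g oinstall -G asmadmin,asmdba,asmoper grid
[root@rac12node3 ~]# useradd -u 54321 -g oinstall -G dba,asmdba oracle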

Installing the required RPMs is rather simple thanks to the oracle-rdbms-server-12cR1-preinstall package. Ensure that YUM is working (I am using a local repository); yum install oracle-rdbms-server-12cR1-preinstall will then give you all you need, including the dreaded dependencies. If you made further modifications to /etc/sysctl.conf, /etc/security/limits.conf or files in /etc/pam.d/* then carry them over as well. The same is true for any storage configuration (dm-multipath, PowerPath etc.) and especially ASMLib.
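As a rough sketch, assuming a working YUM repository and ASMLib-backed storage that has already been presented to the new node, the preparation could include something along these lines:

[root@rac12node3 ~]# yum install -y oracle-rdbms-server-12cR1-preinstall
[root@rac12node3 ~]# oracleasm configure -i      # same owner/group settings as on the other nodes
[root@rac12node3 ~]# oracleasm init
[root@rac12node3 ~]# oracleasm scandisks
[root@rac12node3 ~]# oracleasm listdisks         # should list the same disks as the existing nodes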

I usually put the name of the host plus its IP in /etc/hosts just to be sure that OUI won’t complain.
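An entry such as the one below, with a made-up address from my public subnet, is all that is needed:

192.168.100.13   rac12node3.example.com   rac12node3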

Also check the following (example commands are shown after the list):

  • /etc/sysconfig/selinux to ensure that SELinux is in the required state (permissive in my case)
  • chkconfig iptables --list to ensure that the local firewall is either off or, in combination with iptables -L, uses the correct settings
  • NTP configuration in /etc/sysconfig/ntpd must include the “-x” flag. If it’s not there, add it and restart NTP
  • network configuration must be up to date
  • LVM configuration and file system space for the installation of the RDBMS and Grid Infrastructure binaries
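Translated into commands, these checks could look roughly like this (a sketch; adjust paths and mount points to your environment):

[root@rac12node3 ~]# grep SELINUX= /etc/sysconfig/selinux   # permissive in my case
[root@rac12node3 ~]# chkconfig iptables --list              # firewall runlevel configuration
[root@rac12node3 ~]# iptables -L                            # current rules if the firewall stays enabled
[root@rac12node3 ~]# grep OPTIONS /etc/sysconfig/ntpd       # must contain the -x flag
[root@rac12node3 ~]# service ntpd restart                   # only after changing the options
[root@rac12node3 ~]# df -h /u01                             # space for the Grid Infrastructure and RDBMS homes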

After all these steps have been performed, you should check whether the new node can be added successfully using cluvfy:

[grid@rac12node1 ~]$ cluvfy stage -pre nodeadd -n rac12node3 -fixup -fixupnoexec

Performing pre-checks for node addition

Checking node reachability...
Node reachability check passed from node "rac12node1"

Checking user equivalence...
User equivalence check passed for user "grid"
Package existence check passed for "cvuqdisk"

Checking CRS integrity...

CRS integrity check passed

Clusterware version consistency passed.

Checking shared resources...

Checking CRS home location...
Location check passed for: "/u01/app/12.1.0.1/grid"
Shared resources check for node addition passed

Checking node connectivity...

Checking hosts config file...

Verification of the hosts config file successful

Check: Node connectivity using interfaces on subnet "192.168.100.0"
Node connectivity passed for subnet "192.168.100.0" with node(s) rac12node2,rac12node3,rac12node1
TCP connectivity check passed for subnet "192.168.100.0"

Check: Node connectivity using interfaces on subnet "192.168.101.0"
Node connectivity passed for subnet "192.168.101.0" with node(s) rac12node2,rac12node1,rac12node3
TCP connectivity check passed for subnet "192.168.101.0"

Check: Node connectivity using interfaces on subnet "192.168.1.0"
Node connectivity passed for subnet "192.168.1.0" with node(s) rac12node1,rac12node2,rac12node3
TCP connectivity check passed for subnet "192.168.1.0"

Checking subnet mask consistency...
Subnet mask consistency check passed for subnet "192.168.100.0".
Subnet mask consistency check passed for subnet "192.168.101.0".
Subnet mask consistency check passed for subnet "192.168.1.0".
Subnet mask consistency check passed.

Node connectivity check passed

Checking multicast communication...

Checking subnet "192.168.101.0" for multicast communication with multicast group "224.0.0.251"...
Check of subnet "192.168.101.0" for multicast communication with multicast group "224.0.0.251" passed.

Checking subnet "192.168.1.0" for multicast communication with multicast group "224.0.0.251"...
Check of subnet "192.168.1.0" for multicast communication with multicast group "224.0.0.251" passed.

Check of multicast communication passed.
Task ASM Integrity check started...

Checking if connectivity exists across cluster nodes on the ASM network

Checking node connectivity...

Checking hosts config file...

Verification of the hosts config file successful

Check: Node connectivity using interfaces on subnet "192.168.1.0"
Node connectivity passed for subnet "192.168.1.0" with node(s) rac12node3
TCP connectivity check passed for subnet "192.168.1.0"

Node connectivity check passed

Network connectivity check across cluster nodes on the ASM network passed

Task ASM Integrity check passed...
Total memory check passed
Available memory check passed
Swap space check failed
Check failed on nodes:
	rac12node3,rac12node1
Free disk space check passed for "rac12node3:/usr,rac12node3:/var,rac12node3:/etc,rac12node3:/sbin,rac12node3:/tmp"
Free disk space check passed for "rac12node1:/usr,rac12node1:/var,rac12node1:/etc,rac12node1:/sbin,rac12node1:/tmp"
Free disk space check passed for "rac12node3:/u01/app/12.1.0.1/grid"
Free disk space check passed for "rac12node1:/u01/app/12.1.0.1/grid"
Check for multiple users with UID value 54322 passed
User existence check passed for "grid"
Run level check passed
Hard limits check passed for "maximum open file descriptors"
Soft limits check passed for "maximum open file descriptors"
Hard limits check passed for "maximum user processes"
Soft limits check passed for "maximum user processes"
System architecture check passed
Kernel version check passed
Kernel parameter check passed for "semmsl"
Kernel parameter check passed for "semmns"
Kernel parameter check passed for "semopm"
Kernel parameter check passed for "semmni"
Kernel parameter check passed for "shmmax"
Kernel parameter check passed for "shmmni"
Kernel parameter check passed for "shmall"
Kernel parameter check passed for "file-max"
Kernel parameter check passed for "ip_local_port_range"
Kernel parameter check passed for "rmem_default"
Kernel parameter check passed for "rmem_max"
Kernel parameter check passed for "wmem_default"
Kernel parameter check passed for "wmem_max"
Kernel parameter check passed for "aio-max-nr"
Package existence check passed for "binutils"
Package existence check passed for "compat-libcap1"
Package existence check passed for "compat-libstdc++-33(x86_64)"
Package existence check passed for "libgcc(x86_64)"
Package existence check passed for "libstdc++(x86_64)"
Package existence check passed for "libstdc++-devel(x86_64)"
Package existence check passed for "sysstat"
Package existence check passed for "gcc"
Package existence check passed for "gcc-c++"
Package existence check passed for "ksh"
Package existence check passed for "make"
Package existence check passed for "glibc(x86_64)"
Package existence check passed for "glibc-devel(x86_64)"
Package existence check passed for "libaio(x86_64)"
Package existence check passed for "libaio-devel(x86_64)"
Package existence check passed for "nfs-utils"
Check for multiple users with UID value 0 passed
Current group ID check passed

Starting check for consistency of primary group of root user

Check for consistency of root user's primary group passed
Group existence check passed for "asmadmin"
Group existence check passed for "asmdba"

Checking ASMLib configuration.
Check for ASMLib configuration passed.

Checking OCR integrity...

OCR integrity check passed

Checking Oracle Cluster Voting Disk configuration...

Oracle Cluster Voting Disk configuration check passed
Time zone consistency check passed

Starting Clock synchronization checks using Network Time Protocol(NTP)...

NTP Configuration file check started...
NTP Configuration file check passed

Checking daemon liveness...
Liveness check passed for "ntpd"
Check for NTP daemon or service alive passed on all nodes

NTP common Time Server Check started...
Check of common NTP Time Server passed

Clock time offset check from NTP Time Server started...
Clock time offset check passed

Clock synchronization check using Network Time Protocol(NTP) passed

User "grid" is not part of "root" group. Check passed
Checking integrity of file "/etc/resolv.conf" across nodes

"domain" and "search" entries do not coexist in any  "/etc/resolv.conf" file
All nodes have same "search" order defined in file "/etc/resolv.conf"
The DNS response time for an unreachable node is within acceptable limit on all nodes

Check for integrity of file "/etc/resolv.conf" passed

Checking integrity of name service switch configuration file "/etc/nsswitch.conf" ...
Check for integrity of name service switch configuration file "/etc/nsswitch.conf" passed

NOTE:
No fixable verification failures to fix

Pre-check for node addition was unsuccessful on all the nodes.

I tend to ignore the swap space warning, but you should ensure that the available swap space matches the Oracle requirements. If you are using the Management Repository as I do then you will need an additional 500 MB available for it. You can get the current size as shown here:

[grid@rac12node1 ~]$ oclumon manage -get repsize

CHM Repository Size = 136320
[grid@rac12node1 ~]$

If you are unlucky you will need to add additional space to the repository, but I got away with it.
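Should a resize be necessary, oclumon can grow the repository. The size below is purely an example, and to my knowledge the relevant verb in 12c is changerepossize; check the documentation for your exact version before running it.

[grid@rac12node1 ~]$ oclumon manage -repos changerepossize 2048

Now it seems I am good to add the home, so let’s try: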

[grid@rac12node1 addnode]$ ./addnode.sh -silent "CLUSTER_NEW_NODES={rac12node3}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={rac12node3-vip}"
Starting Oracle Universal Installer...

Checking Temp space: must be greater than 120 MB.   Actual 1726 MB    Passed
Checking swap space: must be greater than 150 MB.   Actual 767 MB    Passed
[WARNING] [INS-13014] Target environment does not meet some optional requirements.
   CAUSE: Some of the optional prerequisites are not met. See logs for details.
          /u01/app/oraInventory/logs/addNodeActions2013-07-20_11-51-00PM.log
   ACTION: Identify the list of failed prerequisite checks from the log:
           /u01/app/oraInventory/logs/addNodeActions2013-07-20_11-51-00PM.log. Then
           either from the log file or from installation manual find the appropriate
           configuration to meet the prerequisites and fix it manually.

Prepare Configuration in progress.

Prepare Configuration successful.
..................................................   9% Done.
You can find the log of this install session at:
 /u01/app/oraInventory/logs/addNodeActions2013-07-20_11-51-00PM.log

Instantiate files in progress.

Instantiate files successful.
..................................................   15% Done.

Copying files to node in progress.

Copying files to node successful.
..................................................   79% Done.

Saving cluster inventory in progress.
..................................................   87% Done.

Saving cluster inventory successful.
The Cluster Node Addition of /u01/app/12.1.0.1/grid was successful.
Please check '/tmp/silentInstall.log' for more details.

As a root user, execute the following script(s):
	1. /u01/app/oraInventory/orainstRoot.sh
	2. /u01/app/12.1.0.1/grid/root.sh

Execute /u01/app/oraInventory/orainstRoot.sh on the following nodes:
[rac12node3]
Execute /u01/app/12.1.0.1/grid/root.sh on the following nodes:
[rac12node3]

The scripts can be executed in parallel on all the nodes. If there are any policy
managed databases managed by cluster, proceed with the addnode procedure without
executing the root.sh script. Ensure that root.sh script is executed after all the
policy managed databases managed by clusterware are extended to the new nodes.
..........
Update Inventory in progress.
..................................................   100% Done.

Update Inventory successful.
Successfully Setup Software.
[grid@rac12node1 addnode]$

A small victory: the software has made it across. Next the main task: running the root scripts. The first one is easy:

[root@rac12node3 usr]# /u01/app/oraInventory/orainstRoot.sh
Changing permissions of /u01/app/oraInventory.
Adding read,write permissions for group.
Removing read,write,execute permissions for world.

Changing groupname of /u01/app/oraInventory to oinstall.
The execution of the script is complete.

It’s the second one you need to be concerned about. Run the script in a screen session!
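If you are not already inside one, a named screen session is quickly created; the session name is arbitrary:

[root@rac12node3 ~]# screen -S addnode
[root@rac12node3 ~]# # detach with ctrl-a d if needed, reattach with screen -r addnode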

[root@rac12node3 usr]# /u01/app/12.1.0.1/grid/root.sh
Check /u01/app/12.1.0.1/grid/install/root_rac12node3.example.com_2013-07-20_23-59-23.log for the output of root script
... wait ...

Create a new screen window (ctrl-a then c) and tail the log file, shown here in full:

[root@rac12node3 ~]# cat /u01/app/12.1.0.1/grid/install/root_rac12node3.example.com_2013-07-20_23-59-23.log
Performing root user operation for Oracle 12c

The following environment variables are set as:
    ORACLE_OWNER= grid
    ORACLE_HOME=  /u01/app/12.1.0.1/grid
   Copying dbhome to /usr/local/bin ...
   Copying oraenv to /usr/local/bin ...
   Copying coraenv to /usr/local/bin ...

Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Relinking oracle with rac_on option
Using configuration parameter file: /u01/app/12.1.0.1/grid/crs/install/crsconfig_params
2013/07/20 23:59:36 CLSRSC-363: User ignored prerequisites during installation

OLR initialization - successful
2013/07/21 00:00:13 CLSRSC-330: Adding Clusterware entries to file 'oracle-ohasd.conf'

CRS-4133: Oracle High Availability Services has been stopped.
CRS-4123: Oracle High Availability Services has been started.
CRS-4133: Oracle High Availability Services has been stopped.
CRS-4123: Oracle High Availability Services has been started.
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac12node3'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'rac12node3'
CRS-2677: Stop of 'ora.drivers.acfs' on 'rac12node3' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac12node3' has completed
CRS-4133: Oracle High Availability Services has been stopped.
CRS-4123: Starting Oracle High Availability Services-managed resources
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac12node3'
CRS-2672: Attempting to start 'ora.evmd' on 'rac12node3'
CRS-2676: Start of 'ora.mdnsd' on 'rac12node3' succeeded
CRS-2676: Start of 'ora.evmd' on 'rac12node3' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac12node3'
CRS-2676: Start of 'ora.gpnpd' on 'rac12node3' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'rac12node3'
CRS-2676: Start of 'ora.gipcd' on 'rac12node3' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac12node3'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac12node3' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac12node3'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac12node3'
CRS-2676: Start of 'ora.diskmon' on 'rac12node3' succeeded
CRS-2789: Cannot stop resource 'ora.diskmon' as it is not running on server 'rac12node3'
CRS-2676: Start of 'ora.cssd' on 'rac12node3' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'rac12node3'
CRS-2672: Attempting to start 'ora.ctssd' on 'rac12node3'
CRS-2676: Start of 'ora.ctssd' on 'rac12node3' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'rac12node3' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac12node3'
CRS-2676: Start of 'ora.asm' on 'rac12node3' succeeded
CRS-2672: Attempting to start 'ora.storage' on 'rac12node3'
CRS-2676: Start of 'ora.storage' on 'rac12node3' succeeded
CRS-2672: Attempting to start 'ora.crf' on 'rac12node3'
CRS-2676: Start of 'ora.crf' on 'rac12node3' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'rac12node3'
CRS-2676: Start of 'ora.crsd' on 'rac12node3' succeeded
CRS-6017: Processing resource auto-start for servers: rac12node3
CRS-2672: Attempting to start 'ora.ASMNET1LSNR_ASM.lsnr' on 'rac12node3'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN2.lsnr' on 'rac12node1'
CRS-2672: Attempting to start 'ora.ons' on 'rac12node3'
CRS-2677: Stop of 'ora.LISTENER_SCAN2.lsnr' on 'rac12node1' succeeded
CRS-2673: Attempting to stop 'ora.scan2.vip' on 'rac12node1'
CRS-2677: Stop of 'ora.scan2.vip' on 'rac12node1' succeeded
CRS-2672: Attempting to start 'ora.scan2.vip' on 'rac12node3'
CRS-2676: Start of 'ora.scan2.vip' on 'rac12node3' succeeded
CRS-2676: Start of 'ora.ons' on 'rac12node3' succeeded
CRS-2672: Attempting to start 'ora.LISTENER_SCAN2.lsnr' on 'rac12node3'
CRS-2676: Start of 'ora.ASMNET1LSNR_ASM.lsnr' on 'rac12node3' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac12node3'
CRS-2676: Start of 'ora.LISTENER_SCAN2.lsnr' on 'rac12node3' succeeded
CRS-2676: Start of 'ora.asm' on 'rac12node3' succeeded
CRS-2672: Attempting to start 'ora.proxy_advm' on 'rac12node3'
CRS-2676: Start of 'ora.proxy_advm' on 'rac12node3' succeeded
CRS-2672: Attempting to start 'ora.DATA.ORAHOMEVOL.advm' on 'rac12node3'
CRS-5017: The resource action "ora.DATA.ORAHOMEVOL.advm start" encountered the following error:
CRS-5000: Expected resource ora.DATA.dg does not exist in agent process
. For details refer to "(:CLSN00107:)" in "/u01/app/12.1.0.1/grid/log/rac12node3/agent/crsd/orarootagent_root/orarootagent_root.log".
CRS-2676: Start of 'ora.DATA.ORAHOMEVOL.advm' on 'rac12node3' succeeded
CRS-2672: Attempting to start 'ora.data.orahomevol.acfs' on 'rac12node3'
CRS-2676: Start of 'ora.data.orahomevol.acfs' on 'rac12node3' succeeded
CRS-6016: Resource auto-start has completed for server rac12node3
CRS-2664: Resource 'ora.DATA.dg' is already running on 'rac12node2'
CRS-2664: Resource 'ora.OCR.dg' is already running on 'rac12node2'
CRS-2664: Resource 'ora.data.orahomevol.acfs' is already running on 'rac12node2'
CRS-2664: Resource 'ora.proxy_advm' is already running on 'rac12node1'
CRS-2664: Resource 'ora.proxy_advm' is already running on 'rac12node2'
CRS-2664: Resource 'ora.proxy_advm' is already running on 'rac12node3'
CRS-6024: Completed start of Oracle Cluster Ready Services-managed resources
CRS-4123: Oracle High Availability Services has been started.
2013/07/21 00:10:23 CLSRSC-343: Successfully started Oracle clusterware stack

clscfg: EXISTING configuration version 5 detected.
clscfg: version 5 is 12c Release 1.
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
2013/07/21 00:11:03 CLSRSC-325: Configure Oracle Grid Infrastructure for a Cluster ... succeeded

[root@rac12node3 ~]#

And it looks like I have a three-node cluster!

[root@rac12node3 ~]# crsctl check cluster -all
**************************************************************
rac12node1:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
rac12node2:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
rac12node3:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
[root@rac12node3 ~]#

So cool! The extension of the database homes isn’t shown here to spare you excessive boredom.
