Building your own RAC system part III: Grid Infrastructure
Posted by Martin Bach on October 7, 2009
This is the third part of my “build your own RAC” article series, and this time we are about to install and discuss Grid Infrastructure.
After all the prerequisites are met, some of which are described in previous posts, it’s about time to begin with the installation of Grid Infrastructure. Like the artist Prince, Grid Infrastructure has gone by many names – Cluster Ready Services (10.1), Clusterware (10.2 & 11.1) and now finally Grid Infrastructure.
Download the necessary zipfile from OTN as always and unzip it somewhere you like, then execute runInstaller. You will notice that OUI in 11.2 has gone red, and it looks quite different from previous versions. I will attach screenshots of the installation process once I have figured out how that works with my blogging software. Since the Grid Infrastructure installation is quite unlike the previous Clusterware installation I thought I’d point out the differences.
My sample setup has to serve a number of purposes:
- I want to try adding a node to the cluster. Will it really be as simple as they say?
- I want to try RAC One Node – how good will omotion be?
- I am testing Grid Plug And Play.
So if you find anything strange such as all SCAN addresses on a single host, then it’s because I haven’t had time to extend the cluster.
If you thought you knew it all about public, private and virtual IP addresses, you are in for a bit of a revelation. Grid Infrastructure requires even more network addresses on top of those. During the installation, OUI will prompt you for a SCAN address and, depending on whether or not you use Grid Plug And Play, a final one for the Grid Naming Service (GNS). The concepts behind public, private and virtual IP addresses haven’t changed, so they won’t be repeated here.
SCAN – Single Client Access Name
Forget what you knew about client connections to the database – this changes with 11.2, not only for RAC but also for single instance if you intend to use ASM. Instead of connecting to an individual database listener, clients from now on connect to a SCAN address. The SCAN name is either manually defined in DNS (it should resolve to three IP addresses, regardless of the number of nodes in your cluster) or taken care of by GNS (see below). Oh, and yes, these are virtual addresses as well. SCAN addresses point to the whole cluster, not individual nodes, and should make (local) naming a lot easier. Like VIPs, SCANs need to be on the same subnet as the public IPs.
The previous 2 posts explain how to set up SCAN addresses in DNS.
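In case you don’t want to click back: the manual DNS variant boils down to three A records for one single name. The records below are only an illustrative sketch – the addresses are made up, and in my lab the SCAN actually comes from GNS instead:

```
; one SCAN name, three A records - DNS hands them out round robin
rac-scan.the-playground.de.   IN  A  192.168.1.110
rac-scan.the-playground.de.   IN  A  192.168.1.111
rac-scan.the-playground.de.   IN  A  192.168.1.112
```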
I couldn’t find a better way to explain it than the documentation, so this is quoted verbatim from section D.1.3.5 “About the SCAN” of the Oracle® Grid Infrastructure Installation Guide for Linux:
“The SCAN works by being able to resolve to multiple IP addresses reflecting multiple listeners in the cluster handling public client connections. When a client submits a request, the SCAN listener listening on a SCAN IP address and the SCAN port is contacted on a client’s behalf. Because all services on the cluster are registered with the SCAN listener, the SCAN listener replies with the address of the local listener on the least-loaded node where the service is currently being offered. Finally, the client establishes connection to the service through the listener on the node where service is offered. All of these actions take place transparently to the client without any explicit configuration required in the client.”
The SCAN name defaults to clustername-scan, so it’s easy to predict. If GNS is used, the SCAN name will be clustername-scan.GNS_domain; otherwise it defaults to clustername-scan.current_domain. I’m using GNS, so my SCAN name is rac-scan.rac.the-playground.de.
Once Grid Infrastructure is installed, how do you find out which IPs are allocated for your SCAN addresses? First of all, you see many more IPs allocated on your public interface:
[oracle@rac11gr2node1 ~]$ /sbin/ifconfig | grep -a1 eth0
eth0      Link encap:Ethernet  HWaddr 00:16:3E:42:62:33
          inet addr:192.168.1.90  Bcast:192.168.1.255  Mask:255.255.255.0
...
eth0:1    Link encap:Ethernet  HWaddr 00:16:3E:42:62:33
          inet addr:192.168.1.92  Bcast:192.168.1.255  Mask:255.255.255.0
...
eth0:2    Link encap:Ethernet  HWaddr 00:16:3E:42:62:33
          inet addr:192.168.1.251  Bcast:192.168.1.255  Mask:255.255.255.0
...
eth0:3    Link encap:Ethernet  HWaddr 00:16:3E:42:62:33
          inet addr:192.168.1.252  Bcast:192.168.1.255  Mask:255.255.255.0
...
eth0:4    Link encap:Ethernet  HWaddr 00:16:3E:42:62:33
          inet addr:192.168.1.253  Bcast:192.168.1.255  Mask:255.255.255.0
...
eth0:5    Link encap:Ethernet  HWaddr 00:16:3E:42:62:33
          inet addr:192.168.1.254  Bcast:192.168.1.255  Mask:255.255.255.0
You can use srvctl status scan to find out on which cluster node a SCAN is allocated:
[oracle@rac11gr2node1 ~]$ srvctl status scan
SCAN VIP scan1 is enabled
SCAN VIP scan1 is running on node rac11gr2node1
SCAN VIP scan2 is enabled
SCAN VIP scan2 is running on node rac11gr2node1
SCAN VIP scan3 is enabled
SCAN VIP scan3 is running on node rac11gr2node1
Similarly, the SCAN listeners can be queried as well:
[oracle@rac11gr2node1 ~]$ srvctl status scan_listener
SCAN Listener LISTENER_SCAN1 is enabled
SCAN listener LISTENER_SCAN1 is running on node rac11gr2node1
SCAN Listener LISTENER_SCAN2 is enabled
SCAN listener LISTENER_SCAN2 is running on node rac11gr2node1
SCAN Listener LISTENER_SCAN3 is enabled
SCAN listener LISTENER_SCAN3 is running on node rac11gr2node1
Here is proof that naming is indeed a lot easier. Below is the entry for my database, created by dbca:
ORCL =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac-scan.rac.the-playground.de)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = orcl.the-playground.de)
    )
  )
rac-scan.rac.the-playground.de will be resolved by DNS/GNS (see below) in a round robin fashion.
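As an aside, with the SCAN in place you don’t strictly need a tnsnames.ora entry at all – the same connection can be expressed as an EZConnect string. The user name below is made up for illustration; host and service names are the ones from the entry above:

```
sqlplus scott/tiger@//rac-scan.rac.the-playground.de:1521/orcl.the-playground.de
```

The //host:port/service_name part is the EZConnect equivalent of the whole DESCRIPTION block in the tnsnames.ora entry.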
GNS – Grid Naming Service virtual address
If you intend to use Grid Plug and Play, tick the box next to “use GNS” in OUI. GNS works in conjunction with DHCP and DNS: the name daemon (named) delegates a subdomain, in which the cluster nodes are administered, to GNS. Why would you want to do this? Oracle say this kind of setup is handy if you don’t want to involve your sys admin every time the cluster is extended – no additional entries have to be made in DNS or in the clients’ local naming. I don’t know about you, but extending a RAC system is not a daily task for me. You can check my previous post about a possible DNS setup for GPnP.
How can you check whether GNS is grabbing addresses from DHCP? Check the messages log file on the dhcpd server:
Oct  7 22:13:38 auxOEL5 dhcpd: DHCPDISCOVER from 00:00:00:00:00:00 via eth0
Oct  7 22:13:39 auxOEL5 dhcpd: DHCPOFFER on 192.168.1.251 to 00:00:00:00:00:00 via eth0
Oct  7 22:13:39 auxOEL5 dhcpd: Wrote 4 leases to leases file.
Oct  7 22:13:39 auxOEL5 dhcpd: DHCPREQUEST for 192.168.1.251 (192.168.1.82) from 00:00:00:00:00:00 via eth0
Oct  7 22:13:39 auxOEL5 dhcpd: DHCPACK on 192.168.1.251 to 00:00:00:00:00:00 via eth0
Oct  7 22:13:39 auxOEL5 dhcpd: DHCPDISCOVER from 00:00:00:00:00:00 via eth0
Oct  7 22:13:40 auxOEL5 dhcpd: DHCPOFFER on 192.168.1.252 to 00:00:00:00:00:00 via eth0
Oct  7 22:13:40 auxOEL5 dhcpd: DHCPREQUEST for 192.168.1.252 (192.168.1.82) from 00:00:00:00:00:00 via eth0
Oct  7 22:13:40 auxOEL5 dhcpd: DHCPACK on 192.168.1.252 to 00:00:00:00:00:00 via eth0
Oct  7 22:13:40 auxOEL5 dhcpd: DHCPDISCOVER from 00:00:00:00:00:00 via eth0
Oct  7 22:13:41 auxOEL5 dhcpd: DHCPOFFER on 192.168.1.253 to 00:00:00:00:00:00 via eth0
Oct  7 22:13:41 auxOEL5 dhcpd: DHCPREQUEST for 192.168.1.253 (192.168.1.82) from 00:00:00:00:00:00 via eth0
Oct  7 22:13:41 auxOEL5 dhcpd: DHCPACK on 192.168.1.253 to 00:00:00:00:00:00 via eth0
Oct  7 22:13:41 auxOEL5 dhcpd: DHCPDISCOVER from 00:00:00:00:00:00 via eth0
Oct  7 22:13:42 auxOEL5 dhcpd: DHCPOFFER on 192.168.1.254 to 00:00:00:00:00:00 via eth0
Oct  7 22:13:42 auxOEL5 dhcpd: DHCPREQUEST for 192.168.1.254 (192.168.1.82) from 00:00:00:00:00:00 via eth0
Oct  7 22:13:42 auxOEL5 dhcpd: DHCPACK on 192.168.1.254 to 00:00:00:00:00:00 via eth0
These are listed in the ifconfig output above – so it’s working.
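If you only care about which addresses the cluster actually got, a quick awk over the DHCPACK lines is enough. A sketch, run here against a saved sample of the log shown above – on the DHCP server itself you would point it at /var/log/messages instead:

```shell
# Save a couple of sample DHCPACK lines as they appear in the messages file
cat > /tmp/dhcpd_sample.log <<'EOF'
Oct  7 22:13:39 auxOEL5 dhcpd: DHCPACK on 192.168.1.251 to 00:00:00:00:00:00 via eth0
Oct  7 22:13:40 auxOEL5 dhcpd: DHCPACK on 192.168.1.252 to 00:00:00:00:00:00 via eth0
EOF

# Field 8 of a DHCPACK line is the address the server handed out
leased=$(awk '/DHCPACK/ { print $8 }' /tmp/dhcpd_sample.log)
echo "$leased"
```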
In OUI, you define the subdomain and the GNS virtual IP address. In my example, my DNS server is authoritative for the “the-playground.de” zone but delegates control of the subdomain “rac.the-playground.de” to GNS. The GNS virtual IP has to be on the same subnet as the cluster’s public IPs and needs to be registered in DNS. As with all virtual IPs, it must not yet be assigned to any host.
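For reference, the delegation on the DNS side is just a handful of zone file lines. This is an illustrative sketch using my lab names, with 192.168.1.59 standing in for the GNS virtual IP – substitute your own:

```
; in the the-playground.de zone: hand the cluster subdomain over to GNS
rac.the-playground.de.      IN  NS  rac-gns.the-playground.de.
; glue record: the GNS virtual IP, registered in DNS up front
rac-gns.the-playground.de.  IN  A   192.168.1.59
```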
With GNS, you hand over control to the cluster without getting the sys admins involved. It comes at a cost though: it’s not recommended to change host names. I also expect it’ll be more than tricky to change that subdomain within GNS.
Install Grid Infrastructure
After all that theoretical stuff, it’s about time to install the software. Do this as always by starting runInstaller. You will notice that there is no longer a screen to specify voting disk and OCR on raw devices – they default to ASM now. Fill in the values as appropriate and install the software. In my case I limited myself to one node and made use of GNS; voting disk and OCR go into ASM. The root.sh script has become quite massive! I consider it too large to be posted in its entirety, so here are the “highlights”:
Running Oracle 11g root.sh script...
Finished running generic part of root.sh script.
Now product-specific root actions will be performed.
2009-10-05 21:48:34: Parsing the host name
2009-10-05 21:48:34: Checking for super user privileges
2009-10-05 21:48:34: User has super user privileges
Using configuration parameter file: /u01/crs/oracle/product/11.2.0/grid/crs/install/crsconfig_params
Creating trace directory
LOCAL ADD MODE
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
...
Adding daemon to inittab
CRS-4123: Oracle High Availability Services has been started.
ohasd is starting
...
ASM created and started successfully.
DiskGroup DATA created successfully.
clscfg: -install mode specified
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
CRS-2672: Attempting to start 'ora.crsd' on 'rac11gr2node1'
CRS-2676: Start of 'ora.crsd' on 'rac11gr2node1' succeeded
CRS-4256: Updating the profile
Successful addition of voting disk 65ed9d8df48a4feebfad4b179480f1a9.
Successfully replaced voting disk group with +DATA.
CRS-4256: Updating the profile
CRS-4266: Voting file(s) successfully replaced
##  STATE    File Universal Id                File Name        Disk group
--  -----    -----------------                ---------        ----------
 1. ONLINE   65ed9d8df48a4feebfad4b179480f1a9 (/dev/raw/raw1)  [DATA]
Located 1 voting disk(s).
rac11gr2node1     2009/10/05 21:54:46     /u01/crs/oracle/product/11.2.0/grid/cdata/rac11gr2node1/backup_20091005_215446.olr
Preparing packages for installation...
cvuqdisk-1.0.7-1
Configure Oracle Grid Infrastructure for a Cluster ... succeeded
Updating inventory properties for clusterware
Starting Oracle Universal Installer...
Checking swap space: must be greater than 500 MB.   Actual 511 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /u01/app/oraInventory
'UpdateNodeList' was successful.
Hmm, normally I then run crs_stat to get a list of resources (crsstat below is a small formatting wrapper around crs_stat, not a stock binary):
[oracle@rac11gr2node1 ~]$ crsstat
HA Resource                                    Target     State
-----------                                    ------     -----
ora.DATA.dg                                    ONLINE     ONLINE on rac11gr2node1
ora.LISTENER.lsnr                              ONLINE     ONLINE on rac11gr2node1
ora.LISTENER_SCAN1.lsnr                        ONLINE     ONLINE on rac11gr2node1
ora.LISTENER_SCAN2.lsnr                        ONLINE     ONLINE on rac11gr2node1
ora.LISTENER_SCAN3.lsnr                        ONLINE     ONLINE on rac11gr2node1
ora.asm                                        ONLINE     ONLINE on rac11gr2node1
ora.eons                                       ONLINE     ONLINE on rac11gr2node1
ora.gns                                        ONLINE     ONLINE on rac11gr2node1
ora.gns.vip                                    ONLINE     ONLINE on rac11gr2node1
ora.gsd                                        OFFLINE    OFFLINE
ora.net1.network                               ONLINE     ONLINE on rac11gr2node1
ora.oc4j                                       OFFLINE    OFFLINE
ora.ons                                        ONLINE     ONLINE on rac11gr2node1
ora.rac11gr2node1.ASM1.asm                     ONLINE     ONLINE on rac11gr2node1
ora.rac11gr2node1.LISTENER_RAC11GR2NODE1.lsnr  ONLINE     ONLINE on rac11gr2node1
ora.rac11gr2node1.gsd                          OFFLINE    OFFLINE
ora.rac11gr2node1.ons                          ONLINE     ONLINE on rac11gr2node1
ora.rac11gr2node1.vip                          ONLINE     ONLINE on rac11gr2node1
ora.registry.acfs                              ONLINE     ONLINE on rac11gr2node1
ora.scan1.vip                                  ONLINE     ONLINE on rac11gr2node1
ora.scan2.vip                                  ONLINE     ONLINE on rac11gr2node1
ora.scan3.vip                                  ONLINE     ONLINE on rac11gr2node1
Whoa – that’s a lot more than I was used to. Another article will detail these resources, later… So much to learn, so little time.
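One footnote on the listing above: crsstat is just a wrapper that tidies up the NAME=/TARGET=/STATE= stanzas crs_stat prints for each resource. A minimal sketch of such a wrapper could look like this (demonstrated on a canned stanza, since this box is the only one with crs_stat on the PATH):

```shell
# format_crs: turn crs_stat's key=value stanzas into one line per resource
format_crs() {
  awk -F= '
    $1 == "NAME"   { name = $2 }
    $1 == "TARGET" { target = $2 }
    $1 == "STATE"  { printf "%-45s %-8s %s\n", name, target, $2 }
  '
}

# On the cluster you would run: crs_stat | format_crs
# Demonstrated here on a canned crs_stat-style stanza:
format_crs <<'EOF'
NAME=ora.gns
TYPE=ora.gns.type
TARGET=ONLINE
STATE=ONLINE on rac11gr2node1
EOF
```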