Build your own 11.2 RAC system-part II: DNS & DHCP for GNS

UPDATE 221103: Oracle 11.2 is effectively out of support, this article is now archived and shouldn’t be referred to.

Wow, I managed to cram 4 acronyms into the subject line, not bad for a Friday! The reason for this post is simple: I want to get Grid Naming Service (GNS) up and running on my 11.2 RAC system; after all, it’s one of the major new features in 11.2. This required me to get acquainted with administering a Domain Name System (DNS) server. In addition to DNS I also had to set up a system serving IP addresses using the Dynamic Host Configuration Protocol (DHCP). Getting GNS to work with RAC is purely optional though, and the configuration files shown below are for lab use only!

To use GNS with RAC 11.2 (part of Grid Plug And Play) we need DHCP and DNS, so let’s tackle them one by one in the lab. This article is based on Oracle Linux 5.2 and Oracle Real Application Clusters 11.2. A dedicated VM was created to serve DHCP and DNS requests; its IP address is 192.168.1.82.

DHCP

DHCP is another one of those services the Linux sysadmin normally provides. There has to be a package which provides the service, and sure enough it is called dhcp. The DHCP client is normally installed already, which makes it easy to avoid confusion.

Let’s start by installing the dhcp server (your version is most likely newer than the base release)

# rpm -ihv dhcp-3.0.5-13.el5

Documentation is a great way to get started, so I looked up a sample configuration file for my network. GNS requires DHCP to assign IP addresses on the public network, so I came up with the following configuration in /etc/dhcpd.conf:

ddns-update-style interim; 
ignore client-updates;

subnet 192.168.1.0 netmask 255.255.255.0 {
option subnet-mask 255.255.255.0;
option domain-name "the-playground.de";
range 192.168.1.100 192.168.1.254;
default-lease-time 21600;
max-lease-time 43200;
}

A simple service dhcpd start as root started the service. Easy enough, now for DNS.
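
One thing worth double-checking at this point: the pool in the range statement must not contain any address that is assigned statically in this lab, i.e. the DNS server (192.168.1.82), the two RAC nodes (192.168.1.90 and .91) and the GNS VIP (192.168.1.92). A minimal sketch of such a check follows; the function name in_pool is my own invention and not part of the dhcp package, and it only compares the last octet, which is good enough for a /24 network like this one.

```shell
#!/bin/sh
# Hypothetical helper (not part of dhcp): succeeds if the last octet of an
# IPv4 address lies inside the DHCP pool. Assumes a /24 network.
# $1 = address to test, $2 = first pool address, $3 = last pool address
in_pool() {
    octet=${1##*.}                 # strip everything up to the last dot
    [ "$octet" -ge "${2##*.}" ] && [ "$octet" -le "${3##*.}" ]
}

# Static addresses used in this article, checked against the pool
for ip in 192.168.1.82 192.168.1.90 192.168.1.91 192.168.1.92; do
    if in_pool "$ip" 192.168.1.100 192.168.1.254; then
        echo "WARNING: $ip is inside the DHCP pool"
    else
        echo "ok: $ip is outside the DHCP pool"
    fi
done
```

With the addresses from this article all four lines report "ok", since the pool only starts at 192.168.1.100.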

Configuring DNS for GNS

Please refer to part I of the series for an (albeit very brief) introduction to DNS. The difference now is that we have to delegate the subdomain to GNS, which involves a little bit of work on the DNS server. Also, we need to put a virtual IP address for GNS into DNS, and that address must not be in use by any other host; this is the so-called GNS glue record.

Bind9 as shipped with RHEL 5.2 is again in use. Remember that the DNS server resides on auxOEL5.the-playground.de with an IP address of 192.168.1.82. I’m using 192.168.1.92 as the GNS glue record, and the subdomain used for GNS is rac.the-playground.de. A final reminder that this isn’t a production-grade configuration: a lot of security features normally found in real DNS servers are left out of the configuration to keep it simple. Let’s start again by installing bind:

# rpm -ihv bind-9.3.4-6.P1.el5.x86_64.rpm

With the software and its dependencies in place the next step is to create the configuration. This is a rather complex topic and many books have been written about bind 9. Getting into every detail of the configuration is out of scope for this article, however; please refer to the documentation if you would like to understand what each option in the following files implies.

File: /etc/named.conf

options
{
	/* make named use port 53 for the source of all queries, to allow
         * firewalls to block all ports except 53:
         */
	query-source    port 53;	
	query-source-v6 port 53;
	
	// Put files that named is allowed to write in the data/ directory:
	directory "/var/named"; // the default
	dump-file 		"data/cache_dump.db";
        statistics-file 	"data/named_stats.txt";
        memstatistics-file 	"data/named_mem_stats.txt";

};
logging 
{
/*      If you want to enable debugging, eg. using the 'rndc trace' command,
 *      named will try to write the 'named.run' file in the $directory (/var/named).
 *      By default, SELinux policy does not allow named to modify the /var/named directory,
 *      so put the default debug log file in data/ :
 */
        channel default_debug {
                file "data/named.run";
                severity dynamic;
        };	
};

view "internal"
{
/* This view will contain zones you want to serve only to "internal" clients
   that connect via your directly attached LAN interfaces - "localnets" .
 */
	match-clients		{ localnets; };
	match-destinations	{ localnets; };
	recursion yes;
	// all views must contain the root hints zone:
	include "/etc/named.root.hints";


	zone "the-playground.de" IN {
 		type master;
		file "the-playground.zone";
		allow-transfer { 192.168.1.92; };
	};

	zone "1.168.192.in-addr.arpa" IN {
		type master;
		file "the-playground.reverse";
	};
};

File: /var/named/the-playground.zone

$TTL 86400
$ORIGIN the-playground.de.
@ 1D IN SOA auxOEL5.the-playground.de. hostmaster.the-playground.de. (
 2002022401 ; serial
 3H ; refresh
 15 ; retry
 1w ; expire
 3h ; minimum
)
; main domain name servers
 IN NS auxOEL5.the-playground.de.
auxOEL5 IN A 192.168.1.82
; A record for mail server above
mail IN A 192.168.1.82
 
rac11gr2node1 IN A 192.168.1.90
rac11gr2node2 IN A 192.168.1.91
 
; sub-domain definitions
$ORIGIN rac.the-playground.de.
@ IN NS gns.rac.the-playground.de.
; sub-domain address records for name server only - glue record
gns IN A 192.168.1.92 ; 'glue' record

File: /var/named/the-playground.reverse

$TTL 86400 ; 24 hours could have been written as 24h or 1d
@ 1D IN SOA the-playground.de. hostmaster.the-playground.de. (
 2002022401 ; serial
 3H ; refresh
 15 ; retry
 1w ; expire
 3h ; minimum
 )
 IN NS auxOEL5.the-playground.de.
82 IN PTR auxOEL5.the-playground.de.
90 IN PTR rac11gr2node1.the-playground.de.
91 IN PTR rac11gr2node2.the-playground.de.

That’s it! Use service named restart to activate your changes.
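
Before restarting it can save some head-scratching to let bind check its own files first. The following is only a sketch under a few assumptions: the named-checkconf and named-checkzone utilities that come with bind 9 are installed, the files live where this article put them, and the wrapper function check_bind is simply my own naming.

```shell
#!/bin/sh
# Hypothetical wrapper around the bind 9 checking tools.
# $1 = named.conf, $2 = zone name, $3 = zone file
check_bind() {
    if ! command -v named-checkconf >/dev/null 2>&1 ||
       ! command -v named-checkzone >/dev/null 2>&1; then
        echo "bind checking tools not found, skipping"
        return 0
    fi
    named-checkconf "$1" || return 1         # syntax check of named.conf
    named-checkzone "$2" "$3" || return 1    # loads the zone and reports errors
    echo "zone $2 looks sane"
}

# Run as root on the DNS server; paths as used in this article
if [ -r /etc/named.conf ]; then
    check_bind /etc/named.conf the-playground.de /var/named/the-playground.zone
    check_bind /etc/named.conf 1.168.192.in-addr.arpa /var/named/the-playground.reverse
fi
```

If either tool complains, fix the file it names before running service named restart; a zone that fails to load is simply not served.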

Validating the configuration – testing the clients

So far so good. Let’s check that name resolution via DNS works on RAC node1 and node2. The debugging tools are dig and nslookup. My configuration uses the following settings on the client.

File: /etc/resolv.conf

nameserver        192.168.1.82
search            the-playground.de

options attempts:2
options timeout:1
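
The resolver works through the nameserver entries in the order they are listed, so the lab DNS server has to come first (or be the only one). A DHCP client running on the node may regenerate the file and put a different server on top. The sketch below checks for that; the function name first_nameserver is hypothetical and the script assumes the usual resolv.conf layout.

```shell
#!/bin/sh
# Hypothetical helper: print the first nameserver listed in a resolv.conf file.
# $1 = path to the file (defaults to /etc/resolv.conf)
first_nameserver() {
    awk '$1 == "nameserver" { print $2; exit }' "${1:-/etc/resolv.conf}"
}

expected=192.168.1.82     # the DNS server used in this article
actual=$(first_nameserver 2>/dev/null)
if [ "$actual" = "$expected" ]; then
    echo "ok: queries go to $expected first"
else
    echo "WARNING: first nameserver is '$actual', expected $expected"
fi
```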

File: /etc/nsswitch.conf

...
group: files

#hosts:  db files nisplus nis dns
hosts:   files dns

That should be enough for the moment, if your name service caching daemon is active it might be a good idea to stop it now. First we try forward resolution:

[oracle@rac11gr2node1 ~]$ nslookup rac11gr2node1.the-playground.de
Server:          192.168.1.82
Address:         192.168.1.82#53

Name:    rac11gr2node1.the-playground.de
Address: 192.168.1.90

Now let’s try the reverse lookup:

[oracle@rac11gr2node1 ~]$ nslookup 192.168.1.90
Server:           192.168.1.82
Address:          192.168.1.82#53

90.1.168.192.in-addr.arpa  name = rac11gr2node1.the-playground.de.

[oracle@rac11gr2node1 ~]$
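
The name in the reverse answer is simply the address with its octets reversed and in-addr.arpa appended, which is also why the reverse zone is called 1.168.192.in-addr.arpa and the records in the-playground.reverse only carry the last octet. As a small illustration (the function name reverse_name is my own):

```shell
#!/bin/sh
# Hypothetical helper: build the in-addr.arpa name for an IPv4 address.
reverse_name() {
    echo "$1" | awk -F. '{ print $4 "." $3 "." $2 "." $1 ".in-addr.arpa" }'
}

reverse_name 192.168.1.90    # prints 90.1.168.192.in-addr.arpa
```

This is the same name that dig -x 192.168.1.90 asks for when it requests the PTR record.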

With the subdomain delegation, who answers my questions for domain “the-playground.de”?

[oracle@rac11gr2node1 ~]$ dig the-playground.de

; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5 <<>> the-playground.de
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 29033
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;the-playground.de. IN A

;; AUTHORITY SECTION:
the-playground.de. 10800 IN SOA auxOEL5.the-playground.de. hostmaster.the-playground.de. 2002022401 10800 15 604800 10800

;; Query time: 2 msec
;; SERVER: 192.168.1.82#53(192.168.1.82)
;; WHEN: Wed Oct 7 13:48:05 2009
;; MSG SIZE rcvd: 90

The important bit is the “AUTHORITY SECTION”. So, it’s working! I’ll write up my troubleshooting tips for DNS (I learned the hard way) in a future article.
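
What this output does not show is the delegation itself. One way to look at it before GNS is running is to ask the parent server for the NS records of the subdomain without recursion. With the zone file from this article loaded it should answer with a referral that names gns.rac.the-playground.de in the AUTHORITY SECTION and carries the glue address in the ADDITIONAL SECTION. Treat the following as a sketch only: the function delegated_to is my own naming and it merely parses dig's usual text output.

```shell
#!/bin/sh
# Hypothetical helper: read dig output on stdin and print the name servers
# that appear as NS records in the AUTHORITY SECTION (the referral targets).
delegated_to() {
    awk '/^;; AUTHORITY SECTION:/ { in_auth = 1; next }
         /^$/                     { in_auth = 0 }
         in_auth && $4 == "NS"    { print $5 }'
}

# Ask the DNS server from this article directly, without recursion.
# +time and +tries keep the wait short if the server cannot be reached.
if command -v dig >/dev/null 2>&1; then
    dig @192.168.1.82 rac.the-playground.de NS +norecurse +time=1 +tries=1 | delegated_to
fi
```

An empty result means the server did not return a referral, in which case the delegation lines at the end of the zone file are the first place to look.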

We are ready to run runInstaller for the Grid Infrastructure now. Remember these points:

  • Public hostnames go into DNS, such as rac11gr2node1.the-playground.de
  • The SCAN is NOT entered in DNS; its addresses are handed out by DHCP and resolved by GNS. The SCAN name is in the subdomain, i.e. rac11gr2-scan.rac.the-playground.de
  • The GNS VIP must be in the subdomain delegation section of the zone file (the glue record)

Responses

  1. Hi,

    Very interesting and extra ordinary. Nice article and discussion.
    Thanks a lot for same.

    Regards,
    Gitesh Trivedi
    http://www.dbametrix.com

  2. I am confused about the DNS configuration. You mentioned a dedicated DNS server on 192.168.30.5. Is that the server on which you configured these files? What server do you have at 192.168.1.82?

  3. Ayo,

    you actually found an error in the post, a leftover from a previous configuration. The nameserver “aux.the-playground.de” should be listed as 192.168.1.82.

    Thanks for your input,

    Martin

  4. Hi,

    I tried your article “Build your own 11.2 RAC system-part II: DNS & DHCP for GNS” and while installing 11gR2 Grid Infrastructure I am running into this error (while executing root.sh on the primary node). I am not using /etc/hosts to resolve the SCAN. I would appreciate it if you could point out where I am going wrong.

    –error–
    CRS-2676: Start of ‘ora.CRS.dg’ on ‘vmnode1’ succeeded
    PRCR-1079 : Failed to start resource ora.gns
    CRS-2674: Start of ‘ora.gns’ on ‘vmnode1’ failed
    CRS-2632: There are no more servers to try to place resource ‘ora.gns’ on that would satisfy its placement policy
    start gns … failed
    Configure Oracle Grid Infrastructure for a Cluster … failed
    –error–

    Thank you
    Ragesh

    1. Ragesh,

      did you run cluvfy before you started the installation? It seems you are missing out on prerequisites. I suggest you double check the Grid Infrastructure installation guide for Linux and ensure that your DNS server really hands off DNS requests to GNS.

      Hope this helps,

      Martin

      1. Hi Martin,

        Thanks for a prompt response.
        But how can I check if subdomain delegation is working before the start of the Grid Infrastructure install? I executed “./runcluvfy.sh stage -pre crsinst -n vmnode1,vmnode2 -fixup” before the start of the install and all checks were successful except a “Warning:Could not find a suitable set of interfaces for VIPs”.

        Thanks
        Ragesh

      2. This would go too far to discuss here, you might want to consult the DNS server’s manual (bind in my case), especially the section on diagnosing and troubleshooting.

  5. Martin,
    Merry Christmas, if you celebrate Christmas, and happy holidays; I am writing this to you on Christmas Day, with nothing else to do. There are a few questions I would like to ask you, please. One, when you say in your article “I also created a dedicated server for DNS and bind, called “auxOEL5.the-playground.de” on 192.168.1.82”, is this a separate physical server outside of your RAC nodes, explicitly for DNS? Second, I have two physical servers with OEL5 update 5 installed on them, with all the HW requirements met, the run of “./runcluvfy.sh stage -pre crsinst -n host01,host02 -fixup -verbose” was completely successful, and I followed this article step by step, and yet my nslookup fails. Why? Third, in your article, you have “rac11gr2node1/node2”. Are these the hostnames of your physical RAC nodes?

    The only modification I have made in my setup for DNS is that my domain is called “localdomain” as opposed to your “the-playground.de”. And I named my servers by their actual hostnames, host01 and host02. Why is it that I cannot get “nslookup” to work? This is pretty much a showstopper for me and I appreciate your help. Here is my “nslookup” search:

    [root@host01 tmp]# nslookup host01.localdomain
    Server: 76.14.0.8
    Address: 76.14.0.8#53

    ** server can’t find host01.localdomain: NXDOMAIN

    As you can see, the DNS server is the one from my ISP and not the one with 192.168.1.82, even though I put its entry in the /etc/resolv.conf file like so:
    ; generated by /sbin/dhclient-script
    search astound.net
    nameserver 76.14.0.8
    nameserver 76.14.0.9
    nameserver 192.168.1.82
    search localodmain

    And here is the result of “dig”

    [root@host01 tmp]# dig localdomain.

    ; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5_4.2 <<>> localdomain.
    ;; global options: printcmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 24230
    ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

    ;; QUESTION SECTION:
    ;localdomain. IN A

    ;; AUTHORITY SECTION:
    localdomain. 86400 IN SOA localhost.localdomain. root.localdomain. 42 10800 900 604800 86400

    ;; Query time: 15 msec
    ;; SERVER: 76.14.0.8#53(76.14.0.8)
    ;; WHEN: Sat Dec 25 13:57:35 2010
    ;; MSG SIZE rcvd: 80

    Please let me know.
    thanks,
    Kabbo

    1. Hi Kabbo,

      Thanks for your Christmas wishes, well received. I am UK based, which means I’m on holidays until Wednesday this week :) Anyway, I’ll try and answer your questions, see below:

      1) The auxOEL5 server is a dedicated VM, and DNS is not running on any of the RAC nodes, which wouldn’t be a good idea anyway in my opinion.

      2) Your nslookup may use the wrong DNS server. Have a look at the man page for nslookup; it has an interactive mode where you can set the server to the one you are using. Your RAC nodes have to list your own name server in /etc/resolv.conf, not the ISP’s.

      And finally, my hosts rac11gr2node{1,2} are virtual machines.

      Hope this helps,

      Martin

  6. Hi Martin,

    It certainly helps. I completely agree with you not to have one of the RAC nodes serve as the DNS server as well, which would be very impractical in a production environment. However, this is my home setup and I just wanted to get it done so that I could start practicing for my next Oracle certification, Oracle OCM. So, having explained that, will I still be able to use one of my RAC nodes as a local DNS server as well? If I can use it, do I need to configure “DNS for GNS” on the second RAC node as well?

    thanks for your prompt response!
    Kabbo

    1. Hi Kabbo,

      well, as you can imagine, running a DNS server on one of your cluster nodes will defeat the purpose of the cluster once that node is down: it takes the DNS server with it. Name resolution will fail and no one can connect to the cluster. If you really have to, make the bind server highly available; for pointers see either chapter 8 of Pro Oracle Database 11g RAC on Linux or chapter 5 of the Clusterware Administration Guide.

      In your case I’d use /etc/hosts for name resolution – easier to set up and works as well. Just not supported by Oracle, but you wouldn’t mind on your playground environment. Oh, and GNS doesn’t work with this setup, obviously.

      Hope this helps,

      Martin

  7. Simply awesome and straight forward, as always
    Keep up the good work
    Farooq

  8. Quick question, though not directly related to the post but still you might be able to help.

    The subdomain delegation to GNS works fine and the SCAN IP addresses respond in a round-robin fashion. But for some reason I am unable to ping the GNS by its alias (pinging the IP works fine). The last line in the file “/var/named/the-playground.zone” has been glued to point to GNS, but the GNS itself does not respond to the alias.

    One way that works is to add the gns entry to /etc/hosts file, but i want that to happen through the dns.

    Any clue on how to accomplish this?

    Cheers,
    Farooq

  9. Hi Martin, firstly I want to thank you for your blog;
    it is really excellent that you share your experience, RESPECT! I followed the instructions in this article (Build your own 11.2 RAC system-part II: DNS & DHCP for GNS) but I am facing an error
    with the DNS configuration:
    PRVF-4664 : Found inconsistent name resolution entries for SCAN name.
    So I want to ask you: is this the final configuration, and did your installation complete successfully?
    ./thanks

    https://forums.oracle.com/forums/thread.jspa?threadID=2320951&messageID=10033453#10033453

    1. I have just done another installation using my instructions and it worked.

      Do you have scan related information in your hosts file? Is the nsswitch file consistent across nodes? If yes, and you still have the problem please run cluvfy and post the relevant output here

      1. Thanks Martin for the reply. I already solved this problem; I don’t know why, but OUI didn’t
        pick up my new DNS parameters and I was forced to restart OUI instead of retrying with the cluster verification utility.

        ./thanks

  10. Hi Martin,

    Thank you for all of these nice articles. I wanted to understand this concept correctly since my first try was not successful. First, the hardware that I have at hand. Please don’t advise on Virtualization.

    1- I have 2 physical servers with OEL with update 5 installed. Each machine has 8GB of RAM and 500 GB of disk space. The disks in each server are SATA, two in each one.
    2 – Two NIC cards in each server, one for the public network and one for the private interconnect
    3- User equivalence and ssh are set up between the two and they are communicating with each other.
    4 – This set up is at my house, not at work, so I have an ISP with cable modem. Then I use a 4 port Linksys Router along with 4 port Netgear switch.
    5 – I have one PC client set up to access these Linux machines via Putty

    I would like to setup Oracle 11g R2 RAC here, and here is my question. Basically, it boils down to the shared storage, and here is the question:

    How would I set up one of the servers not only as one of the RAC nodes, but also as the shared storage device? I know that this is not good practice in prod, but I just need it in my house for hands-on practice towards my certification in RAC and MAA.

    Do I need a third NIC card in each server for communicating to the shared storage? Please let me know.

    thanks,
    Kabbo

    1. Hi Kabbo,

      The only way I can see is to use iSCSI on one of the hosts. I have used tgtd for this successfully; it all depends on available CPU cycles. You know that if your storage is too slow you might get node evictions. Having the iSCSI targets on the same hosts as your cluster is not pretty, but good enough to get your feet wet. Definitely don’t do this in a real-life project!

      Hope this helps,

      Martin

  11. Hi Martin, thanks for the article, it is very useful, but I want to ask: is this configuration only for RHEL 5.4? I am trying it on OEL 6.2. I am able to dig my RAC nodes but not the subdomain.
    I get “connection timed out; no servers could be reached” with dig or nslookup.

    Please help.
    ./thanks

    1. I solved this problem, sorry :)
      However, I have questions; please take the time to answer.
      1. In the dhcp configuration I didn’t find the “option routers” parameter. Is this not required?
      2. In the dhcp configuration you specified “the-playground.de” for “option domain-name”, not the
      sub-domain “rac.the-playground.de”.

      Can you please explain the usage of these parameters from the RAC system point of view?

      Thanks in advance.

    2. Hi Tiran,

      yes it was some RHEL 5.x version, it might have been 5.3 given the time I wrote the article.

      Martin

      1. OK thanks.

  12. Hi Martin, Sorry to drag an old post, but I just wanted to say thanks. I was having problems installing GNS with 12c and it was the placement of the gns record in the DNS that was causing it. Your post helped me out.

    1. No need to apologize, I am glad this post helped!
