Change SCAN post installation

A quick note to myself on how to change the SCAN from /etc/hosts to DNS. I have set up a small DNS server for a local interface on my dom0 to help me resolve problems around TAF and session failover in general. This is a test/dev/playground cluster; you should never be in this situation with anything used “for real”! Multiple IP addresses for a SCAN are needed since you no longer specify all the node VIPs in tnsnames.ora, and there is no FAILOVER=yes either! DNS round robin now provides this for you.

During the initial installation of my four node cluster I hard-coded the SCAN into /etc/hosts, resolving to 192.168.99.23. Now I want more SCAN listeners, so I need to enter a few more addresses into my bind9 config. The round robin DNS resolution for scanrac11gr2, my SCAN, is configured as follows in /var/lib/named/localdomain.zone:

openSUSE:~ # cat /var/lib/named/localdomain.zone
$TTL 1W
@               IN SOA  @   root (
                42              ; serial (d. adams)
                2D              ; refresh
                4H              ; retry
                6W              ; expiry
                1W )            ; minimum

                        IN NS   @
                        IN A    192.168.99.10

rac11gr2node1           IN A    192.168.99.15
rac11gr2node2           IN A    192.168.99.16
rac11gr2node3           IN A    192.168.99.17
rac11gr2node4           IN A    192.168.99.18

rac11gr2node1-vip       IN A    192.168.99.19
rac11gr2node2-vip       IN A    192.168.99.20
rac11gr2node3-vip       IN A    192.168.99.21
rac11gr2node4-vip       IN A    192.168.99.22

scanrac11gr2            IN A    192.168.99.23
scanrac11gr2            IN A    192.168.99.24
scanrac11gr2            IN A    192.168.99.25
...

The reverse mapping is defined this way:

openSUSE:~ # cat /var/lib/named/localdomain.reverse
$TTL 1W
@               IN SOA          openSUSE.   root.openSUSE. (
                42              ; serial (d. adams)
                2D              ; refresh
                4H              ; retry
                6W              ; expiry
                1W )            ; minimum

                IN NS           openSUSE.localdomain.
15              IN PTR          rac11gr2node1.localdomain.
16              IN PTR          rac11gr2node2.localdomain.
17              IN PTR          rac11gr2node3.localdomain.
18              IN PTR          rac11gr2node4.localdomain.

19              IN PTR          rac11gr2node1-vip.localdomain.
20              IN PTR          rac11gr2node2-vip.localdomain.
21              IN PTR          rac11gr2node3-vip.localdomain.
22              IN PTR          rac11gr2node4-vip.localdomain.

23              IN PTR          scanrac11gr2.localdomain.
24              IN PTR          scanrac11gr2.localdomain.
25              IN PTR          scanrac11gr2.localdomain.
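
Forward and reverse zones are easy to get out of step when adding addresses by hand. The following is a minimal Python sketch, not something from the original setup, that cross-checks the two files. It assumes the simple one-record-per-line layout shown above, a single 192.168.99.0/24 network and the localdomain domain; the function names (forward_records, reverse_records, missing_ptr) are made up for illustration.

```python
import re

# "name IN A a.b.c.d" and "octet IN PTR name.domain." in the layout used above.
A_RE = re.compile(r"^(\S+)\s+IN\s+A\s+(\d+\.\d+\.\d+\.\d+)\s*$")
PTR_RE = re.compile(r"^(\d+)\s+IN\s+PTR\s+(\S+)\.\s*$")


def forward_records(zone_text):
    """Return a set of (short_name, ip) pairs for every named A record."""
    pairs = set()
    for line in zone_text.splitlines():
        match = A_RE.match(line.strip())
        if match:
            pairs.add((match.group(1), match.group(2)))
    return pairs


def reverse_records(zone_text, network_prefix, domain):
    """Return a set of (short_name, ip) pairs for every PTR record."""
    pairs = set()
    suffix = "." + domain
    for line in zone_text.splitlines():
        match = PTR_RE.match(line.strip())
        if match:
            fqdn = match.group(2)
            # Strip the domain so names compare with the forward zone.
            name = fqdn[: -len(suffix)] if fqdn.endswith(suffix) else fqdn
            pairs.add((name, f"{network_prefix}.{match.group(1)}"))
    return pairs


def missing_ptr(forward_text, reverse_text,
                network_prefix="192.168.99", domain="localdomain"):
    """List the A records that have no matching PTR record."""
    forward = forward_records(forward_text)
    reverse = reverse_records(reverse_text, network_prefix, domain)
    return sorted(forward - reverse)
```

Feeding it the contents of localdomain.zone and localdomain.reverse should return an empty list when every A record has its PTR counterpart. Records without an owner name, such as the bare "IN A" line for the name server itself, are deliberately skipped.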

Now all I had to do was to remove the SCAN entry from /etc/hosts and add the following to /etc/resolv.conf:

nameserver  192.168.99.10
search  localdomain

A quick verification showed that this worked:

[oracle@rac11gr2node2 ~]$ nslookup scanrac11gr2
Server:        192.168.99.10
Address:    192.168.99.10#53

Name:    scanrac11gr2.localdomain
Address: 192.168.99.25
Name:    scanrac11gr2.localdomain
Address: 192.168.99.23
Name:    scanrac11gr2.localdomain
Address: 192.168.99.24
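
The same lookup can be scripted. Below is a minimal Python sketch that asks the system resolver for every IPv4 address behind a name. The function name resolve_all and the default port of 1521 are assumptions made for illustration, and it relies on /etc/resolv.conf already pointing at the DNS server as configured above.

```python
import socket


def resolve_all(name, port=1521):
    """Return the sorted, de-duplicated IPv4 addresses the resolver gives for name."""
    infos = socket.getaddrinfo(
        name, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4 the sockaddr is an (ip, port) tuple.
    return sorted({info[4][0] for info in infos})
```

Calling resolve_all("scanrac11gr2.localdomain") on a cluster node should list all the SCAN addresses if round robin is set up as shown. The result is sorted so it does not depend on the order the DNS server happens to hand the records out in.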

Wonderful – 3 IP addresses now. I then changed to the grid user who owns my Grid Infrastructure installation:

[oracle@rac11gr2node2 ~]$ su - grid
Password:
[grid@rac11gr2node2 ~]$ . oraenv
ORACLE_SID = [grid] ? +ASM2
The Oracle base for ORACLE_HOME=/u01/app/grid/product/11.2.0/crs is /u01/app/oracle

I stopped the current SCAN and SCAN listener:

[grid@rac11gr2node2 ~]$ srvctl stop scan_listener
[grid@rac11gr2node2 ~]$ srvctl stop scan

Actually, I was curious what the configuration was; I like a before/after look at things:

[grid@rac11gr2node2 ~]$ srvctl config scan
SCAN name: scanrac11gr2, Network: 1/192.168.99.0/255.255.255.0/eth0
SCAN VIP name: scan1, IP: /scanrac11gr2.localdomain/192.168.99.23
[grid@rac11gr2node2 ~]$ srvctl config scan_listener
SCAN Listener LISTENER_SCAN1 exists. Port: TCP:1521

So right, it used just 1 SCAN VIP and 1 SCAN listener, which is NOT GOOD from an HA point of view! I wanted to be sure the resources were really stopped before continuing:

[grid@rac11gr2node2 ~]$ crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
 ONLINE  ONLINE       rac11gr2node1                                
 ONLINE  ONLINE       rac11gr2node2                                
 ONLINE  ONLINE       rac11gr2node3                                
 ONLINE  ONLINE       rac11gr2node4                                
ora.LISTENER.lsnr
 ONLINE  ONLINE       rac11gr2node1                                
 ONLINE  ONLINE       rac11gr2node2                                
 ONLINE  ONLINE       rac11gr2node3                                
 ONLINE  ONLINE       rac11gr2node4                                
ora.OCRVOTE.dg
 ONLINE  ONLINE       rac11gr2node1                                
 ONLINE  ONLINE       rac11gr2node2                                
 ONLINE  ONLINE       rac11gr2node3                                
 ONLINE  ONLINE       rac11gr2node4                                
ora.asm
 ONLINE  ONLINE       rac11gr2node1            Started             
 ONLINE  ONLINE       rac11gr2node2                                
 ONLINE  ONLINE       rac11gr2node3                                
 ONLINE  ONLINE       rac11gr2node4                                
ora.eons
 ONLINE  ONLINE       rac11gr2node1                                
 ONLINE  ONLINE       rac11gr2node2                                
 ONLINE  ONLINE       rac11gr2node3                                
 ONLINE  ONLINE       rac11gr2node4                                
ora.gsd
 OFFLINE OFFLINE      rac11gr2node1                                
 OFFLINE OFFLINE      rac11gr2node2                                
 OFFLINE OFFLINE      rac11gr2node3                                
 OFFLINE OFFLINE      rac11gr2node4                                
ora.net1.network
 ONLINE  ONLINE       rac11gr2node1                                
 ONLINE  ONLINE       rac11gr2node2                                
 ONLINE  ONLINE       rac11gr2node3                                
 ONLINE  ONLINE       rac11gr2node4                                
ora.ons
 ONLINE  ONLINE       rac11gr2node1                                
 ONLINE  ONLINE       rac11gr2node2                                
 ONLINE  ONLINE       rac11gr2node3                                
 ONLINE  ONLINE       rac11gr2node4                                
ora.registry.acfs
 ONLINE  ONLINE       rac11gr2node1                                
 ONLINE  ONLINE       rac11gr2node2                                
 ONLINE  ONLINE       rac11gr2node3                                
 ONLINE  ONLINE       rac11gr2node4                                
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
 1        OFFLINE OFFLINE                                                   
ora.admindb.db
 1        ONLINE  ONLINE       rac11gr2node2                                
 2        OFFLINE OFFLINE                               Instance Shutdown   
ora.admindb.reporting.svc
 1        ONLINE  OFFLINE                                                   
 2        ONLINE  ONLINE       rac11gr2node2                                
ora.admindb.techserv.svc
 1        ONLINE  OFFLINE                                                   
 2        ONLINE  ONLINE       rac11gr2node2                                
ora.oc4j
 1        OFFLINE OFFLINE                                                   
ora.poldb.db
 1        ONLINE  ONLINE       rac11gr2node3                                
 2        ONLINE  INTERMEDIATE rac11gr2node4                                
ora.poldb.dev-wiki.svc
 1        ONLINE  ONLINE       rac11gr2node3                                
ora.polstdby.db
 1        ONLINE  INTERMEDIATE rac11gr2node3                                
 2        ONLINE  INTERMEDIATE rac11gr2node4                                
ora.rac11gr2node1.vip
 1        ONLINE  ONLINE       rac11gr2node1                                
ora.rac11gr2node2.vip
 1        ONLINE  ONLINE       rac11gr2node2                                
ora.rac11gr2node3.vip
 1        ONLINE  ONLINE       rac11gr2node3                                
ora.rac11gr2node4.vip
 1        ONLINE  ONLINE       rac11gr2node4                                
ora.scan1.vip
 1        OFFLINE OFFLINE                                                   
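
Picking two resources out of a listing this long by eye is error-prone. The following Python sketch parses the tabular output of crsctl status res -t and reports whether every SCAN-related resource is OFFLINE. It assumes the 11.2 layout shown above (a resource name line starting with "ora." followed by one line per instance or node) and treats any resource with "scan" in its name as SCAN-related; both function names are hypothetical.

```python
def resource_states(crsctl_text):
    """Map each resource name to a list of (target, state) tuples."""
    states = {}
    current = None
    for line in crsctl_text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0].startswith("ora."):
            # A new resource block begins.
            current = tokens[0]
            states[current] = []
        elif current is not None:
            # Cluster resources carry a leading instance number; drop it.
            if tokens[0].isdigit():
                tokens = tokens[1:]
            if len(tokens) >= 2 and tokens[0] in ("ONLINE", "OFFLINE"):
                states[current].append((tokens[0], tokens[1]))
    return states


def scan_resources_offline(crsctl_text):
    """True if SCAN resources exist and every one of them is in state OFFLINE."""
    scan = {
        name: rows
        for name, rows in resource_states(crsctl_text).items()
        if "scan" in name.lower()
    }
    return bool(scan) and all(
        bool(rows) and all(state == "OFFLINE" for _, state in rows)
        for rows in scan.values()
    )
```

The output would have to be captured first, for example by redirecting the crsctl command to a file and reading it back in. Header and separator lines are ignored because they neither start with "ora." nor begin with ONLINE or OFFLINE.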

SCAN and SCAN listener are indeed offline so let’s proceed. As with so many things you have to be root to do this.

[grid@rac11gr2node2 ~]$ su -
Password:
[root@rac11gr2node2 ~]# . oraenv
ORACLE_SID = [root] ? +ASM2
The Oracle base for ORACLE_HOME=/u01/app/grid/product/11.2.0/crs is /u01/app/oracle

The syntax to modify the SCAN is as follows:

[root@rac11gr2node2 ~]# srvctl modify scan -h

Modifies the SCAN name.

Usage: srvctl modify scan -n <scan_name>
 -n <scan_name>           Domain name qualified SCAN name
 -h                       Print usage

Simple enough, so let’s give it a go:

[root@rac11gr2node2 ~]# srvctl modify scan -n scanrac11gr2
[root@rac11gr2node2 ~]# srvctl config scan
SCAN name: scanrac11gr2, Network: 1/192.168.99.0/255.255.255.0/eth0
SCAN VIP name: scan1, IP: /scanrac11gr2.localdomain/192.168.99.24
SCAN VIP name: scan2, IP: /scanrac11gr2.localdomain/192.168.99.25
SCAN VIP name: scan3, IP: /scanrac11gr2.localdomain/192.168.99.23

Now that seemed to have worked! Updating the listener is trivial:

[root@rac11gr2node2 ~]# srvctl modify scan_listener -u
[root@rac11gr2node2 ~]# srvctl config scan_listener
SCAN Listener LISTENER_SCAN1 exists. Port: TCP:1521
SCAN Listener LISTENER_SCAN2 exists. Port: TCP:1521
SCAN Listener LISTENER_SCAN3 exists. Port: TCP:1521

All good, so let’s start the scan listener:

[root@rac11gr2node2 ~]# srvctl start scan_listener

Success here:

[root@rac11gr2node2 ~]# srvctl config scan
SCAN name: scanrac11gr2, Network: 1/192.168.99.0/255.255.255.0/eth0
SCAN VIP name: scan1, IP: /scanrac11gr2.localdomain/192.168.99.24
SCAN VIP name: scan2, IP: /scanrac11gr2.localdomain/192.168.99.25
SCAN VIP name: scan3, IP: /scanrac11gr2.localdomain/192.168.99.23
[root@rac11gr2node2 ~]# srvctl config scan_listener
SCAN Listener LISTENER_SCAN1 exists. Port: TCP:1521
SCAN Listener LISTENER_SCAN2 exists. Port: TCP:1521
SCAN Listener LISTENER_SCAN3 exists. Port: TCP:1521
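
For a scripted before/after comparison, the two srvctl outputs can be parsed and checked against the addresses DNS hands out. This is a minimal Python sketch under the assumption that the output format is exactly as shown above; the helper names are invented for illustration, and the "one listener per SCAN VIP" rule is simply what this cluster ended up with.

```python
import re

# Matches e.g. "SCAN VIP name: scan1, IP: /scanrac11gr2.localdomain/192.168.99.24"
VIP_RE = re.compile(r"^SCAN VIP name: (\w+), IP: /([^/]*)/(\d+\.\d+\.\d+\.\d+)\s*$")
# Matches e.g. "SCAN Listener LISTENER_SCAN1 exists. Port: TCP:1521"
LSNR_RE = re.compile(r"^SCAN Listener (\S+) exists\.")


def scan_vip_ips(config_scan_text):
    """Return the IP of every SCAN VIP listed by 'srvctl config scan'."""
    ips = []
    for line in config_scan_text.splitlines():
        match = VIP_RE.match(line.strip())
        if match:
            ips.append(match.group(3))
    return ips


def scan_listeners(config_listener_text):
    """Return the listener names listed by 'srvctl config scan_listener'."""
    names = []
    for line in config_listener_text.splitlines():
        match = LSNR_RE.match(line.strip())
        if match:
            names.append(match.group(1))
    return names


def scan_config_consistent(config_scan_text, config_listener_text, dns_ips):
    """True if the SCAN VIPs equal the DNS addresses and each VIP has a listener."""
    vips = scan_vip_ips(config_scan_text)
    listeners = scan_listeners(config_listener_text)
    return set(vips) == set(dns_ips) and len(listeners) == len(vips)
```

Passing in the two outputs above together with the three addresses from the zone file should yield True, whereas the single-VIP configuration from before the change would not.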

TAF still not working

Brilliant, and back to my TAF problems. Actually, they didn’t go away with this. When shutting down one of my 2 instances I got an ORA-12514 “TNS:listener does not currently know of service requested in connect descriptor”.

[oracle@rac11gr2node1 tns]$ tnsping reporting

TNS Ping Utility for Linux: Version 11.2.0.1.0 - Production on 04-MAY-2010 14:03:44

Copyright (c) 1997, 2009, Oracle. All rights reserved.

Used parameter files:

Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = scanrac11gr2.localdomain)
(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = reporting) (FAILOVER_MODE = 
(TYPE = SELECT)(METHOD = BASIC)(RETRIES = 180)(DELAY = 5))))
OK (10 msec)

However, whenever I tried to connect to the service name reporting using the SCAN, I got this:

[oracle@rac11gr2node1 tns]$ sqlplus martin/asdsa@reporting

SQL*Plus: Release 11.2.0.1.0 Production on Tue May 4 14:01:58 2010

Copyright (c) 1982, 2009, Oracle. All rights reserved.

ERROR:
ORA-12514: TNS:listener does not currently know of service requested in connect
descriptor

Enter user-name: 

Bugger! I restarted the listener and SCAN listeners to see if I could get rid of the problem. A client trace revealed that the SCAN listener was actually blocking connections. I rebooted my test cluster and the problem went away. Weird; I need to do more investigation.
