Build your own stretched RAC

Finally time for a new series! With the arrival of the new patchset I thought it was about time to try to set up a virtual extended distance, or stretched, RAC. So, it's virtual, fair enough. It doesn't allow me to test things like the impact of latency on the inter-SAN communication, but it allowed me to test the general setup. Think of this series as a guide for after all the tedious work has been done and the SANs happily talk to each other. The example requires some understanding of how Xen virtualisation works, and it's tailored to openSuSE 11.2 as the dom0 or "host". I have tried Oracle VM in the past, but back then a domU (or virtual machine) could not mount an iSCSI target without a kernel panic and reboot; clearly not what I needed at the time. openSuSE has another advantage: it uses a recent kernel, not the three-year-old 2.6.18 you find in Enterprise distributions. Xen is also recent (openSuSE 11.3 even features Xen 4.0!), and so is libvirt.

The Setup

The general idea follows the design you find in the field, but with fewer cluster nodes. I am thinking of two nodes for the cluster and two iSCSI target providers. I wouldn't use iSCSI in the real world, but my lab isn't connected to an EVA or similar. A third site will provide quorum via an NFS-based voting disk.

Site A will consist of filer01 for the storage part, and edcnode1 as the RAC node. Site B will consist of filer02 and edcnode2. The iSCSI targets are going to be provided by openFiler domU installations, and the cluster nodes will run Oracle Enterprise Linux 5 update 5. To make it more realistic, site C will consist of another openFiler instance, filer03, to provide the NFS export for the third voting disk. Note that openFiler seems to support NFS v3 only at the time of this writing. All systems are 64bit.
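For orientation, this is roughly how a cluster node would attach the openFiler LUNs over the storage network. The portal addresses and the IQN are hypothetical placeholders; your openFiler installation will report its own target names.

```shell
# Discover the targets exported by both filers on the storage network
# (192.168.101.x addresses are assumed here, not taken from a real setup)
iscsiadm -m discovery -t sendtargets -p 192.168.101.10
iscsiadm -m discovery -t sendtargets -p 192.168.101.11

# Log in to a discovered target; the IQN below is a made-up example
iscsiadm -m node -T iqn.2010-06.com.openfiler:filer01.asm1 \
         -p 192.168.101.10 --login

# Make the sessions persistent across reboots
iscsiadm -m node -T iqn.2010-06.com.openfiler:filer01.asm1 \
         -p 192.168.101.10 --op update -n node.startup -v automatic
```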

The network connectivity will go through 3 virtual switches, all “host only” on my dom0.

  • Public network: 192.168.99/24
  • Private network: 192.168.100/24
  • Storage network: 192.168.101/24
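A host-only network on the dom0 is in essence just a Linux bridge with no physical NIC enslaved. As a sketch, the three switches could be created like this (bridge names are my assumption; openSuSE lets you define the same thing persistently in /etc/sysconfig/network):

```shell
# One bridge per virtual switch; no physical interface is added,
# so traffic stays on the host ("host only")
brctl addbr xenbr-pub    # public   192.168.99/24
brctl addbr xenbr-priv   # private  192.168.100/24
brctl addbr xenbr-stor   # storage  192.168.101/24

# Give the dom0 an address on each network and bring the bridges up
ip addr add 192.168.99.1/24  dev xenbr-pub  && ip link set xenbr-pub  up
ip addr add 192.168.100.1/24 dev xenbr-priv && ip link set xenbr-priv up
ip addr add 192.168.101.1/24 dev xenbr-stor && ip link set xenbr-stor up
```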

As in the real world, the private and storage networks have to be separated to prevent iSCSI packets from clashing with Cache Fusion traffic. I also increased the MTU for the private and storage networks to 9000, up from the default 1500. If you'd like to use jumbo frames, check whether your switch supports them.
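Raising the MTU is a one-liner per interface, but it only works end to end if every hop agrees, so it is worth verifying with a non-fragmenting ping. The interface names below are assumptions for this virtual setup:

```shell
# On the dom0: jumbo frames on the private and storage bridges
ip link set dev xenbr-priv mtu 9000
ip link set dev xenbr-stor mtu 9000

# Inside a domU: same for the corresponding guest NICs
ip link set dev eth1 mtu 9000   # private
ip link set dev eth2 mtu 9000   # storage

# Verify: 8972 bytes payload + 28 bytes ICMP/IP headers = 9000,
# -M do forbids fragmentation, so this fails if any hop is still at 1500
ping -M do -s 8972 -c 3 192.168.100.2
```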

Grid Infrastructure will use ASM to store OCR and voting disks, and the inter-SAN replication will also be performed by ASM in normal redundancy. I am planning on using preferred mirror read and intelligent data placement to see if that makes a difference.
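Preferred mirror read is set per ASM instance so that each site reads from its local copy of the extents. As a sketch of what I have in mind, assuming a disk group DATA with failure groups named SITEA and SITEB and ASM instances +ASM1 and +ASM2 (all names made up for illustration):

```shell
# Each ASM instance prefers reads from the failure group at its own site
sqlplus -S / as sysasm <<'EOF'
ALTER SYSTEM SET asm_preferred_read_failure_groups = 'DATA.SITEA' SID='+ASM1';
ALTER SYSTEM SET asm_preferred_read_failure_groups = 'DATA.SITEB' SID='+ASM2';
EOF
```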

Known limitations

This setup has some limitations:

  • You cannot test inter-site SAN connectivity problems
  • You cannot make use of udev for the ASM devices: a Xen domU doesn’t report anything back from /sbin/scsi_id, which makes the mapping to /dev/mapper impossible (maybe someone knows a workaround?)
  • Network interfaces are not bonded; you certainly would use bonded NICs in real life
  • No “real” fibre channel connectivity between the cluster nodes

So much for the introduction; I’ll post the setup step by step. The intended series will consist of these articles:

  1. Introduction to XEN on openSuSE 11.2 and dom0 setup
  2. Introduction to openFiler and its installation as a virtual machine
  3. Setting up the cluster nodes
  4. Installing Grid Infrastructure
  5. Adding third voting disk on NFS
  6. Installing RDBMS binaries
  7. Creating a database

That’s it for today, I hope I got you interested and following the series. It’s been real fun doing it; now it’s about writing it all up.


4 thoughts on “Build your own stretched RAC”

  1. Jakub Wartak


    actually “It doesn’t allow me to test things like the impact of latency on the inter-SAN communication, but it allowed me to test the general setup.” is not true ;) You can use Linux on Xen dom0 to simulate that using “netem” queuing.
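[To illustrate Jakub’s point: netem can inject latency on the dom0 side of a domU’s network interface. The interface name below is hypothetical; `ip link` on the dom0 shows the actual vif name for each guest.]

```shell
# On the dom0: add 5 ms delay (+/- 1 ms jitter) to the storage-side
# vif of one filer, simulating inter-site SAN latency
tc qdisc add dev vif2.2 root netem delay 5ms 1ms

# Inspect the queueing discipline currently in place
tc qdisc show dev vif2.2

# Remove the artificial delay again
tc qdisc del dev vif2.2 root
```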

    I’ve already been there, see page 3.

    Or as before OTN reorg,

    Enjoy! :)

  2. Pingback: Extended Distance RAC Test Cluster « Dave Burnham's Rambling Database Technology Blog

  3. Gil Standen

    I use this udev rule for Xen ASM vbd devices:

    [root@i-6-60211-VM rules.d]# udevinfo -a -p /block/xvdb | grep nodename
    [root@i-6-60211-VM rules.d]# udevinfo -a -p /block/xvdc | grep nodename

    [root@i-6-60211-VM rules.d]# cat /etc/udev/rules.d/99-oracle-asmdevices.rules

    SYSFS{nodename}=="device/vbd/51728", NAME="asmdisk1", ACTION=="add|change", OWNER="grid", GROUP="asmadmin", MODE="0660"
    SYSFS{nodename}=="device/vbd/51744", NAME="asmdisk2", ACTION=="add|change", OWNER="grid", GROUP="asmadmin", MODE="0660"

    [root@i-6-60211-VM rules.d]# ls -lrt /dev/xvdb1
    brwx------ 1 root root 202, 17 Nov 21 10:17 /dev/xvdb1
    [root@i-6-60211-VM rules.d]# ls -lrt /dev/xvdc1
    brwx------ 1 root root 202, 33 Nov 21 10:17 /dev/xvdc1
    [root@i-6-60211-VM rules.d]#

    This produces the following usable device nodes:

    [root@i-6-60211-VM rules.d]# ls -lrt /dev/asm*

    brw-rw---- 1 grid asmadmin 202, 17 Nov 21 10:50 /dev/asmdisk1
    brw-rw---- 1 grid asmadmin 202, 33 Nov 21 10:50 /dev/asmdisk2

Comments are closed.