Build your own stretched RAC part III

On to the next part in the series. This time I am showing how I prepared the iSCSI openFiler “appliances” on my host. This is quite straightforward, once you know how it works :)

Setting up the openFiler appliance on the dom0

OpenFiler 2.3 has a special download option suitable for paravirtualised Xen hosts. Proceed by downloading the file from your favourite mirror. The file name I am using is “openfiler-2.3-x86_64.tar.gz”; you might have to pick a different one if you don’t want a 64-bit system.

All my domUs go to /var/lib/xen/images/vm-name, and so do the openFiler ones. I am not using LVM to present storage to the domUs, as my system came without free space I could have turned into a physical volume. Here are the steps to create the openFiler; remember to repeat them 3 times, once for each storage provider.

Begin with the first openFiler appliance. Whenever you see numbers in {}, the operation has to be repeated for each of the numbers in the curly braces.

# cd /var/lib/xen/images/
# mkdir filer0{1,2,3}
# cd filer01
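Brace expansion happens in the shell before the command runs, so a quick way to see what a {…} line will do is to echo it first (this assumes a bash-style shell):

```shell
# The shell expands {1,2,3} before mkdir ever runs:
echo mkdir filer0{1,2,3}
# prints: mkdir filer01 filer02 filer03
```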

Next create the virtual disks for the appliance. I use a 4G disk for the root file system, plus one 5G and two 10G disks. The 5G disk will later on be part of the OCR and voting files disk group, whereas the other two are going to be the local ASM disks. These steps are for filer01 and filer02, the iSCSI target providers.

# dd if=/dev/zero of=disk01 bs=1 count=0 seek=4G
0+0 records in
0+0 records out
0 bytes (0 B) copied, 1.3296e-05 s, 0.0 kB/s  

# dd if=/dev/zero of=disk02 bs=1 count=0 seek=5G
# dd if=/dev/zero of=disk03 bs=1 count=0 seek=10G
# dd if=/dev/zero of=disk04 bs=1 count=0 seek=10G
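The dd invocations above finish instantly because nothing is actually written: count=0 writes no data and seek just sets the file length, producing a sparse file. A quick way to see this:

```shell
# count=0 writes nothing; seek=4G only sets the file length, so the
# image is sparse: full apparent size, (almost) no blocks allocated.
dd if=/dev/zero of=disk01 bs=1 count=0 seek=4G
ls -lh disk01   # apparent size: 4.0G
du -h disk01    # allocated blocks: ~0
```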

For the NFS filer03, you only need two 4G disks, disk01 and disk02. For all filers, you also have to create a file system on the “root” volume:

# mkfs.ext3 disk01
mke2fs 1.41.9 (22-Aug-2009)
disk01 is not a block special device.
Proceed anyway? (y,n) y
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
262144 inodes, 1048576 blocks
52428 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=1073741824
32 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 21 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
openSUSE-112-64-minimal:/var/lib/xen/images/filer01 #
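Since this has to be repeated for every filer, it is worth knowing that mke2fs accepts -F to skip the “not a block special device” prompt. A small sketch on a throwaway sparse file:

```shell
# -F forces mke2fs to run on a regular file without prompting,
# -q silences the verbose output shown above.
dd if=/dev/zero of=disk01 bs=1 count=0 seek=4G
mkfs.ext3 -F -q disk01
e2label disk01 root
e2label disk01    # prints: root
```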

Label the root volume, then mount it as a loop device. Once mounted, copy the contents of the downloaded openFiler tarball into it as shown in this example:

# e2label disk01 root
# mkdir tmpmnt/
# mount -o loop disk01 tmpmnt/
# cd tmpmnt
# tar --gzip -xvf /m/downloads/openfiler-2.3-x86_64.tar.gz

With this done, we need to extract the kernel and the initial RAM disk for later use in the Xen config file. I have not experimented with pygrub for the openfiler appliances, someone with more knowledge may correct me here. In any case, this works for this demonstration:

# mkdir /m/xenkernels/openfiler
# cp -a /var/lib/xen/images/filer01/tmpmnt/boot /m/xenkernels/openfiler
# cd ..
# umount tmpmnt

Here are the files now stored inside the kernel directory on the dom0:

# ls -l /m/xenkernels/openfiler/
total 9276
-rw-r--r-- 1 root root  770924 May 30  2008
-rw-r--r-- 1 root root   32220 Jun 28  2008 config-
drwxr-xr-x 2 root root    4096 Jul  1  2008 grub
-rw-r--r-- 1 root root 1112062 Jul  1  2008 initrd-
-rw-r--r-- 1 root root 5986208 May 14 18:01 vmlinux
-rw-r--r-- 1 root root 1558259 Jun 28  2008 vmlinuz-

With this information at hand, we can construct a Xen configuration file such as the following:

# cat filer01.xml
<domain type='xen'>
  <!-- name, uuid, memory and vcpu elements go here; change name
       and uuid for each filer -->
  <os>
    <type>linux</type>
    <!-- kernel and initrd paths (copied to /m/xenkernels/openfiler) go here -->
    <cmdline>root=/dev/xvda1 ro</cmdline>
  </os>
  <clock offset='utc'/>
  <devices>
    <disk type='file' device='disk'>
      <driver name='file'/>
      <source file='/var/lib/xen/images/filer01/disk01'/>
      <target dev='xvda1' bus='xen'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='file'/>
      <source file='/var/lib/xen/images/filer01/disk02'/>
      <target dev='xvdb' bus='xen'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='file'/>
      <source file='/var/lib/xen/images/filer01/disk03'/>
      <target dev='xvdc' bus='xen'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='file'/>
      <source file='/var/lib/xen/images/filer01/disk04'/>
      <target dev='xvdd' bus='xen'/>
    </disk>
    <interface type='bridge'>
      <mac address='00:16:3e:38:75:88'/>
      <source bridge='br1'/>
      <script path='/etc/xen/scripts/vif-bridge'/>
    </interface>
    <interface type='bridge'>
      <mac address='00:16:3e:38:75:89'/>
      <source bridge='br3'/>
      <script path='/etc/xen/scripts/vif-bridge'/>
    </interface>
    <console type='pty' tty='/dev/pts/0'>
      <source path='/dev/pts/0'/>
      <target port='0'/>
    </console>
  </devices>
</domain>

In plain English, this verbose XML file describes the VM as a paravirtualised Linux system with 4 hard disks and 2 network interfaces. The MAC addresses must be static, otherwise you’ll end up with network problems each time you boot. The MAC also has to be unique across all currently started domUs! Change the UUID, name, paths to the disks (“source file”) and MAC addresses for filer02. The same applies for filer03, but this one only uses 2 disks (xvda and xvdb), so please remove the disk tags for disk03 and disk04.
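Xen has its own OUI, 00:16:3e, reserved for guest MACs, which is why the addresses above start with it. One way to generate a unique static address per virtual NIC is a quick sketch like this (assuming bash):

```shell
# Generate a random MAC in the Xen-reserved 00:16:3e range; run once
# per virtual NIC and paste the result into the domain XML.
printf '00:16:3e:%02x:%02x:%02x\n' \
    $((RANDOM % 256)) $((RANDOM % 256)) $((RANDOM % 256))
```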

Define the VM in xenstore and start it, while staying attached to the console:

# virsh define filer01.xml
# xm start filer01 -c

Repeat this for filer02.xml and filer03.xml in separate terminal sessions.

Eventually, you are going to be presented with the welcome screen:

 Welcome to Openfiler NAS/SAN Appliance, version 2.3

You do not appear to have networking. Please login to start networking.

Configuring the OpenFiler domU

Log in as root (which doesn’t have a password, you should change this now!) and correct the missing network information. We have 2 virtual NICs, eth0 for the public network, and eth1 for the storage network. As root, navigate to /etc/sysconfig/network-scripts/ and edit ifcfg-eth{0,1}. In our example, we need 2 static interfaces. For eth0 for example, the existing file has the following contents:

[root@localhost network-scripts]# vi ifcfg-eth0
# Device file installed by rBuilder

Change this to:

[root@localhost network-scripts]# cat ifcfg-eth0
# Device file installed by rBuilder

Similarly, change ifcfg-eth1 for address and restart the network:

[root@localhost network-scripts]# service network restart

After this, ifconfig should report the correct interfaces and you are ready to access the web console.

The network for filer02 uses for eth0 and for eth1. Similarly, filer03 uses for eth0 and for eth1.

Since all domUs are in the internal network, you have to set up some port forwarding rules. The easiest way to do this is in your $HOME/.ssh/config file. For my server, I set up the following options:

martin@linux-itgi:~> cat .ssh/config
Host *eq8
HostName eq8
User martin
Compression yes
# note the white space 
LocalForward 4460
LocalForward 4470
LocalForward 4480
LocalForward 5902

# other hosts
Host *
PasswordAuthentication yes
 FallBackToRsh no

I am forwarding the local ports 4460, 4470, 4480 on my PC to the openfiler appliances. This way, I can enter https://localhost:44{6,7,8}0 to access the web frontend for the openFiler appliance. This is needed, as you can’t really administer them otherwise. When using Firefox, you’ll get a warning about certificates; I have added security exceptions because I know the web server is not conducting a man-in-the-middle attack on me. You should always be careful adding unknown certificates to your browser in other cases.
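For reference, the LocalForward syntax is `LocalForward <local port> <remote host>:<remote port>`, and OpenFiler’s web frontend listens on https port 446. A hypothetical, filled-in version of the config above (the filer addresses here are made up, substitute your own storage-network IPs):

```
Host eq8
    HostName eq8
    User martin
    Compression yes
    # hypothetical filer addresses -- replace with your own
    LocalForward 4460 192.168.99.10:446
    LocalForward 4470 192.168.99.11:446
    LocalForward 4480 192.168.99.12:446
```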

Administering OpenFiler

NOTE: The following steps are for filer01 and filer02 only!

Once logged in as user “openfiler” (the default password is “password”), you should change that default password. Click on Accounts -> Admin Password and make the changes you like.

Next I recommend you verify the system setup. Click on System and review the settings. You should see the network configured correctly, and can change the hostname to filer0{1,2}.localdomain. Save your changes. Networking settings should be correct; if not, you can update them here.

Next we need to partition our block devices. Previously unknown to me, openFiler uses the “gpt” format to partition disks. Click on Volumes -> Block devices to see all the block devices. Since you are running a domU, you can’t see the root device /dev/xvda. For each device (xvd{b,c,d}) create one partition spanning the whole of the “disk”. You can do so by clicking on the device name. Scroll down to the “Create partition in /dev/xvdx” section and fill in the data. Click “create” to create the partition. Note that you can’t see the partitions in fdisk should you log in to the appliance as root.

Once the partitions are created, it’s time to create volumes to be exported as iSCSI targets. Still in “Volumes”, click on “Volume Groups”. I chose to create the following volume groups:

  • ASM_VG with member PVs xvdc1 and xvdd1
  • OCRVOTE_VG with member PV xvdb1

Once the volume groups are created, you should proceed by creating logical volumes within these. Click on “Add Volume” to access this screen. You have a drop-down menu to select your volume group. For OCRVOTE_VG I opted to create the following logical volumes (you have to set the type to iSCSI rather than XFS):

  • ocrvote01_lv, about 2.5G in size, type iSCSI
  • ocrvote02_lv, about 2.5G in size, type iSCSI

For volume group ASM_VG, I created these logical volumes:

  • asmdata01_lv, about 10G in size, type iSCSI
  • asmdata02_lv, about 10G in size, type iSCSI

We are almost there! The storage has been carved out of the pool of available storage, and what remains to be done is the definition of the iSCSI targets and ACLs. You can define very fine-grained access to iSCSI targets, and even for iSCSI discovery! This example tries to keep it simple and doesn’t use any CHAP authentication for iSCSI targets and discovery; in the real world you’d very much want to implement these security features though.

Preparing the iSCSI part

We are done for now on the Volumes tab. First, we need to enable the iSCSI target server. In “Services”, ensure that the “iSCSI target server” is enabled. If not, click on the link next to it. Before we can export any LUNs, we need to define who is eligible to mount them. In openFiler, this is configured via ACLs. Go to the “System” tab and scroll down to the “Network access configuration” section. Fill in the details of our cluster nodes here as shown below. These are the settings for edcnode1:

  • Name: edcnode1
  • Network/Host:
  • Netmask: (IMPORTANT: it has to be, NOT )
  • Type: share

The settings for edcnode2 are identical, except for the IP address. Remember, we are configuring the “STORAGE” network here! Click on “Update” to make the changes permanent. You are now ready to create the iSCSI targets, of which there will be 2: one for the OCR/Voting Disk, and another one for the ASM LUNs.

Back on the Volumes tab, click on “iSCSI targets”. You will be notified that no targets have been defined yet. You will have to define the following targets for filer01:


Leave the default settings, they will do for our example. You simply add the name to the “Target IQN” field and then click on “Add”. The targets currently don’t support any LUNs yet, something that needs addressing in this step.
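Target names follow the standard IQN convention, iqn.&lt;yyyy-mm&gt;.&lt;reversed domain&gt;:&lt;identifier&gt;. The names below are purely hypothetical examples of that format (pick your own scheme), with a grep that checks each one against the pattern:

```shell
# Hypothetical IQNs -- adjust the domain and identifiers to your own
# naming scheme; grep verifies the iqn.<yyyy-mm>.<domain>:<id> form.
for iqn in iqn.2006-01.com.openfiler:filer01.ocrvote \
           iqn.2006-01.com.openfiler:filer01.asm1 \
           iqn.2006-01.com.openfiler:filer01.asm2; do
    echo "$iqn" | grep -Eq '^iqn\.[0-9]{4}-[0-9]{2}\.[a-z0-9.-]+:.+$' \
        && echo "$iqn: format ok"
done
```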

Switch to the first (OCR/voting) target and then use the “LUN mapping” tab to map the LUNs. In the list of available LUNs, add ocrvote01_lv and ocrvote02_lv to the target. Click on “Network ACL” and allow access to the LUN from edcnode1 and edcnode2. For the first ASM target, map asmdata01_lv and set the permissions, then repeat for the last target with asmdata02_lv.

Create the following targets for filer02:


The mappings and settings for the ASM targets are identical to filer01, but for the OCRVOTE target only export the first logical volume, i.e. ocrvote01_lv.

NFS export

The third filer, filer03, is a little bit different in that it only exports an NFS share to the cluster. It only has one data disk, disk02. In a nutshell, create the filer as described above to the point where it’s accessible via its web interface. The high level steps for it are:

  1. Partition /dev/xvdb into 1 partition spanning the whole disk
  2. Create a volume group ocrvotenfs_vg from /dev/xvdb1
  3. Create a logical volume nfsvol_lv, approx 1G in size with ext3 as its file system
  4. Enable the NFS v3 server (Services tab)

From there on the procedure is slightly different. Click on “Shares” to access the network shares available from the filer. You should see your volume group with the logical volume nfsvol_lv. Click on the link “nfsvol_lv” and enter “ocrvote” as subfolder name. A new folder icon with the name ocrvote will appear. Click on this one, and in the pop-up dialog click on “Make share”. You should set the following in the lengthy configuration dialog that opens:

  • Public guest access
  • Host access for edcnode1 and edcnode2 for NFS RW (select the radio button)
  • Click on edit to access special options for edcnode1 and edcnode2. Ensure that the anonymous UID and GID match those of the grid software owner. The UID/GID mapping has to be “all_squash”, IO mode has to be “sync”. You can ignore the write delay and origin port for this example
  • Leave all other protocols deselected
  • Click update to make the changes permanent
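Under the hood, choices like these end up as an /etc/exports entry. A hypothetical sketch of what it would look like (the mount path and the UID/GID of 54321 standing in for the grid owner are assumptions, not taken from the post):

```
/mnt/ocrvotenfs_vg/nfsvol_lv/ocrvote  edcnode1(rw,sync,all_squash,anonuid=54321,anongid=54321) edcnode2(rw,sync,all_squash,anonuid=54321,anongid=54321)
```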

That was it! The storage layer is now perfectly set up for the cluster nodes, which I’ll discuss in a follow-on post.

4 thoughts on “Build your own stretched RAC part III”

  1. Eranga

    Hi Martin,
    I’m totally stuck after the following step as it gives these errors, and I wasted nearly two days trying it in different ways to make it work, but still no luck. If you could help, that would be really great. About 6 months back I implemented openfiler successfully using the .img file available, on Oracle VM. But now, as there’s only this tarball version for Oracle VM, this guide is the only help I have.

    virsh define openfiler_1.xml
    libvir: Remote error : No such file or directory
    libvir: warning : Failed to find the network: Is the daemon running ?
    Domain openfiler_2 defined from openfiler_1.xml

    And even with those errors when issued the following command, the errors listed below are coming and don’t know how to proceed.

    xm start openfiler_2 -c

    pyGRUB version 0.6

    Use the ^ and v keys to select which entry is highlighted. Press enter to boot the selected OS, ‘e’ to edit the commands before booting, ‘a’ to modify the kernel arguments before booting, or ‘c’ for a command line.

    rtc: IRQ 8 is not free.
    rtc: IRQ 8 is not free.
    i8042.c: No controller found.
    Red Hat nash version 4.2.15 starting
    mount: error 6 mounting ext3
    ERROR opening /dev/console!!!!: 2
    error dup2’ing fd of 0 to 0
    error dup2’ing fd of 0 to 1
    error dup2’ing fd of 0 to 2
    switchroot: mount failed: 22
    Kernel panic – not syncing: Attempted to kill init!

    Any help is greatly appreciated, as otherwise I have to give up the idea of Oracle VM and openfiler and test VMware :(

    here’s my xml file if it helps;


    root=/dev/xvda1 ro


    Many Thanks!

    1. Martin Post author

      Hi there,

      so you’d like to get openfiler up and running as a PV domU? I have written about it, see:

      Should be easy enough to translate into Oracle VM. BTW, can you mount iSCSI targets on a domU? When I last used Oracle VM a while back that cause a kernel oops and an instance reboot. Was in 2.1.2 I think (yes, it’s old). Use OpenSuSE or its commercial brother, SLES 11 SP x if you want paravirtualisation – much more recent than anything else out of the box.



      1. Eranga

        Hi Martin,
        Thanks for your reply. Yes, I was using your guide to set it up. I’m setting it up as a dom0, exactly as you have mentioned in your guide, not as a domU. I followed each and every step until I got errors after the “virsh define” command. Those are the errors I have listed in my initial query. If you could help me out rectifying those errors, that would be great.

        Many Thanks,

      2. Martin Post author

        Umm, not sure how to help here.

        First of all, I’m not aware of a dom0 aware openFiler distribution. After all this is 2-3 years old now. Which Linux distribution do you use? There are subtle differences between them.

        Also, the error you get when trying to define the VM indicates that there are typos or incorrect paths in the configuration file. I would recommend starting with the xen native format (not the libvirt XML domain description) and take it from there.
