Martins Blog

Trying to explain complex things in simple terms

Getting started with iSCSI part IV: Slightly advanced material

Posted by Martin Bach on October 23, 2009

The final part of the article series focuses on some slightly more advanced topics, such as deleting targets, device name stability and insight into how the automatic login works.

Automatic login

The question I had when I saw the automatic login feature was: how does the daemon know which targets to log in to? The iSCSI target configuration is stored in a DBM database, represented in the following locations:

  • /var/lib/iscsi/send_targets
  • /var/lib/iscsi/nodes

Not surprisingly, the information stored there covers nodes and discovered targets. You should not mess with these files using your favourite text editor; use iscsiadm for all operations!

The following files were present after login:

[root@aux iscsi]# ls -l nodes/
total 20
drw------- 3 root root 4096 Oct 16 02:32 iqn.2009-10.com.openfiler:rac_vg.ocr_a
drw------- 3 root root 4096 Oct 16 02:32 iqn.2009-10.com.openfiler:rac_vg.ocr_b
drw------- 3 root root 4096 Oct 16 02:32 iqn.2009-10.com.openfiler:rac_vg.vote_a
drw------- 3 root root 4096 Oct 16 02:32 iqn.2009-10.com.openfiler:rac_vg.vote_b
drw------- 3 root root 4096 Oct 16 02:32 iqn.2009-10.com.openfiler:rac_vg.vote_c

[root@aux iscsi]# ls -l send_targets/
total 4
drw------- 2 root root 4096 Oct 16 02:32 192.168.30.22,3260
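
Which of these records is used for automatic login is decided per node: in open-iscsi each node record carries a node.startup setting, and records marked automatic are the ones picked up at boot. The following is only a minimal sketch to illustrate that. The helper name startup_mode is made up, and the sample record lines are illustrative values, not captured from the system above.

```shell
#!/bin/sh
# Hypothetical helper: read a node record ("key = value" lines) on stdin
# and print the value of node.startup.
startup_mode() {
    awk -F ' = ' '$1 == "node.startup" { print $2 }'
}

# Sample record excerpt (illustrative values only).
printf '%s\n' \
    'node.name = iqn.2009-10.com.openfiler:rac_vg.ocr_a' \
    'node.startup = automatic' | startup_mode

# On a live initiator the input would come from iscsiadm itself, e.g.:
#   iscsiadm -m node -T iqn.2009-10.com.openfiler:rac_vg.ocr_a \
#            -p 192.168.30.22:3260 | startup_mode
```

The point of going through iscsiadm is the same as above: the record is read by the tool, never opened in an editor.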

Deleting a target

The information in the database will persist. If you want to get rid of a LUN permanently, use the iscsiadm command in node mode with the operation “delete”. So assuming you want to remove target iqn.2009-10.com.openfiler:rac_vg.vote_c, you’d do:

iscsiadm -m node --target iqn.2009-10.com.openfiler:rac_vg.vote_c --logout
iscsiadm -m node --target iqn.2009-10.com.openfiler:rac_vg.vote_c --op delete
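
If you remove targets more than once, the two steps can be wrapped in a small function. This is a sketch only: remove_target and the ISCSIADM override are names I made up for illustration, and the example runs in dry-run mode so that it prints the commands rather than touching any sessions.

```shell
#!/bin/sh
# ISCSIADM can be pointed at "echo iscsiadm" for a dry run (assumption:
# the real binary is simply called iscsiadm and is in the PATH).
ISCSIADM=${ISCSIADM:-iscsiadm}

# Hypothetical helper: log out of a target, then delete its node record.
# The delete is only attempted if the logout succeeded.
remove_target() {
    target=$1
    $ISCSIADM -m node --targetname "$target" --logout &&
    $ISCSIADM -m node --targetname "$target" --op delete
}

# Dry run: print the commands instead of executing them.
ISCSIADM="echo iscsiadm"
remove_target iqn.2009-10.com.openfiler:rac_vg.vote_c
```

Chaining with && means a failed logout (for example because there is no session) leaves the record in place, which seems the safer default for a lab script.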

To log back in again, should you have changed your mind, you’d follow these steps:

[root@aux ~]# iscsiadm -m node
192.168.30.22:3260,1 iqn.2009-10.com.openfiler:rac_vg.ocr_b
192.168.30.22:3260,1 iqn.2009-10.com.openfiler:rac_vg.vote_a
192.168.30.22:3260,1 iqn.2009-10.com.openfiler:rac_vg.vote_b
192.168.30.22:3260,1 iqn.2009-10.com.openfiler:rac_vg.ocr_a
[root@aux ~]# iscsiadm -m node  --target iqn.2009-10.com.openfiler:rac_vg.vote_c --login
iscsiadm: no records found!

Oops, it seems I need to run a new discovery command first:

[root@aux ~]# iscsiadm -m discovery -t st -p 192.168.30.22
192.168.30.22:3260,1 iqn.2009-10.com.openfiler:rac_vg.ocr_b
192.168.30.22:3260,1 iqn.2009-10.com.openfiler:rac_vg.vote_a
192.168.30.22:3260,1 iqn.2009-10.com.openfiler:rac_vg.vote_c
192.168.30.22:3260,1 iqn.2009-10.com.openfiler:rac_vg.vote_b
192.168.30.22:3260,1 iqn.2009-10.com.openfiler:rac_vg.ocr_a

Now try to log in again:

[root@aux ~]# iscsiadm -m node  --target iqn.2009-10.com.openfiler:rac_vg.vote_c --login
Logging in to [iface: default, target: iqn.2009-10.com.openfiler:rac_vg.vote_c, portal: 192.168.30.22,3260]
Login to [iface: default, target: iqn.2009-10.com.openfiler:rac_vg.vote_c, portal: 192.168.30.22,3260]: successful

Success, as you can see:

[root@aux ~]# iscsiadm -m node
192.168.30.22:3260,1 iqn.2009-10.com.openfiler:rac_vg.ocr_b
192.168.30.22:3260,1 iqn.2009-10.com.openfiler:rac_vg.vote_a
192.168.30.22:3260,1 iqn.2009-10.com.openfiler:rac_vg.vote_b
192.168.30.22:3260,1 iqn.2009-10.com.openfiler:rac_vg.vote_c
192.168.30.22:3260,1 iqn.2009-10.com.openfiler:rac_vg.ocr_a
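
The "no records found" detour can be avoided by checking the node listing before attempting the login. A minimal sketch, assuming the listing has the two-column format shown above; target_known is a hypothetical helper, and the sample data is the listing taken after vote_c had been deleted.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the IQN given as $1 appears in the second
# column of "iscsiadm -m node" style output read from stdin.
target_known() {
    awk -v iqn="$1" '$2 == iqn { found = 1 } END { exit !found }'
}

# Node listing as it looked after vote_c had been deleted.
listing='192.168.30.22:3260,1 iqn.2009-10.com.openfiler:rac_vg.ocr_b
192.168.30.22:3260,1 iqn.2009-10.com.openfiler:rac_vg.vote_a
192.168.30.22:3260,1 iqn.2009-10.com.openfiler:rac_vg.vote_b
192.168.30.22:3260,1 iqn.2009-10.com.openfiler:rac_vg.ocr_a'

if printf '%s\n' "$listing" | target_known iqn.2009-10.com.openfiler:rac_vg.vote_c; then
    echo "record present, login can proceed"
else
    echo "no record, run discovery first"
fi

# Against a live initiator the same check could read:
#   iscsiadm -m node | target_known "$iqn" ||
#       iscsiadm -m discovery -t st -p 192.168.30.22
```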

Device name stability

So far we have been mounting LUNs in no particular order; the kernel enumerates the LUNs on a first come, first served basis. In this context, enumeration is the process of walking through the system in some order and assigning each discovered device a default name.

For obvious reasons that leaves a problem:

The default device name assigned by the OS may have no relationship to eventual system usage. Even worse, a given device name may change if the system topology or enumeration order changes. Tools and configuration files that use default device names may behave incorrectly if the names change. So assume you shuffle the iSCSI targets around a bit and you end up with OCR and voting disk on the wrong devices, which means the cluster won’t start.

Udev to the rescue

Udev is available with kernel 2.6. It is based on sysfs and hotplug and replaces devfs. It allows you to define rules to maintain device files for the devices that are actually present in the system.

Beginning with RAC 11.2 you possibly don’t need any of this: you can store the voting disk and OCR in ASM, and you can make use of asmlib to stamp disks so that the following steps aren’t needed. If you are on an earlier release, and I bet you are, then this might come in handy.

The proposed solution works by adding symbolic links under /dev/iscsi. A script called by udev reads the IQN and extracts the last part after a “.”, which then becomes the name of a directory under /dev/iscsi/ holding the symbolic link.

As root, create the rule file with the following contents:

# /etc/udev/rules.d/55-openiscsi.rules
KERNEL=="sd*",BUS=="scsi", PROGRAM="/etc/udev/scripts/iscsidev.sh %b",  SYMLINK+="iscsi/%c/part%n"

The variables used in the rule are documented in man 8 udev:

  • %b : kernel bus ID
  • %c : result of the called script
  • %n : kernel number for device

So in a nutshell, the program /etc/udev/scripts/iscsidev.sh, which we’ll create in a bit, is invoked for SCSI devices with the kernel bus ID as its argument. Udev then uses the script’s output to create a symbolic link under /dev/iscsi.
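
To make the substitution in SYMLINK+="iscsi/%c/part%n" a bit more tangible, here is a small sketch that mimics the expansion in plain shell. It is not how udev works internally; expand_symlink is a hypothetical helper, and I assume that %n is simply the trailing number of the kernel name (empty for a whole disk).

```shell
#!/bin/sh
# Hypothetical helper mimicking the template "iscsi/%c/part%n":
# $1 is the kernel name (e.g. sdb1), $2 the output of the script (%c).
expand_symlink() {
    kernel=$1
    result=$2
    # %n: strip the leading non-digits, leaving the trailing number (if any)
    number=$(printf '%s' "$kernel" | sed 's/^[^0-9]*//')
    printf 'iscsi/%s/part%s\n' "$result" "$number"
}

expand_symlink sdb  ocr_a   # whole disk, no kernel number
expand_symlink sdb1 ocr_a   # first partition on the same LUN
```

The first call yields iscsi/ocr_a/part and the second iscsi/ocr_a/part1, both relative to /dev.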

Now create the script to manage symbolic links:

# vi /etc/udev/scripts/iscsidev.sh

Paste the following content, then change file permissions to 0775.

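#!/bin/sh
# FILE: /etc/udev/scripts/iscsidev.sh
# Called by udev with the kernel bus ID (%b, e.g. 3:0:0:0) as first argument
BUS=${1}
HOST=${BUS%%:*}   # the SCSI host number is the part before the first ":"

# Bail out if there is no iSCSI host class in sysfs
[ -e /sys/class/iscsi_host ] || exit 1

# ${file} is left unquoted in the cat call on purpose so the shell expands the glob
file="/sys/class/iscsi_host/host${HOST}/device/session*/iscsi_session*/targetname"
target_name=$(cat ${file} 2>/dev/null)
if [ -z "${target_name}" ]; then  # This is not an open-iscsi drive
 exit 1
fi

# the echo returns the last part of the IQN, i.e. "ocr" for IQN
# iqn.2006-01.com.openfiler:rac.ocr
echo "${target_name##*.}"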
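
The two parameter expansions do all the work in this script, so they are worth trying in isolation. The snippet below is a sketch with sample values: the bus ID 3:0:0:0 is an assumed example, and the IQN is one of the targets used throughout this series.

```shell
#!/bin/sh
# Sample kernel bus ID as udev would pass it via %b (assumed value).
BUS="3:0:0:0"
HOST=${BUS%%:*}            # remove the longest suffix starting at a ":"
echo "host number: ${HOST}"

target_name="iqn.2009-10.com.openfiler:rac_vg.vote_c"
LINK=${target_name##*.}    # remove the longest prefix ending in a "."
echo "link name: ${LINK}"
```

One consequence of the second expansion: the part after the last dot has to be unique across targets, otherwise two LUNs would compete for the same directory under /dev/iscsi.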

With these prerequisites met, you can restart udev and see the rule and script in action. I have omitted the logout/login part for clarity here. You should see these under /dev/iscsi now:

[root@aux ~]# ls -lR /dev/iscsi
/dev/iscsi:
total 0
drwxr-xr-x 2 root root 60 Oct 16 02:41 ocr_a
drwxr-xr-x 2 root root 60 Oct 16 02:41 ocr_b
drwxr-xr-x 2 root root 60 Oct 16 02:41 vote_a
drwxr-xr-x 2 root root 60 Oct 16 02:41 vote_b
drwxr-xr-x 2 root root 60 Oct 16 02:41 vote_c

/dev/iscsi/ocr_a:
total 0
lrwxrwxrwx 1 root root 9 Oct 16 02:41 part -> ../../sdb

/dev/iscsi/ocr_b:
total 0
lrwxrwxrwx 1 root root 9 Oct 16 02:41 part -> ../../sdc

/dev/iscsi/vote_a:
total 0
lrwxrwxrwx 1 root root 9 Oct 16 02:41 part -> ../../sdd

/dev/iscsi/vote_b:
total 0
lrwxrwxrwx 1 root root 9 Oct 16 02:41 part -> ../../sdf

/dev/iscsi/vote_c:
total 0
lrwxrwxrwx 1 root root 9 Oct 16 02:41 part -> ../../sde
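
For a more compact view than ls -lR, a short loop can print one line per link. This is only a sketch; show_iscsi_links is a made-up helper name, and it assumes the directory layout shown above (one directory per LUN with part* links inside).

```shell
#!/bin/sh
# Hypothetical helper: print "<LUN name>/<link> -> <device>" for every
# symbolic link below the directory given as $1 (default /dev/iscsi).
show_iscsi_links() {
    base=${1:-/dev/iscsi}
    for link in "$base"/*/part*; do
        [ -L "$link" ] || continue      # glob matched nothing, or not a link
        lun=$(basename "$(dirname "$link")")
        dev=$(basename "$(readlink "$link")")
        echo "${lun}/$(basename "$link") -> ${dev}"
    done
}

show_iscsi_links /dev/iscsi
```

Running it after shuffling targets around is a quick way to confirm that each LUN name still resolves to a device, whatever sdX name the kernel handed out this time.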

Job done: device name stability guaranteed. The fdisk tool works just as nicely, and the partitions appear below the LUN name. So for example, you could use fdisk /dev/iscsi/vote_c/part to partition the vote_c LUN.

Summary

Hopefully this article series has helped shed some light on iSCSI and its use. Before you rush off now to implement this outside a lab, I strongly recommend you check the CPU load that comes with it. Ideally you’d use a TOE (TCP offload engine) or iSCSI HBA; otherwise you might be in for a disappointment, especially with I/O intensive applications such as Oracle. The iSCSI initiator has ways of binding to a specific interface to make the most of your fancy new hardware, but this is out of the scope of this article.

By the way, the industry also knows about these shortcomings, and The Next Big Thing on the horizon is FCoE, Fibre Channel over Ethernet. So instead of using TCP for transportation, a greatly modified Ethernet (“data centre Ethernet” or “converged Ethernet”) will transport your data, using the Fibre Channel protocol. The first DCE “HBA”s are already available, but the whole technology stack still needs a lot of maturing.

If I could choose, I’d use InfiniBand for all communication needs, but so far I have not had the pleasure to work with an enterprise class InfiniBand installation.
