Category Archives: Automation

Versioning for your local Vagrant boxes: handling updates

In my last post I summarised how to enable versioning for a Vagrant box outside Vagrant’s cloud. In this part I’d like to share how to update a box.

My environment

The environment hasn’t changed compared to the first post. In summary I’m using

  • Ubuntu 20.04 LTS
  • Virtualbox 6.1.6
  • Vagrant 2.2.7

Updating a box

Let’s assume it’s time to update the base box for whatever reason. I most commonly update my boxes every so often after having run a “yum upgrade -y” to bring them up to date with the most current software. A new drop of the Guest Additions also triggers a rebuild, and so on.

Packaging

Once the changes are made, you need to package the box again. Continuing the previous example I save all my boxes and their JSON metadata in ~/vagrant/boxes. The box comes first:

[martin@host ~]$ vagrant package --base oraclelinux7base --output ~/vagrant/boxes/ol7_7.8.1.box

This creates a second box right next to the existing one. Note I bumped the version number to 7.8.1 to avoid file naming problems:

[martin@host boxes]$ ls -1
ol7_7.8.0.box
ol7_7.8.1.box
ol7.json 

Updating metadata

The next step is to update the JSON document. At this point in time, it references version 7.8.0 of my box:

[martin@host boxes]$ cat ol7.json 
{
    "name": "ol7",
    "description": "Martins Oracle Linux 7",
    "versions": [
      {
        "version": "7.8.0",
        "providers": [
          {
            "name": "virtualbox",
            "url": "file:///home/martin/vagrant/boxes/ol7_7.8.0.box",
            "checksum": "db048c3d61c0b5a8ddf6b59ab189248a42bf9a5b51ded12b2153e0f9729dfaa4",
            "checksum_type": "sha256"
          }
        ]
      }
    ]
  } 

You probably suspected what’s next :) A new version is created by adding a new element to the versions array, like so:

{
  "name": "ol7",
  "description": "Martins Oracle Linux 7",
  "versions": [
    {
      "version": "7.8.0",
      "providers": [
        {
          "name": "virtualbox",
          "url": "file:///home/martin/vagrant/boxes/ol7_7.8.0.box",
          "checksum": "db048c3d61c0b5a8ddf6b59ab189248a42bf9a5b51ded12b2153e0f9729dfaa4",
          "checksum_type": "sha256"
        }
      ]
    },
    {
      "version": "7.8.1",
      "providers": [
        {
          "name": "virtualbox",
          "url": "file:///home/martin/vagrant/boxes/ol7_7.8.1.box",
          "checksum": "f9d74dbbe88eab2f6a76e96b2268086439d49cb776b407c91e4bd3b3dc4f3f49",
          "checksum_type": "sha256"
        }
      ]
    }
  ]
} 

Don’t forget to update the SHA256 checksum!
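
For reference, the new checksum is calculated exactly like the first one:

[martin@host boxes]$ sha256sum ol7_7.8.1.box
f9d74dbbe88eab2f6a76e96b2268086439d49cb776b407c91e4bd3b3dc4f3f49  ol7_7.8.1.box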

Check for box updates

Back in my VM directory I can now check if there is a new version of my box:

[martin@host versioning]$ vagrant box outdated
Checking if box 'ol7' version '7.8.0' is up to date...
A newer version of the box 'ol7' for provider 'virtualbox' is
available! You currently have version '7.8.0'. The latest is version
'7.8.1'. Run `vagrant box update` to update.
[martin@host versioning]$ 

And there is! Not entirely surprising though, so let’s update the box:

[martin@host versioning]$ vagrant box update
==> default: Checking for updates to 'ol7'
    default: Latest installed version: 7.8.0
    default: Version constraints: 
    default: Provider: virtualbox
==> default: Updating 'ol7' with provider 'virtualbox' from version
==> default: '7.8.0' to '7.8.1'...
==> default: Loading metadata for box 'file:///home/martin/vagrant/boxes/ol7.json'
==> default: Adding box 'ol7' (v7.8.1) for provider: virtualbox
    default: Unpacking necessary files from: file:///home/martin/vagrant/boxes/ol7_7.8.1.box
    default: Calculating and comparing box checksum...
==> default: Successfully added box 'ol7' (v7.8.1) for 'virtualbox'! 

At the end of this exercise both versions are available:

[martin@host versioning]$ vagrant box list | grep ^ol7
ol7               (virtualbox, 7.8.0)
ol7               (virtualbox, 7.8.1)
[martin@host versioning]$  

This is so much better than my previous approach!

What are the effects of box versioning?

Earlier you could read how I created a Vagrant VM based on version 7.8.0 of my box. This VM hasn’t been removed. What happens if I start it up now that there’s a newer version of the ol7 box available?

Bringing machine 'default' up with 'virtualbox' provider...
==> default: Checking if box 'ol7' version '7.8.0' is up to date...
==> default: A newer version of the box 'ol7' is available and already
==> default: installed, but your Vagrant machine is running against
==> default: version '7.8.0'. To update to version '7.8.1',
==> default: destroy and recreate your machine.
==> default: Clearing any previously set forwarded ports...
==> default: Fixed port collision for 22 => 2222. Now on port 2200.
==> default: Clearing any previously set network interfaces...
==> default: Preparing network interfaces based on configuration...
    default: Adapter 1: nat
==> default: Forwarding ports...
    default: 22 (guest) => 2200 (host) (adapter 1)
==> default: Running 'pre-boot' VM customizations...
==> default: Booting VM...
==> default: Waiting for machine to boot. This may take a few minutes...
    default: SSH address: 127.0.0.1:2200
    default: SSH username: vagrant
    default: SSH auth method: private key
==> default: Machine booted and ready!
==> default: Checking for guest additions in VM...
==> default: Setting hostname...
==> default: Mounting shared folders...
    default: /vagrant => /home/martin/vagrant/versioning
==> default: Machine already provisioned. Run `vagrant provision` or use the `--provision`
==> default: flag to force provisioning. Provisioners marked to run always will still run. 

Vagrant tells me that I’m using an old version of the box, and how to switch to the new one. I think I’ll do this eventually, but I can still work with the old version.
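
When the time comes, the switch is exactly what Vagrant suggests. For a disposable lab environment, something along these lines should do; remember that vagrant destroy wipes the VM and everything in it:

[martin@host versioning]$ vagrant destroy -f
[martin@host versioning]$ vagrant up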

And what if I create a new VM? By default, Vagrant creates the new VM based on the latest version of my box, 7.8.1. You can see this here:

Bringing machine 'default' up with 'virtualbox' provider...
==> default: Importing base box 'ol7'...
==> default: Matching MAC address for NAT networking...
==> default: Checking if box 'ol7' version '7.8.1' is up to date...
==> default: Setting the name of the VM: versioning2_default_1588259041745_89693
==> default: Fixed port collision for 22 => 2222. Now on port 2201.
==> default: Clearing any previously set network interfaces...
==> default: Preparing network interfaces based on configuration...
    default: Adapter 1: nat
==> default: Forwarding ports...
    default: 22 (guest) => 2201 (host) (adapter 1)
==> default: Running 'pre-boot' VM customizations...
==> default: Booting VM...
==> default: Waiting for machine to boot. This may take a few minutes...
    default: SSH address: 127.0.0.1:2201
    default: SSH username: vagrant
    default: SSH auth method: private key
==> default: Machine booted and ready!
==> default: Checking for guest additions in VM...
==> default: Setting hostname...
==> default: Mounting shared folders...
    default: /vagrant => /home/martin/vagrant/versioning2 
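
Incidentally, if a new VM should stay on the older box version, Vagrant’s config.vm.box_version setting can pin it. Here is a minimal sketch based on the Vagrantfile from my last post; the constraint value is merely an example:

Vagrant.configure("2") do |config|
  config.vm.box = "ol7"
  config.vm.box_url = "file:///home/martin/vagrant/boxes/ol7.json"

  # pin this environment to 7.8.0 rather than the latest available version
  config.vm.box_version = "7.8.0"
end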

Cleaning up

As with every technology, housekeeping is essential to keep disk usage in check. Refer back to the official documentation for more details on housekeeping and local copies of Vagrant boxes.
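
For example, once no environment references the old version any more, either of these should reclaim the space; double-check before hitting enter:

[martin@host versioning]$ vagrant box remove ol7 --box-version 7.8.0
[martin@host versioning]$ vagrant box prune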

Summary

In the past I really struggled to maintain my local Vagrant boxes. Updating a box proved quite tricky and came with undesired side effects. Using versioning as demonstrated in this post is a great way out of this dilemma. And contrary to what I thought for a long time, uploading my boxes to Vagrant Cloud is not needed.

There is of course a lot more to say about versioning; this feature can do so much more. Maybe I’ll write another post about the subject some other time. Until then I kindly refer you to the documentation.

Versioning for your local Vagrant boxes: adding a new box

I have been using Vagrant for quite some time now and can’t tell you how much of a productivity boost it has been. All the VMs I have on my laptop are either powered by Vagrant, or feed into the Vagrant workflow.

One thing I haven’t worked out though is how to use versioning outside of Vagrant’s cloud. I don’t think I have what it takes to publish a good OS image publicly, and would rather keep my boxes to myself to prevent others from injury.

My environment

While putting this post together I used the following software:

  • Ubuntu 20.04 LTS acts as my host operating system
  • Virtualbox 6.1.6
  • Vagrant 2.2.7

This is probably as current as it gets at the time of writing.

The need for box versioning

Vagrant saves you time by providing “gold images” you can spin up quickly. I prefer to always have the latest and greatest software available without having to spend ages on updating kernels and/or other components. As a result, I update my “gold image” VM from time to time, before packaging it up for Vagrant. Until quite recently I hadn’t figured out how to update a box other than to delete and recreate it. This isn’t the best idea though, as indicated by this error message:

$ vagrant box remove debianbase-slim
Box 'debianbase-slim' (v0) with provider 'virtualbox' appears
to still be in use by at least one Vagrant environment. Removing
the box could corrupt the environment. We recommend destroying
these environments first:

default (ID: ....)

Are you sure you want to remove this box? [y/N] n 

This week I finally sat down trying to work out a better way of refreshing my Vagrant boxes.

As I understand it, box versioning allows me to update my base box without having to trash any environments. So instead of removing the box and replacing it with another, I can add a new version to the box. Environments using the old version can continue to do so until they are torn down. New environments can use the new version. This works remarkably well, once you know how to set it up! I found a few good sources on the Internet and combined them into this article.

Box versioning for Oracle Linux 7

As an Oracle person I obviously run Oracle Linux a lot. Earlier I came up with a procedure to create my own base boxes. This article features “oraclelinux7base” as the source for my Vagrant boxes. It adheres to all the requirements for Vagrant base boxes to be used with the Virtualbox provider.

Packaging the base box

Once you are happy with the state of your Virtualbox VM, you have to package it for use with Vagrant. All my Vagrant boxes go to ~/vagrant/boxes, so this command creates the package:

$ vagrant package --base oraclelinux7base --output ~/vagrant/boxes/ol7_7.8.0.box
==> oraclelinux7base: Attempting graceful shutdown of VM...
==> oraclelinux7base: Clearing any previously set forwarded ports...
==> oraclelinux7base: Exporting VM...
==> oraclelinux7base: Compressing package to: /home/martin/vagrant/boxes/ol7_7.8.0.box 

In plain English this command instructs Vagrant to take Virtualbox’s oraclelinux7base VM and package it into ~/vagrant/boxes/ol7_7.8.0.box. I am creating this box as the first OL 7.8 system. The naming convention seems optional, yet I think it’s best to indicate the purpose and version in the package name.

At this stage, DO NOT “vagrant box add” the box!

Creating box metadata

The next step is to create a small metadata file describing the box. This time it’s written in JSON rather than YAML, for a change. I found a few conflicting sources and couldn’t get them to work until I had a look at how Oracle solved the problem. If you navigate to yum.oracle.com/boxes, you can find the links to their metadata files. I really appreciate Oracle switching to versioned boxes, too!

After a little trial-and-error I came up with this file. It’s probably just the bare minimum, but it works for me in my lab so I’m happy to keep it the way it is. The file lives in ~/vagrant/boxes alongside the box file itself.

$ cat ol7.json
{
    "name": "ol7",
    "description": "Martins Oracle Linux 7",
    "versions": [
      {
        "version": "7.8.0",
        "providers": [
          {
            "name": "virtualbox",
            "url": "file:///home/martin/vagrant/boxes/ol7_7.8.0.box",
            "checksum": "db048c3d61c0b5a8ddf6b59ab189248a42bf9a5b51ded12b2153e0f9729dfaa4",
            "checksum_type": "sha256"
          }
        ]
      }
    ]
  } 

The file should be self-explanatory. The only noteworthy issue to run into is an insufficient number of forward slashes in the URL: the URI is composed of “file://” followed by the fully qualified path to the box file, three forward slashes in total.

I used “sha256sum /home/martin/vagrant/boxes/ol7_7.8.0.box” to calculate the checksum.

Creating a VM

Finally it’s time to create the VM. I tend to create a directory per Vagrant environment; in this example I called it “versioning”. Within ~/vagrant/versioning I can create a Vagrantfile with the VM’s definition. At this stage, the base box is unknown to Vagrant.

$ nl Vagrantfile 
     1    # -*- mode: ruby -*-
     2    # vi: set ft=ruby :

     3    Vagrant.configure("2") do |config|
     4      config.vm.box = "ol7"
     5      config.vm.box_url = "file:///home/martin/vagrant/boxes/ol7.json"
     6      
     7      config.ssh.private_key_path = '/home/martin/.ssh/vagrantkey'

     8      config.vm.hostname = "server1"

     9      config.vm.provider "virtualbox" do |vb|
    10        vb.cpus = 2
    11        vb.memory = "4096"
    12      end

    13    end
 

The difference to my earlier post is the reference to the JSON file in line 5. The JSON file tells Vagrant where to find the box. The remaining configuration isn’t different from using non-versioned Vagrant boxes.

Based on this configuration file I can finally spin up my VM:

$ vagrant up
Bringing machine 'default' up with 'virtualbox' provider...
==> default: Box 'ol7' could not be found. Attempting to find and install...
    default: Box Provider: virtualbox
    default: Box Version: >= 0
==> default: Loading metadata for box 'file:///home/martin/vagrant/boxes/ol7.json'
    default: URL: file:///home/martin/vagrant/boxes/ol7.json
==> default: Adding box 'ol7' (v7.8.0) for provider: virtualbox
    default: Unpacking necessary files from: file:///home/martin/vagrant/boxes/ol7_7.8.0.box
    default: Calculating and comparing box checksum...
==> default: Successfully added box 'ol7' (v7.8.0) for 'virtualbox'!
==> default: Importing base box 'ol7'...
==> default: Matching MAC address for NAT networking...
==> default: Checking if box 'ol7' version '7.8.0' is up to date...
==> default: Setting the name of the VM: versioning_default_1588251635800_49095
==> default: Fixed port collision for 22 => 2222. Now on port 2200.
==> default: Clearing any previously set network interfaces...
==> default: Preparing network interfaces based on configuration...
    default: Adapter 1: nat
==> default: Forwarding ports...
    default: 22 (guest) => 2200 (host) (adapter 1)
==> default: Running 'pre-boot' VM customizations...
==> default: Booting VM...
==> default: Waiting for machine to boot. This may take a few minutes...
    default: SSH address: 127.0.0.1:2200
    default: SSH username: vagrant
    default: SSH auth method: private key
==> default: Machine booted and ready!
==> default: Checking for guest additions in VM...
==> default: Setting hostname...
==> default: Mounting shared folders...
    default: /vagrant => /home/martin/vagrant/versioning 

Right at the beginning you can see that Vagrant loads “metadata for box ‘file:///home/martin/vagrant/boxes/ol7.json'” and then loads the box from the location specified in the JSON file.

Once the machine is started, I can also see it available for future use:

$ vagrant box list | grep ^ol7
ol7               (virtualbox, 7.8.0) 

The box is registered as ol7, using the Virtualbox provider in version 7.8.0.

Summary

In this post I summarised (mainly for my own later use ;) how to use box versioning on my development laptop. It really isn’t that different from the way I worked previously, and the benefit becomes apparent once you update the box. I’m going to cover upgrading my “ol7” box in another post.

Passing complex data types to Ansible on the command line

Earlier this year I wrote a post about passing JSON files as --extra-vars to ansible-playbook in order to simplify deployments and to make them more flexible. JSON syntax must be used to pass more complex data types to Ansible playbooks, the topic of this post. Unlike last time though I’ll pass the arguments directly to the playbook rather than by means of a JSON file. This should cover both methods of passing extra variables.

A lot of what you are about to read depends on Ansible configuration settings. I have used Ansible on Debian 10. When I installed it earlier today I found it to be version 2.7.7. It’s the distribution’s (stock) Ansible version:

vagrant@debian10:~$ lsb_release -a
No LSB modules are available.
Distributor ID:    Debian
Description:       Debian GNU/Linux 10 (buster)
Release:           10
Codename:          buster
vagrant@debian10:~$ ansible-playbook --version
ansible-playbook 2.7.7
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/home/vagrant/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3/dist-packages/ansible
  executable location = /usr/bin/ansible-playbook
  python version = 3.7.3 (default, Dec 20 2019, 18:57:59) [GCC 8.3.0]
vagrant@debian10:~$  

The only change I consciously made was to set the output to debug:

vagrant@debian10:~$ export ANSIBLE_STDOUT_CALLBACK=debug

However, as with all information you find on the Internet – this post explicitly included – your mileage may vary. Don’t blindly copy/paste. Test everything you deem useful on an unimportant, disposable, lower-tier test system and make sure you understand any code before you even think about using it! Vagrant is a pretty good tool for this purpose by the way.

Having said that, let’s go over some examples.

Dictionaries

Let’s assume you’d like to use a dictionary in your playbook, like so:

---
- hosts: localhost
  connection: local
  vars:
    dictExample:
      propertyA: propertyA-key
      propertyB: propertyB-key

  tasks:
  - name: dump dictExample
    debug:
      var: dictExample 

Unsurprisingly, when invoking the playbook, the output matches the code exactly:

$ ansible-playbook -i localhost, dict-example.yml 

PLAY [localhost] ***************************************************************

TASK [Gathering Facts] *********************************************************
ok: [localhost]

TASK [dump dictExample] ********************************************************
ok: [localhost] => {
    "dictExample": {
        "propertyA": "propertyA-key",
        "propertyB": "propertyB-key"
    }
}

PLAY RECAP *********************************************************************
localhost                  : ok=2    changed=0    unreachable=0    failed=0 

Overriding dictExample on the command line requires the use of JSON, while paying attention to shell expansion at the same time. Here is an example:

$ ansible-playbook -i localhost, dict-example.yml --extra-vars "{
>     "dictExample": {
>       "propertyA": "'property A set on the command line'",
>       "propertyB": "'property B set on the command line'"
>     }
> }"
Using /etc/ansible/ansible.cfg as config file

PLAY [localhost] ***************************************************************

TASK [Gathering Facts] *********************************************************
ok: [localhost]

TASK [dump dictExample] ********************************************************
ok: [localhost] => {
    "dictExample": {
        "propertyA": "property A set on the command line",
        "propertyB": "property B set on the command line"
    }
}

PLAY RECAP *********************************************************************
localhost                  : ok=2    changed=0    unreachable=0    failed=0 

As per the Ansible documentation, variables passed as extra-vars take precedence over those defined in the playbook.
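
By the way, an arguably easier way to deal with shell expansion is to wrap the entire JSON document in single quotes, letting the double quotes survive untouched:

$ ansible-playbook -i localhost, dict-example.yml \
>     --extra-vars '{"dictExample": {"propertyA": "property A set on the command line", "propertyB": "property B set on the command line"}}'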

Lists

Similarly it is possible to pass lists to playbooks. Here is an example:

---
- hosts: localhost
  connection: local
  vars:
    listExample:
    - one
    - two
    - three

  tasks:
  - name: dump listExample
    debug:
      var: listExample 

Invoking it with the defaults yields the expected result:

$ ansible-playbook -i localhost, list-example.yml 

PLAY [localhost] ***************************************************************

TASK [Gathering Facts] *********************************************************
ok: [localhost]

TASK [dump listExample] ********************************************************
ok: [localhost] => {
    "listExample": [
        "one",
        "two",
        "three"
    ]
}

PLAY RECAP *********************************************************************
localhost                  : ok=2    changed=0    unreachable=0    failed=0 

You can override listExample as shown here:

$ ansible-playbook -i localhost, list-example.yml --extra-vars "{
>     "listExample": [
>         "'commandline one'",
>         "'commandline two'",
>         "'commandline three'"
>     ]
> }"

PLAY [localhost] ***************************************************************

TASK [Gathering Facts] *********************************************************
ok: [localhost]

TASK [dump listExample] ********************************************************
ok: [localhost] => {
    "listExample": [
        "commandline one",
        "commandline two",
        "commandline three"
    ]
}

PLAY RECAP *********************************************************************
localhost                  : ok=2    changed=0    unreachable=0    failed=0 

Combinations

If you have worked with Perl, you have probably used a hash of hashes (HoH), as it’s a very powerful data structure. Something similar is also possible in Ansible. Here is the example playbook:

---
- hosts: localhost
  connection: local
  vars:
    complexExample:
      propertyA:
      - a_one
      - a_two
      - a_three
      propertyB:
      - b_one
      - b_two
      - b_three

  tasks:
  - name: dump complexExample
    debug:
      var: complexExample 

By now you are probably tired of seeing the result of the call to the playbook, so I’ll skip that and move on to an example where I’m overriding the variable:

$ ansible-playbook -i localhost, complex-example.yml --extra-vars "{
    "complexExample": {
        "propertyA": [
            "a_one_changed",
            "a_two_changed",
            "a_three_changed"
        ],
        "propertyB": [
            "b_one_changed",
            "b_two_changed",
            "b_three_changed"
        ]
    }
}"

PLAY [localhost] ***************************************************************

TASK [Gathering Facts] *********************************************************
ok: [localhost]

TASK [dump complexExample] *****************************************************
ok: [localhost] => {
    "complexExample": {
        "propertyA": [
            "a_one_changed",
            "a_two_changed",
            "a_three_changed"
        ],
        "propertyB": [
            "b_one_changed",
            "b_two_changed",
            "b_three_changed"
        ]
    }
}

PLAY RECAP *********************************************************************
localhost                  : ok=2    changed=0    unreachable=0    failed=0 

Summary

Passing variables to Ansible playbooks is a powerful way of working with automation. Apart from simple variables you could read about in the previous post on the topic, you can pass dictionaries, lists and combinations thereof using JSON notation. Happy automating!

Ansible tips’n’tricks: defining --extra-vars as JSON

While I’m continuing to learn more about Ansible I noticed a nifty little thing I wanted to share: it is possible to specify --extra-vars for an Ansible playbook as a JSON document, in addition to the space-separated list of key=value pairs I have used so often. This can come in handy if you have many parameters in your play and want to test changing them without having to modify the defaults stored in group_vars/*.yml or wherever else you keep them. If you do change your global variables, you can almost be certain that your version control system will flag the change and want to commit it next time. This might not be exactly what you had in mind.

For later reference, this article was composed using Ubuntu 18.04.4 LTS with all updates up to February 3rd, 2020.

The documentation reference for this article can be found in Docs – User Guide – Working With Playbooks – Using Variables. It links to the “latest” Ansible version though, so you might have to go to your specific Ansible version’s documentation in case stuff changes.

The Playbook

Let’s assume the following, simple playbook:

---
- hosts: localhost
  connection: local
  vars:
    var1: "var1 set in playbook"
    var2: "var2 set in playbook"
    var3: "var3 set in playbook"
    var4: "var4 set in playbook"
    var5: "var5 set in playbook"
    var6: "var6 set in playbook"
    var7: "var7 set in playbook"
    var8: "var8 set in playbook"
    var9: "var9 set in playbook"

  tasks:

  - name: print var1
    debug: var=var1

  - name: print var2
    debug: var=var2

  - name: print var3
    debug: var=var3

  - name: print var4
    debug: var=var4

  - name: print var5
    debug: var=var5

  - name: print var6
    debug: var=var6

  - name: print var7
    debug: var=var7

  - name: print var8
    debug: var=var8

  - name: print var9
    debug: var=var9

I appreciate it isn’t the usual fancy code, spread out nicely into roles and group_vars, but it keeps the discussion simple enough and using a more elaborate coding structure wouldn’t change the end result, so please bear with me…

As you can see I defined 9 variables in the playbook, and each of them is assigned a value. When executing the playbook, the output is as expected:

$ ansible-playbook -i 127.0.0.1, -v main.yml
Using /etc/ansible/ansible.cfg as config file

PLAY [localhost] ****************************************************************************************************************************

TASK [Gathering Facts] **********************************************************************************************************************
ok: [127.0.0.1]

TASK [print var1] ***************************************************************************************************************************
ok: [127.0.0.1] => {
    "var1": "var1 set in playbook"
}

TASK [print var2] ***************************************************************************************************************************
ok: [127.0.0.1] => {
    "var2": "var2 set in playbook"
}

TASK [print var3] ***************************************************************************************************************************
ok: [127.0.0.1] => {
    "var3": "var3 set in playbook"
}

TASK [print var4] ***************************************************************************************************************************
ok: [127.0.0.1] => {
    "var4": "var4 set in playbook"
}

TASK [print var5] ***************************************************************************************************************************
ok: [127.0.0.1] => {
    "var5": "var5 set in playbook"
}

TASK [print var6] ***************************************************************************************************************************
ok: [127.0.0.1] => {
    "var6": "var6 set in playbook"
}

TASK [print var7] ***************************************************************************************************************************
ok: [127.0.0.1] => {
    "var7": "var7 set in playbook"
}

TASK [print var8] ***************************************************************************************************************************
ok: [127.0.0.1] => {
    "var8": "var8 set in playbook"
}

TASK [print var9] ***************************************************************************************************************************
ok: [127.0.0.1] => {
    "var9": "var9 set in playbook"
}

PLAY RECAP **********************************************************************************************************************************
127.0.0.1                  : ok=10   changed=0    unreachable=0    failed=0  

Not really a surprise.

Overriding variables on the command line

Now let’s assume I want to override var8 and var9 with custom values, and without changing the code. Pretty straightforward, since ansible-playbook allows us to do so.

$ ansible-playbook --help 2>&1 | grep -i extra-vars
  -e EXTRA_VARS, --extra-vars=EXTRA_VARS

With that in mind, let’s change var8 and var9:

$ ansible-playbook -i 127.0.0.1, -v main.yml --extra-vars "var8='var8 set on the command line' var9='var9 set on the command line'"
Using /etc/ansible/ansible.cfg as config file

PLAY [localhost] ****************************************************************************************************************************

TASK [Gathering Facts] **********************************************************************************************************************
ok: [127.0.0.1]

TASK [print var1] ***************************************************************************************************************************
ok: [127.0.0.1] => {
    "var1": "var1 set in playbook"
}

TASK [print var2] ***************************************************************************************************************************
ok: [127.0.0.1] => {
    "var2": "var2 set in playbook"
}

TASK [print var3] ***************************************************************************************************************************
ok: [127.0.0.1] => {
    "var3": "var3 set in playbook"
}

TASK [print var4] ***************************************************************************************************************************
ok: [127.0.0.1] => {
    "var4": "var4 set in playbook"
}

TASK [print var5] ***************************************************************************************************************************
ok: [127.0.0.1] => {
    "var5": "var5 set in playbook"
}

TASK [print var6] ***************************************************************************************************************************
ok: [127.0.0.1] => {
    "var6": "var6 set in playbook"
}

TASK [print var7] ***************************************************************************************************************************
ok: [127.0.0.1] => {
    "var7": "var7 set in playbook"
}

TASK [print var8] ***************************************************************************************************************************
ok: [127.0.0.1] => {
    "var8": "var8 set on the command line"
}

TASK [print var9] ***************************************************************************************************************************
ok: [127.0.0.1] => {
    "var9": "var9 set on the command line"
}

PLAY RECAP **********************************************************************************************************************************
127.0.0.1                  : ok=10   changed=0    unreachable=0    failed=0   

Great! But you can probably spot where this is heading … changing lots of parameters on the command line is a real pain and you are almost guaranteed to introduce a typo on the way.

Passing parameters as a JSON file

As per the documentation reference I mentioned earlier, I can use a JSON document to set all parameters. For this example, I’ll use the following file:

$ cat parameters.json 
{
    "var1" : "var1 as defined in the JSON parameter file",
    "var2" : "var2 as defined in the JSON parameter file",
    "var3" : "var3 as defined in the JSON parameter file",
    "var4" : "var4 as defined in the JSON parameter file",
    "var5" : "var5 as defined in the JSON parameter file",
    "var6" : "var6 as defined in the JSON parameter file",
    "var7" : "var7 as defined in the JSON parameter file",
    "var8" : "var8 as defined in the JSON parameter file",
    "var9" : "var9 as defined in the JSON parameter file"
} 

With the file in place, I can easily reference it as shown in the next example:

$ ansible-playbook -i 127.0.0.1, -v main.yml --extra-vars "@parameters.json"

You include the JSON file containing all your parameters using the same command line argument, namely --extra-vars. Note however that this time you pass the “at” sign followed immediately by the filename.

The result is exactly what I wanted:

$ ansible-playbook -i 127.0.0.1, -v main.yml --extra-vars "@parameters.json"
Using /etc/ansible/ansible.cfg as config file

PLAY [localhost] ****************************************************************************************************************************

TASK [Gathering Facts] **********************************************************************************************************************
ok: [127.0.0.1]

TASK [print var1] ***************************************************************************************************************************
ok: [127.0.0.1] => {
    "var1": "var1 as defined in the JSON parameter file"
}

TASK [print var2] ***************************************************************************************************************************
ok: [127.0.0.1] => {
    "var2": "var2 as defined in the JSON parameter file"
}

TASK [print var3] ***************************************************************************************************************************
ok: [127.0.0.1] => {
    "var3": "var3 as defined in the JSON parameter file"
}

TASK [print var4] ***************************************************************************************************************************
ok: [127.0.0.1] => {
    "var4": "var4 as defined in the JSON parameter file"
}

TASK [print var5] ***************************************************************************************************************************
ok: [127.0.0.1] => {
    "var5": "var5 as defined in the JSON parameter file"
}

TASK [print var6] ***************************************************************************************************************************
ok: [127.0.0.1] => {
    "var6": "var6 as defined in the JSON parameter file"
}

TASK [print var7] ***************************************************************************************************************************
ok: [127.0.0.1] => {
    "var7": "var7 as defined in the JSON parameter file"
}

TASK [print var8] ***************************************************************************************************************************
ok: [127.0.0.1] => {
    "var8": "var8 as defined in the JSON parameter file"
}

TASK [print var9] ***************************************************************************************************************************
ok: [127.0.0.1] => {
    "var9": "var9 as defined in the JSON parameter file"
}

PLAY RECAP **********************************************************************************************************************************
127.0.0.1                  : ok=10   changed=0    unreachable=0    failed=0 

Summary

Using a JSON file to pass parameters to an Ansible playbook is a great way of testing (parameter-driven) changes to your code without having to worry about your version control system wanting to check in changes to your variable definitions. I tend to stick with defaults in my playbooks, and use --extra-vars to deviate from them if needed and to deal with edge cases.

The extra benefit to me is that I can add the JSON file containing all the parameters to .gitignore as well.
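
Something as simple as this takes care of that:

$ echo "parameters.json" >> .gitignore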

Happy automating!

Vagrant tips’n’tricks: changing /etc/hosts automatically for Oracle Universal Installer

Oracle Universal Installer, or OUI for short, doesn’t at all like it if the hostname resolves to an IP address in the 127.0.0.0/8 range. At best it complains, at worst it starts installing and configuring software only to abort and bury the real cause deep in the logs.

I am a great fan of HashiCorp’s Vagrant as you might have guessed reading some of the previous articles, and as such wanted a scripted solution to changing the hostname to something more sensible before I begin provisioning software. I should probably add that I’m using my own base boxes; the techniques in this post should equally apply to other boxes as well.

Each of the Vagrant VMs I’m creating is given a private network for communication with its peers. This is mainly done to prevent me from having to deal with port forwarding on the NAT device. If you haven’t used Vagrant before you might not know that by default, each Vagrant VM will come up with a single NIC that has to use NAT. The end goal for this post is to ensure that my VM’s hostname maps to the private network’s IP address, not 127.0.0.1 as it would normally do.

Setting the scene

By default, Vagrant doesn’t seem to mess with the hostname of the VM. This can be changed by using a configuration variable. Let’s start with the Vagrantfile for my Oracle Linux 7 box:

# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant.configure("2") do |config|
  config.vm.define "ol7guest" do |u|
    # this is a base box I created and stored locally
    u.vm.box = "oracleLinux7Base"

    u.ssh.private_key_path = "/path/to/key"

    u.vm.hostname = "ol7guest"
    u.vm.network "private_network", ip: "192.168.56.204"

    u.vm.provider "virtualbox" do |v|
      v.memory = 2048
      v.name = "ol7guest"
      v.cpus = 1
    end
  end
end 

Please ignore the fact that my Vagrantfile is slightly more complex than it needs to be. I do like having meaningful names for my VMs, rather than “default” showing up in vagrant status. Using this terminology in the Vagrantfile also makes it easier to add more VMs to the configuration should the need arise.

Apart from what you just read, the only remarkable thing to mention about this file is this line:

    u.vm.hostname = "ol7guest"

As per the Vagrant documentation, I can use this directive to set the hostname of the VM. And indeed, it does:

$ vagrant ssh ol7guest
Last login: Thu Jan 09 21:14:59 2020 from 10.0.2.2
[vagrant@ol7guest ~]$  

The hostname is set; however, it resolves to 127.0.0.1 as per /etc/hosts:

[vagrant@ol7guest ~]$ cat /etc/hosts
127.0.0.1    ol7guest    ol7guest
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6 

Not quite what I had in mind, but apparently expected behaviour. So the next step is to change the first line in /etc/hosts to match the private IP address I assigned to the second NIC. As an Ansible fan I am naturally leaning towards using a playbook, but I also understand that not everyone has Ansible installed on the host and using the ansible_local provisioner might take longer than necessary unless your box has Ansible pre-installed.

The remainder of this post deals with an Ansible solution and the least common denominator, the shell provisioner.

Using an Ansible playbook

Many times I’m using Ansible playbooks to deploy software to Vagrant VMs anyway, so embedding a little piece of code into my playbooks to change /etc/hosts isn’t a lot of work. The first step is to amend the Vagrantfile to reference the Ansible provisioner. One possible way to do this in the context of my example is this:

# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant.configure("2") do |config|
  config.vm.define "ol7guest" do |u|
    # this is a base box I created and stored locally
    u.vm.box = "oracleLinux7Base"

    u.ssh.private_key_path = "/path/to/key"

    u.vm.hostname = "ol7guest"
    u.vm.network "private_network", ip: "192.168.56.204"

    u.vm.provision "ansible" do |ansible|
      ansible.playbook = "change_etc_hosts.yml"
      ansible.verbose = "v"
    end

    u.vm.provider "virtualbox" do |v|
      v.memory = 2048
      v.name = "ol7guest"
      v.cpus = 1
    end
  end
end  

It is mostly the same file with the addition of the call to Ansible. As you can imagine the playbook is rather simple:

---
- hosts: ol7guest
  become: yes
  tasks:
  - name: change /etc/hosts
    lineinfile:
      path: '/etc/hosts'
      regexp: '.*ol7guest.*' 
      line: '192.168.56.204   ol7guest.example.com   ol7guest' 
      backup: yes

It uses the lineinfile module to find the line containing ol7guest and replace it with the “correct” IP address. The resulting hosts file is exactly what I need:

[vagrant@ol7guest ~]$ cat /etc/hosts
192.168.56.204   ol7guest.example.com   ol7guest
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
[vagrant@ol7guest ~]$ 

The first line of the original file has been replaced with the private IP which should enable OUI to progress past this potential stumbling block.
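
A quick query from within the guest confirms the new mapping; getent consults the system’s resolver libraries:

[vagrant@ol7guest ~]$ getent hosts ol7guest
192.168.56.204  ol7guest.example.com ol7guest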

Using the shell provisioner

The second solution involves the shell provisioner, which – unlike Ansible – isn’t distribution agnostic and needs to be tailored to the target platform. On Oracle Linux, the following worked for me:

# -*- mode: ruby -*-
# vi: set ft=ruby :

$script = <<-SCRIPT
/usr/bin/cp /etc/hosts /root && \
/usr/bin/sed -i '/ol7guest/d' /etc/hosts && \
/usr/bin/echo '192.168.56.204 ol7guest.example.com ol7guest' >> /etc/hosts
SCRIPT

Vagrant.configure("2") do |config|
  config.vm.define "ol7guest" do |u|
    # this is a base box I created and stored locally
    u.vm.box = "oracleLinux7Base"

    u.ssh.private_key_path = "/path/to/key"

    u.vm.hostname = "ol7guest"
    u.vm.network "private_network", ip: "192.168.56.204"

    u.vm.provision "shell", inline: $script

    u.vm.provider "virtualbox" do |v|
      v.memory = 2048
      v.name = "ol7guest"
      v.cpus = 1
    end
  end
end 

The script copies /etc/hosts to root’s home directory as a backup and then changes the original to match my needs. At the end, the file is in exactly the shape I need it to be in.

Summary

Whether you go with the shell provisioner or embed the change to /etc/hosts in an (existing) Ansible playbook doesn’t matter much. I would definitely argue in support of having the code embedded in a playbook if that’s what will provision additional software anyway. If installing Ansible on the host isn’t an option, using the shell as a fallback mechanism is perfectly fine, too. Happy hacking!

Tips’n’tricks: finding the (injected) private key pair used in Vagrant boxes

In an earlier article I described how you could use SSH keys to log into a Vagrant box created by the Virtualbox provider. The previous post emphasised my preference for using custom Vagrant boxes and my own SSH keys.

Nevertheless there are occasions when you can’t create your own Vagrant box, and you have to resort to the Vagrant insecure-key-pair-swap procedure instead. If you are unsure about these security-related discussion points, review the documentation about creating one’s own Vagrant boxes (section “Default User Settings”) for some additional background information.

Continuing the discussion from the previous post, what does a dynamically injected SSH key imply for use with the SSH agent?

Vagrant cloud, boxes, and the insecure key pair

Let’s start with an example to demonstrate the case. I have decided to use the latest Ubuntu 16.04 box from HashiCorp’s Vagrant cloud for no particular reason. In hindsight I should have gone for 18.04 instead, as it’s much newer. For the purpose of this post it doesn’t really matter though.

$ vagrant up ubuntu
Bringing machine 'ubuntu' up with 'virtualbox' provider...
==> ubuntu: Importing base box 'ubuntu/xenial64'...
==> ubuntu: Matching MAC address for NAT networking...
==> ubuntu: Checking if box 'ubuntu/xenial64' version '20191204.0.0' is up to date...
==> ubuntu: Setting the name of the VM: ubuntu
==> ubuntu: Fixed port collision for 22 => 2222. Now on port 2200.
==> ubuntu: Clearing any previously set network interfaces...
==> ubuntu: Preparing network interfaces based on configuration...
    ubuntu: Adapter 1: nat
    ubuntu: Adapter 2: hostonly
==> ubuntu: Forwarding ports...
    ubuntu: 22 (guest) => 2200 (host) (adapter 1)
==> ubuntu: Running 'pre-boot' VM customizations...
==> ubuntu: Booting VM...
==> ubuntu: Waiting for machine to boot. This may take a few minutes...
    ubuntu: SSH address: 127.0.0.1:2200
    ubuntu: SSH username: vagrant
    ubuntu: SSH auth method: private key
    ubuntu: 
    ubuntu: Vagrant insecure key detected. Vagrant will automatically replace
    ubuntu: this with a newly generated keypair for better security.
    ubuntu: 
    ubuntu: Inserting generated public key within guest...
    ubuntu: Removing insecure key from the guest if it's present...
    ubuntu: Key inserted! Disconnecting and reconnecting using new SSH key...
==> ubuntu: Machine booted and ready!
==> ubuntu: Checking for guest additions in VM...
    ubuntu: The guest additions on this VM do not match the installed version of
    ubuntu: VirtualBox! In most cases this is fine, but in rare cases it can
    ubuntu: prevent things such as shared folders from working properly. If you see
    ubuntu: shared folder errors, please make sure the guest additions within the
    ubuntu: virtual machine match the version of VirtualBox you have installed on
    ubuntu: your host and reload your VM.
    ubuntu: 
    ubuntu: Guest Additions Version: 5.1.38
    ubuntu: VirtualBox Version: 6.0
==> ubuntu: Setting hostname...
==> ubuntu: Mounting shared folders...
    ubuntu: /vagrant => /home/martin/vagrant/ubunutu 

This started my “ubuntu” VM (I don’t like it when my VMs are called “default”, so I tend to give them better designations):

$ vboxmanage list vms | grep ubuntu
"ubuntu" {a507ba0c-...24bb} 

You may have noticed that two network interfaces are brought online in the output created by vagrant up. This is done to stay in line with the story of the previous post; it’s not strictly speaking necessary.

The key message in the context of this blog post, found in the logs, is this:

    ubuntu: SSH auth method: private key
    ubuntu: 
    ubuntu: Vagrant insecure key detected. Vagrant will automatically replace
    ubuntu: this with a newly generated keypair for better security.
    ubuntu: 
    ubuntu: Inserting generated public key within guest...
    ubuntu: Removing insecure key from the guest if it's present...
    ubuntu: Key inserted! Disconnecting and reconnecting using new SSH key... 

As you can read, the insecure key was detected and replaced. But where can I find the new key?

Locating the new private key

This took me a little while to find out, and I’m hoping this post saves you a minute. The key information (drum roll please) can be found in the output of vagrant ssh-config:

$ vagrant ssh-config ubuntu
Host ubuntu
  HostName 127.0.0.1
  User vagrant
  Port 2200
  UserKnownHostsFile /dev/null
  StrictHostKeyChecking no
  PasswordAuthentication no
  IdentityFile /home/martin/vagrant/ubunutu/.vagrant/machines/ubuntu/virtualbox/private_key
  IdentitiesOnly yes
  LogLevel FATAL 

This contains all the information you need to SSH into the machine! It doesn’t print information about the second NIC though; that’s ok as I can always look at its details in the Vagrantfile itself.

Connection!

Using the information from above, I can connect to the system using either port 2200 (forwarded on the NAT device), or the private IP (which is 192.168.56.204 and has not been shown here):

$ ssh -p 2200 \
> -i /home/martin/vagrant/ubunutu/.vagrant/machines/ubuntu/virtualbox/private_key \
> vagrant@localhost hostname
ubuntu

$ ssh -i /home/martin/vagrant/ubunutu/.vagrant/machines/ubuntu/virtualbox/private_key \
> vagrant@192.168.56.204 hostname
ubuntu 

This should be all you need to get cracking with the Vagrant box. But wait! The full path to the key is somewhat lengthy, and that makes it a great candidate for storing it with the SSH agent. That’s super-easy, too:

$ ssh-add /home/martin/vagrant/ubunutu/.vagrant/machines/ubuntu/virtualbox/private_key
Identity added: /home/martin/vagrant/ubunutu/.vagrant/machines/ubuntu/virtualbox/private_key (/home/martin/vagrant/ubunutu/.vagrant/machines/ubuntu/virtualbox/private_key)

Apologies for the formatting. But it was worth it!

$ ssh vagrant@192.168.56.204 hostname
ubuntu

That’s a lot less typing than before…

By the way, it should be easy to spot this key in the output of ssh-add -l as it’s most likely the one with the longest path. If that doesn’t help you identify the key, ssh-keygen -lf /path/to/key prints the key’s fingerprint, for which you can grep in the output of ssh-add -l.
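
Putting the two together, something like this should tell you whether the key is loaded; grep -F is used since fingerprints can contain regular expression metacharacters:

$ ssh-add -l | grep -F "$(ssh-keygen -lf /path/to/key | awk '{print $2}')"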

Have fun!

Tips’n’tricks: understanding “too many authentication failures” in SSH

Virtualbox VMs powered by Vagrant require authentication via SSH keys so you don’t have to provide a password each time vagrant up is doing its magic. Provisioning tools you run as part of the vagrant up command also rely on the SSH key based authentication to work properly. This is documented in the official Vagrant documentation set.

I don’t want to use unknown SSH keys with my own Vagrant boxes as a matter of principle. Whenever I create a new custom box I resort to a dedicated SSH key I’m using just for this purpose. This avoids the trouble with Vagrant’s “insecure key pair”, all I need to do is add config.ssh.private_key_path = "/path/to/key" to the Vagrantfile.

The documentation further states that I have to use a NAT device as the first network card in the VM. For some of my VMs I define an additional NIC using a host-only, private network for communication between, say, the middle tier and the database layer. I don’t want to mess around with port forwarding to enable communication between my VMs, and Vagrant makes it super easy to define another NIC.

This sounds interesting, but what does that have to do with this post?

Please bear with me, I’m building up a story ;) It will all make sense in a minute…

Connecting to the VM’s second interface

With all that in place it’s easy to SSH into my Vagrant box. Assume I have a Vagrant VM with an IP address of 192.168.56.202 to which I want to connect via SSH. Remember when I said I have a dedicated SSH key for my Vagrant boxes? The SSH key is stored in ~/.ssh/vagrant. The SSH command to connect to the environment is simple:

$ ssh -i ~/.ssh/vagrant vagrant@192.168.56.202

… and this connects me without having to provide a password.

Saving time for the lazy

Providing the path to the SSH key to use gets a little tedious after a while. There are a couple of solutions to this; there might be more, but I only know about these two:

  • Create a configuration in ~/.ssh/config. Except that doesn’t work particularly well with keys for which you defined a passphrase, as you now have to enter the passphrase each time
  • Add the key to the SSH agent

On Linux and MacOS I prefer the second method, especially since I’m relying on passphrases quite heavily. Recently I encountered a problem with this approach, though. When trying to connect to the VM, I received the following error message:

$ ssh vagrant@192.168.56.202
Received disconnect from 192.168.56.202 port 22:2: Too many authentication failures
Disconnected from 192.168.56.202 port 22

What’s that all about? I am sure I have the necessary key added to the agent:

$ ssh-add -l | grep -c vagrant
1

Well it turns out that if you have too many non-matching keys, you can run into the pre-authentication problem like I did. The first step in troubleshooting SSH connections (at least for me) is to enable the verbose option:

$ ssh -v vagrant@192.168.56.202

[ ... more detail ... ]

debug1: Will attempt key: key1 ... redacted ... agent
debug1: Will attempt key: key10 ... redacted ... agent
debug1: Will attempt key: key2 ... redacted ... agent
debug1: Will attempt key: key3 ... redacted ... agent
debug1: Will attempt key: key4 ... redacted ... agent
debug1: Will attempt key: key5 ... redacted ... agent
debug1: Will attempt key: key6 ... redacted ... agent
debug1: Will attempt key: key7 ... redacted ... agent
debug1: Will attempt key: key8 ... redacted ... agent
debug1: Will attempt key: key9 ... redacted ... agent

[ ... ]

debug1: Next authentication method: publickey
debug1: Offering public key: key1 ... redacted ... agent
debug1: Authentications that can continue: publickey,gssapi-keyex,gssapi-with-mic,password

[...]

debug1: Offering public key: key5 ... redacted ... agent
debug1: Authentications that can continue: publickey,gssapi-keyex,gssapi-with-mic,password
Received disconnect from 192.168.56.202 port 22:2: Too many authentication failures
Disconnected from 192.168.56.202 port 22

It is my understanding that SSH is querying the agent for SSH keys, and it receives them. After trying key1 through key5 and not finding a match, it decides to stop and returns said error message.

There are quite a few keys currently added to my running agent:

$ ssh-add -l | wc -l
11

The Solution

The solution is quite straightforward: I keep storing keys with the agent, but I have to indicate which of the stored keys to use when logging in to my VM. This is probably best done in ~/.ssh/config:

$ cat ~/.ssh/config 

Host 192.168.56.202
    IdentityFile ~/.ssh/vagrant
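
If you want to make doubly sure that only this key is offered to the VM, the IdentitiesOnly option should help, as it stops the client from cycling through every key the agent holds:

Host 192.168.56.202
    IdentityFile ~/.ssh/vagrant
    IdentitiesOnly yes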

In summary, I’m now using a combination of the two approaches I outlined above to great effect: I can log in without having to worry about which keys are stored by my agent, or the order in which they are stored.

Ansible Tips’n’tricks: rebooting Vagrant boxes after a kernel upgrade

Occasionally I have to reboot my Vagrant boxes after kernel updates have been installed as part of an Ansible playbook run during “vagrant up”.

I create my own Vagrant base boxes because that’s more convenient for me than pulling them from Vagrant’s cloud. However they, too, need TLC and updates. So long story short, I run a yum upgrade after spinning up Vagrant boxes in Ansible to have access to the latest and greatest (and hopefully most secure) software.

To stay in line with Vagrant’s philosophy, Vagrant VMs are lab and playground environments I create quickly. And I can dispose of them equally quickly, because all that I’m doing is controlled via code. This isn’t something you’d do with Enterprise installations!

Vagrant and Ansible for lab VMs!

Now how do you reboot a Vagrant controlled VM in Ansible? Here is how I’m doing this for VirtualBox 6.0.14 and Vagrant 2.2.6. Ubuntu 18.04.3 comes with Ansible 2.5.1.

Finding out if a kernel upgrade is needed

My custom Vagrant boxes are all based on Oracle Linux 7 and use UEK as the kernel of choice. That is important because it determines how I can find out whether yum upgraded the kernel (i.e. UEK) as part of a “yum upgrade”.

There are many ways to do so; I have been using the following code snippet with some success:

  - name: check if we need to reboot after a kernel upgrade
    shell: if [ $(/usr/bin/rpm -q kernel-uek | /usr/bin/tail -n 1) != kernel-uek-$(uname -r) ]; then /usr/bin/echo 'reboot'; else /usr/bin/echo 'no'; fi
    register: must_reboot

So in other words I compare the last line from rpm -q kernel-uek to the name of the running kernel. If they match – all good. If they don’t, it seems there is a newer kernel-uek* RPM on disk than that of the running kernel. If the variable “must_reboot” contains “reboot”, I guess I have to reboot.

Rebooting

Ansible introduced a reboot module recently; however, my Ubuntu 18.04 system’s Ansible version is too old for that and I wanted to stay with the distribution’s package. I needed an alternative.

There are lots of code snippets out there to reboot systems in Ansible, but none of them worked for me. So I decided to write the process up in this post :)

The following block worked for my very specific setup:

  - name: reboot if needed
    block:
    - shell: sleep 5 && systemctl reboot
      async: 300
      poll: 0
      ignore_errors: true

    - name: wait for system to come back online
      wait_for_connection:
        delay: 60
        timeout: 300
    when: '"reboot" in must_reboot.stdout'

This works nicely with the systems I’m using.

Except there’s a catch lurking in the code: when installing Oracle, the software is made available via Virtualbox’s shared folders as defined in the Vagrantfile. When rebooting a Vagrant box outside the Vagrant interface (e.g. not using the vagrant reload command), shared folders aren’t mounted automatically. In other words, my playbook will fail trying to unzip binaries because it can’t find them. Which isn’t what I want. To circumvent this situation I add the following instruction into the block you just saw:

    - name: re-mount the shared folder after a reboot
      mount:
        path: /mnt
        # for vboxsf, src is the name of the VirtualBox share
        src: mnt
        fstype: vboxsf
        state: mounted

This re-mounts my shared folder, and I’m good to go!
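
For context, the share corresponds to a synced folder definition along these lines in the Vagrantfile (the host path is a placeholder; with the VirtualBox provider the share name is derived from the guest path, so “/mnt” becomes “mnt”):

# hypothetical excerpt from the Vagrantfile
config.vm.synced_folder "/path/to/oracle/software", "/mnt"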

Summary

Before installing Oracle software in Vagrant for lab and playground use I always want to make sure I have all the latest and greatest patches installed as part of bringing a Vagrant box online for the first time.

Using Ansible I can automate the entire process from start to finish, even including kernel updates in the process. These are applied before I install the Oracle software!

Upgrading the kernel (or any other software components for that matter) post Oracle installation is more involved, and I usually don’t need to do this during the lifetime of the Vagrant (playground/lab) VM. Which is why Vagrant is beautiful, especially when used together with Ansible.

Ansible tips’n’tricks: executing a loop conditionally

When writing playbooks, I occasionally add optional tasks. These tasks are only executed if a corresponding configuration variable is defined. I usually define configuration variables either in group_vars/* or in the role’s defaults directory (roles/roleName/defaults/).

The “when” keyword can be used to test for the presence of a variable and execute a task only if the condition evaluates to “true”. However this isn’t always straightforward, and recently I stumbled across some interesting behaviour that I found worth mentioning. I would like to point out that I’m merely an Ansible enthusiast, and by no means a pro. In case there is a better way to do this, please let me know and I’ll update the post :)

Before showing you my code, I’d like to add a little bit of detail here in case someone finds this post via a search engine:

  • Ansible version: ansible 2.8.2
  • Operating system: Fedora 29 on Linux x86-64

The code

This is the initial code I started with:

$ tree
.
├── inventory.ini
├── roles
│   └── example
│       ├── defaults
│       │   └── main.yml
│       └── tasks
│           └── main.yml
└── variables.yml

4 directories, 4 files

$ nl variables.yml 
      1  ---
      2  - hosts: blogpost
      3    become: yes
      4    roles:
      5    - example

$ nl roles/example/defaults/main.yml 
     1  #
     2  # some variables
     3  #

     4  oracle_disks: ''

$ nl roles/example/tasks/main.yml
     1  ---
     2  - name: print length of oracle_disks variable
     3    debug: 
     4      msg: "The variable has a length of {{ oracle_disks | length }}"

     5  - name: format disk devices
     6    parted:
     7      device: "{{ item }}"
     8      number: 1
     9      state: present
    10      align: optimal
    11      label: gpt
    12    loop: "{{ oracle_disks }}"
    13    when: oracle_disks | length > 0

This will not work, as you can see in a minute.

The error

And indeed, the execution of my playbook (variables.yml) failed:

$ ansible-playbook -vi inventory.ini variables.yml 
Using /etc/ansible/ansible.cfg as config file

PLAY [blogpost] ******************************************************************************************************

TASK [Gathering Facts] ***********************************************************************************************
ok: [server6]

TASK [example : print length of oracle_disks variable] ***************************************************************
ok: [server6] => {}

MSG:

The variable has a length of 0


TASK [example : format disk devices] *********************************************************************************
fatal: [server6]: FAILED! => {}

MSG:

Invalid data passed to 'loop', it requires a list, got this instead: . 
Hint: If you passed a list/dict of just one element, try adding wantlist=True 
to your lookup invocation or use q/query instead of lookup.


PLAY RECAP ***********************************************************************************************************
server6                    : ok=2    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0

The intention was to not execute the task named “format disk devices” if oracle_disks has a length of 0. That check is evaluated too late though: when “when” is combined with “loop”, Ansible evaluates the condition for each item, so it has to expand the loop first, and the empty string is not a list it can loop over. The length test also turned out to be the wrong check anyway. I tried various permutations of the scheme, but none were successful while oracle_disks was set to the empty string. Which is wrong, but please bear with me …
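
As an aside: had I insisted on keeping the empty string as the “not set” marker, Jinja2’s two-argument default() filter would have worked around it, since setting its second argument to true makes the filter substitute the default for falsy values such as '' as well. A sketch of that variant, not what I ended up using:

- name: format disk devices
  parted:
    device: "{{ item }}"
    number: 1
    state: present
    align: optimal
    label: gpt
  # default([], true) also replaces falsy values like '' with the empty list
  loop: "{{ oracle_disks | default([], true) }}"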

No errors with meaningful values

The loop syntax in the role’s tasks/main.yml file is correct though: once I set the variable to a proper list, it worked:

$ nl roles/example/defaults/main.yml
      1  #
      2  # some variables
      3  #
        
      4  oracle_disks: 
      5  - /dev/vdc
      6  - /dev/vdd

$ ansible-playbook -vi inventory.ini variables.yml
Using /etc/ansible/ansible.cfg as config file

PLAY [blogpost] ******************************************************************************************************

TASK [Gathering Facts] ***********************************************************************************************
ok: [server6]

TASK [example : print length of oracle_disks variable] ***************************************************************
ok: [server6] => {}

MSG:

The variable has a length of 2

TASK [example : format disk devices] *********************************************************************************
changed: [server6] => (item=/dev/vdc) => {
    "ansible_loop_var": "item",
    "changed": true,
    "disk": {
        "dev": "/dev/vdc",
        "logical_block": 512,
        "model": "Virtio Block Device",
        "physical_block": 512,
        "size": 10485760.0,
        "table": "gpt",
        "unit": "kib"
    },
    "item": "/dev/vdc",
    "partitions": [
        {
            "begin": 1024.0,
            "end": 10484736.0,
            "flags": [],
            "fstype": "",
            "name": "primary",
            "num": 1,
            "size": 10483712.0,
            "unit": "kib"
        }
    ],
    "script": "unit KiB mklabel gpt mkpart primary 0% 100%"
}
changed: [server6] => (item=/dev/vdd) => {
    "ansible_loop_var": "item",
    "changed": true,
    "disk": {
        "dev": "/dev/vdd",
        "logical_block": 512,
        "model": "Virtio Block Device",
        "physical_block": 512,
        "size": 10485760.0,
        "table": "gpt",
        "unit": "kib"
    },
    "item": "/dev/vdd",
    "partitions": [
        {
            "begin": 1024.0,
            "end": 10484736.0,
            "flags": [],
            "fstype": "",
            "name": "primary",
            "num": 1,
            "size": 10483712.0,
            "unit": "kib"
        }
    ],
    "script": "unit KiB mklabel gpt mkpart primary 0% 100%"
}

PLAY RECAP ***********************************************************************************************************
server6                    : ok=3    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

So what gives? It once more goes to show that as soon as you do things right, they start working.

Checking if a variable is defined

How can I prevent the task from being executed? There are probably a great many ways of achieving this goal; I learned that simply not defining oracle_disks works for me. Here I’m commenting out all references to the variable before trying again:

$ cat roles/example/defaults/main.yml 
#
# some variables
#

#oracle_disks: 
#- /dev/vdc
#- /dev/vdd

$ cat roles/example/tasks/main.yml 
---
- name: print length of oracle_disks variable
  debug: 
    msg: "The variable has a length of {{ oracle_disks | length }}"
  when: oracle_disks is defined

- name: format disk devices
  parted:
    device: "{{ item }}"
    number: 1
    state: present
    align: optimal
    label: gpt
  loop: "{{ oracle_disks }}" 
  when: oracle_disks is defined

$ ansible-playbook -vi inventory.ini variables.yml 
Using /etc/ansible/ansible.cfg as config file

PLAY [blogpost] ******************************************************************************************************

TASK [Gathering Facts] ***********************************************************************************************
ok: [server6]

TASK [example : print length of oracle_disks variable] ***************************************************************
skipping: [server6] => {}

TASK [example : format disk devices] *********************************************************************************
skipping: [server6] => {
    "changed": false,
    "skip_reason": "Conditional result was False"
}

PLAY RECAP ***********************************************************************************************************
server6                    : ok=1    changed=0    unreachable=0    failed=0    skipped=2    rescued=0    ignored=0 

With the variable not defined, the task is skipped as intended.

As you read earlier, using the empty string ('') isn’t the right way to mark a variable as “empty”. I guess this is where my other programming languages influenced me a bit (cough * perl * cough). The proper way to indicate there are no items in the list (as per the documentation) is an empty list:

$ nl roles/example/defaults/main.yml 
     1  #
     2  # some variables
     3  #

     4  oracle_disks: []

$ nl roles/example/tasks/main.yml 
     1  ---
     2  - name: print length of oracle_disks variable
     3    debug: 
     4      msg: "The variable has a length of {{ oracle_disks | length }}"
     5    when: oracle_disks is defined

     6  - name: format disk devices
     7    parted:
     8      device: "{{ item }}"
     9      number: 1
    10      state: present
    11      align: optimal
    12      label: gpt
    13    loop: "{{ oracle_disks | default([]) }}" 

The default() assignment in tasks/main.yml line 13 shouldn’t strictly be necessary given the assignment in defaults/main.yml line 4, but it doesn’t hurt either. Instead of a “skipping” message you will see the task executed, but since there is nothing to loop over, it finishes straight away:

$ ansible-playbook -vi inventory.ini variables.yml 
Using /etc/ansible/ansible.cfg as config file

PLAY [blogpost] ***********************************************************************************************************************************************************************

TASK [Gathering Facts] ****************************************************************************************************************************************************************
ok: [server6]

TASK [example : print length of oracle_disks variable] ********************************************************************************************************************************
ok: [server6] => {}

MSG:

The variable has a length of 0


TASK [example : format disk devices] **************************************************************************************************************************************************

PLAY RECAP ****************************************************************************************************************************************************************************
server6                    : ok=2    changed=0    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0

Happy coding!

Ansible tips’n’tricks: checking if a systemd service is running

I have been working on an Ansible playbook to update Oracle’s Tracefile Analyser (TFA). If you have been following this blog over the past few months you might remember that I’m a great fan of the tool! Using Ansible makes my life a lot easier: when deploying a new system I can ensure that I’m also installing TFA. Under normal circumstances, TFA should be present when the (initial) deployment playbook finishes. At least in theory.

As we know, life is what happens when you’re making other plans, and I’d rather check whether TFA is installed/configured/running before trying to upgrade it. The command to upgrade TFA is different from the command I use to deploy it.

I have considered quite a few different ways to do this but in the end decided to check for the oracle-tfa service: if the service is present, TFA must be as well. There are probably other ways, maybe better ones, but this one works for me.
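
On the command line, the equivalent spot check would be something along these lines (assuming a systemd-based system):

$ systemctl list-unit-files | grep -i oracle-tfa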

Checking for the presence of a service

Ansible has offered a module called service_facts since version 2.5 to facilitate working with services. (I also tried the setup module but didn’t find what I needed there.) Consider the following output, generated on Oracle Linux 7.6 when gathering service facts:

TASK [get service facts] *******************************************************
ok: [localhost] => {
    "ansible_facts": {
        "services": {
            "NetworkManager-wait-online.service": {
                "name": "NetworkManager-wait-online.service",
                "source": "systemd",
                "state": "stopped"
            },
            "NetworkManager.service": {
                "name": "NetworkManager.service",
                "source": "systemd",
                "state": "running"
            },
            "auditd.service": {
                "name": "auditd.service",
                "source": "systemd",
                "state": "running"
            },

[ many more services ]

            "oracle-tfa.service": {
                "name": "oracle-tfa.service",
                "source": "systemd",
                "state": "running"
            },

[ many more services ]

This looks ever so slightly complicated! And indeed, it took a little while to work out the syntax. My first attempts were unsuccessful.

Getting the syntax right

Thankfully I wasn’t the only one with this problem, and with a little bit of research I ended up with this code:

---
- hosts: localhost
  connection: local
  become: true

  tasks:
  - name: get service facts
    service_facts:

  - name: try to work out how to access the service
    debug:
      var: ansible_facts.services["oracle-tfa.service"]

Awesome! When running this on a system with TFA installed, it works quite nicely:

TASK [try to work out how to access the service] *******************************
ok: [localhost] => {
    "ansible_facts.services[\"oracle-tfa.service\"]": {
        "name": "oracle-tfa.service",
        "source": "systemd",
        "state": "running"
    }
}

PLAY RECAP *********************************************************************
localhost                  : ok=3    changed=0    unreachable=0    failed=0

The same code fails on a system without TFA installed:

TASK [try to work out how to access the service] *******************************
ok: [localhost] => {
    "ansible_facts.services[\"oracle-tfa.service\"]": "VARIABLE IS NOT DEFINED!
     'dict object' has no attribute 'oracle-tfa.service'"
}

PLAY RECAP *********************************************************************
localhost                  : ok=3    changed=0    unreachable=0    failed=0

Now the trick is to ensure that I’m not referencing an undefined variable. This isn’t too hard either; here is a usable playbook:

---
- hosts: localhost
  connection: local

  tasks:
  - name: get service facts
    service_facts:

  - name: check if TFA is installed
    fail:
      msg: Tracefile Analyzer is not installed, why? It should have been there!
    when: ansible_facts.services["oracle-tfa.service"] is not defined

The tasks first gather service facts, then test for the presence of oracle-tfa.service. I deliberately fail the playbook to make the user aware of a situation that should not have happened.
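
Conversely, if you would rather branch than abort, the same test works in the opposite direction. A minimal sketch with a placeholder task standing in for the actual upgrade command:

  - name: upgrade TFA if the service is present
    debug:
      msg: the TFA upgrade command would run here
    when: ansible_facts.services["oracle-tfa.service"] is defined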

Hope this helps!