Category Archives: Cloud

Avoid “Warning: Additional provider information from registry” for OCI Terraform Provider

After updating my main development workstation to Fedora 36, including all the tools I regularly use, I noticed a change when working with Terraform code. The call to terraform init succeeded but was accompanied by a warning:

$ terraform version -no-color
Terraform v1.2.3
on linux_amd64
$ terraform init -no-color

Initializing the backend...

Initializing provider plugins...
- Finding latest version of hashicorp/oci...
- Installing hashicorp/oci v4.80.1...
- Installed hashicorp/oci v4.80.1 (signed by HashiCorp)

Terraform has created a lock file .terraform.lock.hcl to record the provider
selections it made above. Include this file in your version control repository
so that Terraform can guarantee to make the same selections by default when
you run "terraform init" in the future.


Warning: Additional provider information from registry

The remote registry returned warnings for registry.terraform.io/hashicorp/oci:
- For users on Terraform 0.13 or greater, this provider has moved to oracle/oci. 
  Please update your source in required_providers.

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.

The “What’s new” section in the OCI Terraform Provider documentation mentions this change. It also describes how to switch the provider’s source to avoid this warning.

So here is what I did. I tend to put my provider details into main.tf, so this seemed like the best place to put the required_providers section:

provider "oci" {
  tenancy_ocid         = var.tenancy_ocid
  user_ocid            = var.user_ocid
  fingerprint          = var.key_fingerprint
  private_key_path     = var.private_key_path
  private_key_password = var.private_key_password
  region               = var.oci_region
}

terraform {
  required_providers {
    oci = {
      source  = "oracle/oci"
      version = ">= 4.0.0"
    }
  }
}
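In my case a fresh lock file was created, as you can see in the init output above. A hedged tip in case your working directory still contains a .terraform.lock.hcl recording hashicorp/oci from earlier runs: terraform init's -upgrade flag re-evaluates the provider selections against the new source, which may save you from manually removing the lock file.

$ terraform init -upgrade -no-color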

After adding the new terraform block I managed to use the oracle/oci provider and avoid the warning. The OCI provider version 4.80.1 was current at the time of writing.

$ terraform init -no-color

Initializing the backend...

Initializing provider plugins...
- Finding oracle/oci versions matching ">= 4.0.0"...
- Installing oracle/oci v4.80.1...
- Installed oracle/oci v4.80.1 (signed by a HashiCorp partner, key ID 1533A49284137CEB)

Partner and community providers are signed by their developers.
If you'd like to know more about provider signing, you can read about it here:
https://www.terraform.io/docs/cli/plugins/signing.html

Terraform has created a lock file .terraform.lock.hcl to record the provider
selections it made above. Include this file in your version control repository
so that Terraform can guarantee to make the same selections by default when
you run "terraform init" in the future.

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.

And indeed, Terraform will now use the oracle/oci provider:

$ terraform version -no-color
Terraform v1.2.3
on linux_amd64
+ provider registry.terraform.io/oracle/oci v4.80.1

Happy automating!

Retrieving passwords from OCI Vault for use in Terraform

This post is intended to complement the excellent “A comprehensive guide to managing secrets in your Terraform code” by Yevgeniy Brikman. Its aim is to detail how Oracle Cloud Infrastructure Vault (OCI Vault) can be used to securely store credentials and subsequently use them in Terraform scripts.

If you haven’t done so already, I recommend reading Yevgeniy’s post to get some background information as to why storing passwords anywhere in code, even dot-configuration files, is a Truly Bad Idea. This article provides an example for his third technique: using a dedicated secrets store.

Never, ever, store any credentials in code. Just . don’t . do it. It’s a disaster waiting to happen

– every security conscious person, always

Standard disclaimer: please be advised that creating cloud resources most likely costs you money, and keeping them running even more so. Don’t create any cloud resources unless you are authorised to spend that money and know about the implications of creating the resources mentioned in this post.

The problem with the Terraform state file

Whilst using OCI Vault for storing and retrieving secrets is without a doubt a great step towards safer code management, there is still an unsolved issue with Terraform: the state file is considered sensitive information by HashiCorp at the time of writing (2022-05-30). When using the local backend (i.e. the default), passwords and other sensitive information are stored in clear text in a JSON file. Storing sensitive information in clear text is very much counter-productive to the article’s goals. Alternative backends providing encryption at rest are most likely better suited. Please ensure you remain compliant with your IT security department’s policies regarding the Terraform state file.

Overview

In this article you can read how to create an Autonomous Database (ADB) instance using a tiny Terraform script. Compared to some other tutorials about the subject you won’t find the ADMIN password provided in the code.

Rather than providing the ADB instance’s ADMIN password as an environment variable, the password is retrieved from an OCI Vault secret and passed to the ADB resource. The ADB instance is just one potential use case for using OCI Vault in Terraform: anywhere secrets need to be used to create/maintain resources, the technique detailed for ADB applies as well.

Secrets in the context of OCI Vault are credentials such as passwords, certificates, SSH keys, or authentication tokens that you use with Oracle Cloud Infrastructure services. An OCI Vault Secret cannot be looked up as such: secrets are wrapped into what’s referred to as a secret bundle. A secret bundle consists of the secret contents, properties of the secret and secret version (such as version number or rotation state), and user-provided contextual metadata for the secret.

To keep this article short-ish, it is assumed that a secret has already been created and its Oracle Cloud Identifier (OCID) is known. The secret’s OCID is passed to the Terraform script via a variable.
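As an aside, the corresponding variable declaration can be as simple as the following sketch; the variable name matches the data source shown below, the description text is mine:

variable "secret_ocid" {
  type        = string
  description = "OCID of the OCI Vault secret storing the ADB ADMIN password"
}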

An Autonomous Database instance is perfectly suited to demonstrate the use of a Terraform Data Source for looking up vault secrets as it does not require any supporting resources such as Virtual Cloud Networks, or any elaborate network security settings. The Terraform script will create a publicly accessible ADB instance protected by an Access Control List (ACL) allowing only specific IP addresses to connect. Furthermore, mutual TLS is enabled for even stronger security.

Using an OCI Vault Secret

Lookup operations in Terraform are performed using Data Sources. There are data sources for most cloud resources, including the aforementioned secret bundle. Provided the secret’s OCID is passed via a variable, the lookup using an oci_secrets_secretbundle data source could be performed as follows:

data "oci_secrets_secretbundle" "bundle" {

  secret_id = var.secret_ocid
}

Thankfully the OCI Terraform provider is smart enough to retrieve the current, active version of the secret. Once the secret has been retrieved, it can be used for the creation of an ADB instance. Since secrets are base64 encoded, they have to be decoded before they can be used. The following snippet demonstrates the use of the data source inside the ADB resource:

resource "oci_database_autonomous_database" "demo_adb_21c" {
  compartment_id              = var.compartment_ocid
  db_name                     = "DEMO"
  admin_password              = base64decode(data.oci_secrets_secretbundle.bundle.secret_bundle_content.0.content)
  cpu_core_count              = 1
  data_storage_size_in_tbs    = 1
  db_version                  = "21c"
  db_workload                 = "OLTP"
  display_name                = "ADB Free Tier 21c"
  is_free_tier                = true
  is_mtls_connection_required = true
  ocpu_count                  = 1
  whitelisted_ips             = var.allowed_ip_addresses
}

A call to terraform plan followed by a terraform apply will initiate the creation of the ADB instance. As long as the admin password complies with the password complexity rules of the ADB resource, the database will be created. Once its lifecycle status has changed to running, the database will be accessible to the IP addresses specified in var.allowed_ip_addresses (a list of strings). Should you invoke the Terraform script from a Linux shell, this might be a way to set the variable:

$ export TF_VAR_allowed_ip_addresses='[ "1.2.3.4", "4.5.6.7" ]'
$ terraform plan -out myplan
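For reference, a matching declaration for the ACL variable might look like this sketch; the name is taken from the resource above and the type from the description:

variable "allowed_ip_addresses" {
  type        = list(string)
  description = "IP addresses allowed to connect to the ADB instance"
}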

Summary

Using OCI Vault to store sensitive information is a secure way to mitigate many password-handling problems. The Terraform state file remains a concern, especially when using the local backend as it stores all information in clear text. The IT security department should be consulted as to how this potential security vulnerability should be treated. Backends other than the local backend exist and might suit the IT security team’s needs better.

Once a Vault secret has been looked up, it can be used in any Terraform resource. Referencing data sources should lead to more secure code deployments.

Happy Automating!

Linking Containers with Podman

Users of the Docker engine might find that their container runtime isn’t featured prominently in Oracle Linux 8. In fact, unless you change the default configuration, a dnf search does not reveal the engine at all. For better or for worse, it appears the industry has been gradually switching from Docker to Podman and its related ecosystem.

Whilst most Docker commands can be translated 1:1 to the Podman world, some differences exist. Instead of highlighting all the changes here please have a look at the Podman User Guide.

Overview

This article explains how to create a network link between 2 containers:

  1. Oracle XE 21c
  2. SQLcl client

These containers are going to be run "rootless", which has a few implications. By default Podman will allocate storage for containers in ~/.local/share/containers/ so please ensure you have sufficient space in your home directory.

The article refers to Gerald Venzl’s Oracle-XE images and you will create another image for SQLcl.

Installation

If you haven’t already installed Podman you can do so by installing the container-tools:ol8 module:

[opc@podman ~]$ sudo dnf module install container-tools:ol8
Last metadata expiration check: 0:06:04 ago on Mon 21 Mar 2022 13:19:40 GMT.
Dependencies resolved.
========================================================================================================================
 Package                         Arch      Version                                           Repository            Size
========================================================================================================================
Installing group/module packages:
 buildah                         x86_64    1:1.23.1-2.0.1.module+el8.5.0+20494+0311868c      ol8_appstream        7.9 M
 cockpit-podman                  noarch    39-1.module+el8.5.0+20494+0311868c                ol8_appstream        483 k
 conmon                          x86_64    2:2.0.32-1.module+el8.5.0+20494+0311868c          ol8_appstream         55 k
 container-selinux               noarch    2:2.173.0-1.module+el8.5.0+20494+0311868c         ol8_appstream         57 k
 containernetworking-plugins     x86_64    1.0.1-1.module+el8.5.0+20494+0311868c             ol8_appstream         19 M
 containers-common               noarch    2:1-8.0.1.module+el8.5.0+20494+0311868c           ol8_appstream         62 k
 criu                            x86_64    3.15-3.module+el8.5.0+20416+d687fed7              ol8_appstream        518 k
 crun                            x86_64    1.4.1-1.module+el8.5.0+20494+0311868c             ol8_appstream        205 k
 fuse-overlayfs                  x86_64    1.8-1.module+el8.5.0+20494+0311868c               ol8_appstream         73 k
 libslirp                        x86_64    4.4.0-1.module+el8.5.0+20416+d687fed7             ol8_appstream         70 k
 podman                          x86_64    1:3.4.2-9.0.1.module+el8.5.0+20494+0311868c       ol8_appstream         12 M
 python3-podman                  noarch    3.2.1-1.module+el8.5.0+20494+0311868c             ol8_appstream        148 k
 runc                            x86_64    1.0.3-1.module+el8.5.0+20494+0311868c             ol8_appstream        3.1 M
 skopeo                          x86_64    2:1.5.2-1.0.1.module+el8.5.0+20494+0311868c       ol8_appstream        6.7 M
 slirp4netns                     x86_64    1.1.8-1.module+el8.5.0+20416+d687fed7             ol8_appstream         51 k
 udica                           noarch    0.2.6-1.module+el8.5.0+20494+0311868c             ol8_appstream         48 k
Installing dependencies:
 fuse-common                     x86_64    3.2.1-12.0.3.el8                                  ol8_baseos_latest     22 k
 fuse3                           x86_64    3.2.1-12.0.3.el8                                  ol8_baseos_latest     51 k
 fuse3-libs                      x86_64    3.2.1-12.0.3.el8                                  ol8_baseos_latest     95 k
 libnet                          x86_64    1.1.6-15.el8                                      ol8_appstream         67 k
 podman-catatonit                x86_64    1:3.4.2-9.0.1.module+el8.5.0+20494+0311868c       ol8_appstream        345 k
 policycoreutils-python-utils    noarch    2.9-16.0.1.el8                                    ol8_baseos_latest    252 k
 python3-pytoml                  noarch    0.1.14-5.git7dea353.el8                           ol8_appstream         25 k
 python3-pyxdg                   noarch    0.25-16.el8                                       ol8_appstream         94 k
 yajl                            x86_64    2.1.0-10.el8                                      ol8_appstream         41 k
Installing module profiles:
 container-tools/common                                                                                                
Enabling module streams:
 container-tools                           ol8                                                                         

Transaction Summary
========================================================================================================================
Install  25 Packages

If you like DNS on your container network, install podman-plugins and dnsmasq; this article assumes you do so. dnsmasq also needs to be enabled and started, as shown right after the installation command below.
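Installing both packages is a single dnf call (assuming the stock Oracle Linux 8 repositories):

[opc@podman ~]$ sudo dnf install -y podman-plugins dnsmasq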

[opc@podman ~]$ for task in enable start is-active; do sudo systemctl ${task} dnsmasq; done
active

If you see active in the output, as in the example, dnsmasq is working. If your system is part of a more elaborate setup, the use of dnsmasq is discouraged and you should ask your friendly network admin for advice.

Virtual Network Configuration

This section describes setting up a virtual network. That way you are emulating the way you’d previously have worked with Docker. If I find the time for it I’ll write a second article and introduce you to Podman’s pods, an elegant concept similar to Kubernetes pods that is not available with the Docker engine.

Network creation

Before containers can communicate with one another, they need to be told which network to use. The easiest way to do so is by creating a new, custom network as shown in this example:

[opc@podman ~]$ podman network create oranet
/home/opc/.config/cni/net.d/oranet.conflist
[opc@podman ~]$ podman network ls
NETWORK ID    NAME        VERSION     PLUGINS
2f259bab93aa  podman      0.4.0       bridge,portmap,firewall,tuning
4f4bfc6d2c15  oranet      0.4.0       bridge,portmap,firewall,tuning,dnsname
[opc@podman ~]$ 

As you can see, the new network, oranet, has been created and it is capable of using DNS thanks to the dnsname plugin. If you opted not to install podman-plugins and dnsmasq this feature won’t be available. Testing showed that availability of DNS on the container network made life a lot easier.

Storage Volumes

Containers are transient by nature; things you store in them are ephemeral by design. Since that’s not ideal for databases, a persistence layer should be used instead. The industry’s best known method to do so is by employing (Podman) volumes. Volumes are created using the podman volume create command, for example:

[opc@podman ~]$ podman volume create oradata
oradata

As is the case with container images, by default all the volume’s data will reside in ~/.local/share/containers.
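If you want to double-check where exactly a volume’s data ends up, podman volume inspect reports the mount point on the host (among other metadata):

[opc@podman ~]$ podman volume inspect oradata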

Database Secrets

The final step while preparing to run a database in Podman is to create a secret. Secrets are a relatively new feature in Podman and relieve you from having to use workarounds for passing sensitive data to containers. The Oracle XE containers to be used need to be initialised with a DBA password and it is prudent not to pass this in clear text on the command line.

For this example the necessary database password has been created as a secret and stored as oracle-password using podman secret create ...

[opc@podman ~]$ podman secret create oracle-password ~/.passwordFileToBeDeletedAfterUse
0c5d6d9eff16c4d30d36c6133
[opc@podman ~]$ podman secret ls
ID                         NAME             DRIVER      CREATED        UPDATED        
0c5d6d9eff16c4d30d36c6133  oracle-password  file        2 minutes ago  2 minutes ago 

This concludes the necessary preparations.

Let there be Containers

With all the setup completed the next step is to start an Oracle 21c XE instance and build the SQLcl container.

Oracle XE

Using the instructions from Gerald Venzl’s GitHub repository, adapted for this use case, a call to podman run might look like this:

[opc@podman ~]$ podman run --name oracle21xe --secret oracle-password \
-e ORACLE_PASSWORD_FILE=/run/secrets/oracle-password -d \
--net oranet -v oradata:/opt/oracle/oradata \
docker.io/gvenzl/oracle-xe:21-slim
5d94c0c3620f811bbe522273f73cbcb7c5210fecc0f88b0ecacc1f5474c0855a

The necessary flags are as follows:

  • --name assigns a name to the container so you can reference it later
  • --secret passes a named secret to the container, accessible in /run/secrets/oracle-password
  • -d tells the container to run in the background
  • --net defines the network the container should be attached to
  • -v maps the newly created volume to a directory in the container

You can check whether the container is up and running by executing podman ps:

[opc@podman ~]$ podman ps
CONTAINER ID  IMAGE                               COMMAND     CREATED         STATUS             PORTS       NAMES
5d94c0c3620f  docker.io/gvenzl/oracle-xe:21-slim              53 seconds ago  Up 54 seconds ago              oracle21xe
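Before connecting to the database you may want to give it time to finish initialising; following the container log is one way to watch the progress:

[opc@podman ~]$ podman logs -f oracle21xe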

Creating a small SQLcl container

Creating a container to run SQLcl is really quite straightforward. A suitable Dockerfile is shown here; please ensure you update ZIPFILE with the current SQLcl release.

FROM docker.io/openjdk:11

RUN useradd --comment "sqlcl owner" --home-dir /home/sqlcl --uid 1000 --create-home --shell $(which bash) sqlcl 

USER sqlcl
WORKDIR /home/sqlcl

ENV ZIPFILE=sqlcl-21.4.1.17.1458.zip

RUN curl -LO "https://download.oracle.com/otn_software/java/sqldeveloper/${ZIPFILE}" && \
        /usr/local/openjdk-11/bin/jar -xf ${ZIPFILE} && \
        rm ${ZIPFILE}

ENTRYPOINT ["bash", "/home/sqlcl/sqlcl/bin/sql", "/nolog"]

You could of course pull the latest sqlcl ZIP from https://download.oracle.com/otn_software/java/sqldeveloper/sqlcl-latest.zip. Using a named release should simplify the non-trivial task of naming ("tagging") your container image.

The image can be built using podman in much the same way Docker images are built:

[opc@podman ~]$ podman build . -t tools/sqlcl:21.4.1.17.1458

As you can see from the ENTRYPOINT, the container cannot be sent to the background (-d) by podman; it needs to be run interactively as you will see in the next section.

Linking Containers

The last step is to start the sqlcl container and connect to the database.

podman run --rm -it --name sqlcl --net oranet localhost/tools/sqlcl:21.4.1.17.1458

Here is an example of how this works in my environment:

[opc@podman ~]$ podman run --rm -it --name sqlcl --net oranet localhost/tools/sqlcl:21.4.1.17.1458


SQLcl: Release 21.4 Production on Mon Mar 21 13:35:05 2022

Copyright (c) 1982, 2022, Oracle.  All rights reserved.

SQL> connect system@oracle21xe/xepdb1
Password? (**********?) ***************
Connected.
SQL> show con_name
CON_NAME 
------------------------------
XEPDB1

The connection string consists of a username (system), the container name assigned as part of the call to podman run ... --name, and the service name. Thanks to the dnsname plugin and linking the container to the oranet network it is possible to address containers by name. XEPDB1 is the default name of the XE instance’s Pluggable Database.

Instead of connecting to a Pluggable Database it is of course possible to connect to the Container Database’s Root (CDB$ROOT).
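A hedged example, assuming the image’s default CDB service name of XE:

SQL> connect system@oracle21xe/xe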

Summary

Podman is largely compatible with Docker, easing the transition. In this part of the mini-series you read how to use Podman on Oracle Linux 8 to link a container running Oracle XE with a SQLcl container.

Configuring a VM using Ansible via the OCI Bastion Service

In my previous post I wrote about the creation of a Bastion Service using Terraform. As I’m incredibly lazy I prefer to configure the system pointed at by my Bastion Session with a configuration management tool. If you have followed my blog for a bit you might suspect that I’d use Ansible for that purpose. Of course I do! The question is: how do I configure the VM accessible via a Bastion Session?

Background

Please have a look at my previous post for a description of the resources created. In a nutshell, the Terraform code creates a Virtual Cloud Network (VCN). There is only one private subnet in the VCN. A small VM without direct access to the Internet resides in the private subnet. Another set of Terraform code creates a bastion session allowing me to connect to the VM.

I wrote this post on Ubuntu 20.04 LTS using ansible 4.8/ansible-core 2.11.6 by the way. From what I can tell these were current at the time of writing.

Connecting to the VM via a Bastion Session

The answer to “how does one connect to a VM via a Bastion Session?” isn’t terribly difficult once you know how. The clue to my solution is the SSH connection string shown by the Terraform output variable. It prints the contents of oci_bastion_session.demo_bastionsession.ssh_metadata.command:

$ terraform output
connection_details = "ssh -i <privateKey> -o ProxyCommand=\"ssh -i <privateKey> -W %h:%p -p 22 ocid1.bastionsession.oc1.eu-frankfurt-1.a...@host.bastion.eu-frankfurt-1.oci.oraclecloud.com\" -p 22 opc@10.0.2.39"

If I can connect to the VM via SSH I surely can do so via Ansible. As per the screen output above you can see that the connection to the VM relies on a proxy in the form of the bastion session. See man 5 ssh_config for details. Make sure to provide the correct SSH keys in both locations as specified in the Terraform code. I like to think of the proxy session as a jump host to my private VM (its internal IP is 10.0.2.39). And yes, I am aware of alternative options to SSH; the one shown above, however, is the most compatible (to my knowledge).

Creating an Ansible Inventory and running a playbook

Even though it’s not the most flexible option I’m a great fan of using Ansible inventories. The use of an inventory saves me from typing a bunch of options on the command line.

Translating the Terraform output into the inventory format, this is what worked for me:

[blogpost]
privateinst ansible_host=10.0.2.39 ansible_user=opc ansible_ssh_common_args='-o ProxyCommand="ssh -i ~/.oci/oci_rsa -W %h:%p -p 22 ocid1.bastionsession.oc1.eu-frankfurt-1.a...@host.bastion.eu-frankfurt-1.oci.oraclecloud.com"'
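With the inventory saved (as inventory.ini, matching the playbook invocation below) a quick ad-hoc ping verifies the connection works before attempting anything more elaborate:

$ ansible -i inventory.ini blogpost -m ping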

Let’s run some Ansible code! Consider this playbook:

- hosts: blogpost
  tasks:
  - name: say hello
    ansible.builtin.debug:
      msg: hello from {{ ansible_hostname }}

With the inventory set, it’s now possible to run the playbook:

$ ansible-playbook -vi inventory.ini blogpost.yml 
Using /tmp/ansible/ansible.cfg as config file

PLAY [blogpost] *********************************************************************************************************

TASK [Gathering Facts] **************************************************************************************************
ok: [privateinst]

TASK [say hello] ********************************************************************************************************
ok: [privateinst] => {}

MSG:

hello from privateinst

PLAY RECAP **************************************************************************************************************
privateinst                : ok=2    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

The playbook is of course very simple, but it can be easily extended. The tricky bit was establishing the connection, once the connection is established the sky is the limit!

Create an OCI bastion service via Terraform

Maintaining bastion hosts (a “jump box” or other network entry point directly exposed to the Internet) is somewhat frowned upon by security-conscious architects, for good reasons. In my opinion the only way to connect on-premises systems to the cloud is by means of a dedicated, low-latency/high-bandwidth, and most importantly well-secured link.

I never liked the idea of exposing systems to the Internet – too much can go wrong and you’d be surprised about the number of port-scans you see, followed by attempts at breaking in. Sometimes of course opening a system to the Internet is unavoidable: a website offering services to the public is quite secure if it cannot be reached but won’t generate a lot of revenue that way. Thankfully there are ways to expose such applications safely to the Internet, a topic that’s out of scope of this post though.

My very personal need for the bastion service

I create lots of demos using Oracle Cloud Infrastructure (OCI) and setting up a dedicated link isn’t always practical. The solution for me is to use Oracle’s bastion service. This way I can ensure time-based secure access to my resources in a private subnet. Most importantly there is no need to connect a VM directly to the Internet. And since it’s all fully automated it doesn’t cause any more work than a terraform apply followed by a terraform destroy when the demo has completed.

This blog post describes how I create a VCN with a private subnet containing a VM. The entire infrastructure is intended as a DEMO only. None of the resources will live longer than for the duration of a conference talk. Please don’t follow this approach if you would like to deploy systems in the cloud for > 45 minutes. Also be aware that it’s entirely possible for you to incur cost when calling terraform apply on the code. As always, the code will be available on GitHub.

Creating a Bastion Service

The bastion service is created by Terraform. Following the advice from the excellent Terraform Up and Running (2nd ed) I separated the resource creation into three directories:

  • Network
  • Compute
  • Bastion

To keep things reasonably simple I refrained from creating modules.

Directory layout

Please have a look at the book for more details about the directory structure. You’ll notice that I simplified the example a little.

$ tree .
.
├── bastionsvc
│   ├── main.tf
│   ├── terraform.tfstate
│   └── variables.tf
├── compute
│   ├── compute.tf
│   ├── main.tf
│   ├── outputs.tf
│   ├── terraform.tfstate
│   ├── terraform.tfstate.backup
│   └── variables.tf
├── network
│   ├── network.tf
│   ├── outputs.tf
│   ├── terraform.tfstate
│   ├── terraform.tfstate.backup
│   └── variables.tf
├── readme.md
└── variables.tf

I decided to split the network code into a generic section and the bastion service for reasons explained later.

Generic Network Code

The network code is responsible for creating the Virtual Cloud Network (VCN) including subnets, security lists, necessary gateways etc. When I initially used the bastion service I struggled a bit with Network Security Groups (NSG) and went with a security list instead. I guess I should re-visit that decision at some point.

The network must be created first. In addition to creating all the necessary infrastructure it exports output variables used by the compute and bastion code. These read the network’s remote state to get the necessary OCIDs.

Note that the choice of a remote data source has its drawbacks as described in the documentation. These don’t apply for my demos as I’m the only user of the code. And while I’m at it, using local state is acceptable only because I know I’m the only one using the code. Local state doesn’t necessarily work terribly well for team-development.
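For illustration, the remote state lookup in the compute and bastion code might look like the following sketch; the relative path to the network state file is an assumption based on the directory layout shown above:

data "terraform_remote_state" "network_state" {
  backend = "local"

  config = {
    path = "../network/terraform.tfstate"
  }
}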

Here are some key features of the network code. As these tend to go stale over time, have a look at the Github repository for the latest and greatest revision.

resource "oci_core_vcn" "vcn" {

  compartment_id = var.compartment_ocid
  cidr_block     = "10.0.2.0/24"
  defined_tags   = var.network_defined_tags
  display_name   = "demovcn"
  dns_label      = "demo"

}

# --------------------------------------------------------------------- subnet

resource "oci_core_subnet" "private_subnet" {

  cidr_block                 = var.private_sn_cidr_block
  compartment_id             = var.compartment_ocid
  vcn_id                     = oci_core_vcn.vcn.id
  defined_tags               = var.network_defined_tags
  display_name               = "private subnet"
  dns_label                  = "private"
  prohibit_public_ip_on_vnic = true
  prohibit_internet_ingress  = true
  route_table_id             = oci_core_route_table.private_rt.id
  security_list_ids          = [
    oci_core_security_list.private_sl.id
  ]
}

The security list allows SSH only from within the same subnet:

# --------------------------------------------------------------------- security list

resource "oci_core_security_list" "private_sl" {

  compartment_id = var.compartment_ocid
  vcn_id         = oci_core_vcn.vcn.id

...

  egress_security_rules {

    destination = var.private_sn_cidr_block
    protocol    = "6"

    description      = "SSH outgoing"
    destination_type = ""

    stateless = false
    tcp_options {

      max = 22
      min = 22

    }
  }

  ingress_security_rules {

    protocol = "6"
    source   = var.private_sn_cidr_block

    description = "SSH inbound"

    source_type = "CIDR_BLOCK"
    tcp_options {

      max = 22
      min = 22

    }

  }
}

The bastion service and its corresponding session are going to be created in the same private subnet as the compute instance for the sake of simplicity.

Compute Instance

The compute instance is created as a VM.Standard.E3.Flex shape with 2 OCPUs. There’s nothing too special about the resource, except maybe that I’m explicitly enabling the bastion plugin agent, a prerequisite for using the service.

resource "oci_core_instance" "private_instance" {
  agent_config {
    is_management_disabled = false
    is_monitoring_disabled = false

...

    plugins_config {
      desired_state = "ENABLED"
      name = "Bastion"
    }
  }

  defined_tags = var.compute_defined_tags

  create_vnic_details {
    
    assign_private_dns_record = true
    assign_public_ip = false
    hostname_label = "privateinst"
    subnet_id = data.terraform_remote_state.network_state.outputs.private_subnet_id
    nsg_ids = []
  }

...

Give it a couple of minutes for all agents to start.

Bastion Service

Once the VM’s bastion agent is up it is possible to create the bastion service:

resource "oci_bastion_bastion" "demo_bastionsrv" {

  bastion_type     = "STANDARD"
  compartment_id   = var.compartment_ocid
  target_subnet_id = data.terraform_remote_state.network_state.outputs.private_subnet_id

  client_cidr_block_allow_list = [
    var.local_laptop_id
  ]

  defined_tags = var.network_defined_tags

  name = "demobastionsrv"
}


resource "oci_bastion_session" "demo_bastionsession" {

  bastion_id = oci_bastion_bastion.demo_bastionsrv.id
  defined_tags = var.network_defined_tags
  
  key_details {
  
    public_key_content = var.ssh_bastion_key
  }

  target_resource_details {

    session_type       = "MANAGED_SSH"
    target_resource_id = data.terraform_remote_state.compute_state.outputs.private_instance_id

    target_resource_operating_system_user_name = "opc"
    target_resource_port                       = "22"
  }

  session_ttl_in_seconds = 3600

  display_name = "bastionsession-private-host"
}

output "connection_details" {
  value = oci_bastion_session.demo_bastionsession.ssh_metadata.command
}

The Bastion is set up in the private subnet created by the network code. Note that I’m defining the bastion’s client_cidr_block_allow_list specifically to only allow my external IP to access the service. The session is of type Managed SSH and thus requires a Linux host.

And this is all I can say about the creation of a bastion session in Terraform.

Terraform in action

Once all the resources have been created all I need to do is adapt the SSH command provided by my output variable shown here:

connection_details = "ssh -i <privateKey> -o ProxyCommand=\"ssh -i <privateKey> -W %h:%p -p 22 ocid1.bastionsession.oc1.eu-frankfurt-1.am...@host.bastion.eu-frankfurt-1.oci.oraclecloud.com\" -p 22 opc@10.0.2.94"

After adapting the SSH command I can connect to the instance.

$ ssh -i ...
The authenticity of host '10.0.2.94 (<no hostip for proxy command>)' can't be established.
ECDSA key fingerprint is SHA256:Ot...
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '10.0.2.94' (ECDSA) to the list of known hosts.
Activate the web console with: systemctl enable --now cockpit.socket

[opc@privateinst ~]$ hostname
privateinst
[opc@privateinst ~]$ logout

That’s it! I am connected to the instance and can experiment with my demo.
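If you connect more than once it can be convenient to capture the proxy setup in ~/.ssh/config instead of pasting the long command each time; here is a sketch with placeholders for the values elided above:

Host bastion-demo
    HostName 10.0.2.94
    User opc
    IdentityFile <privateKey>
    ProxyCommand ssh -i <privateKey> -W %h:%p -p 22 <bastion-session-ocid>@host.bastion.eu-frankfurt-1.oci.oraclecloud.com

With that in place, ssh bastion-demo behaves like the full command shown earlier.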

Another reason I love Terraform: when the demo has concluded I can simply tear down all resources with very few commands.

Deploying I/O intensive workloads in the cloud: Oracle Automatic Storage Management (ASM)

Over the past month I wrote a few posts about deploying I/O intensive workloads in the cloud. Using standard Linux tools, mainly Logical Volume Manager (LVM), I tried to prevent certain pitfalls from occurring. Although I’m a great fan of LVM and RAID (and their combination), there are situations where LVM/Software RAID aren’t the best solution. This is especially true when it comes to extending a VM’s storage configuration for an Oracle Database.

Striping, Mirroring and Risk

With LVM RAID (or LVM on top of Software RAID) it is possible to stripe an Oracle database (or any other I/O intensive workload) across multiple disks. At the risk of losing the RAID device (remember that RAID 0 offers exactly zero protection from disk failure) you can gain a performance advantage. The risk can be partially mitigated by using a proven, tested, and most importantly, rehearsed technique to still meet the RTO and RPO of the database.

The trouble with LVM RAID can potentially start as soon as you add more storage to the VM. I hope I managed to demonstrate the risk of I/O hotspots in my earlier posts.

Oracle’s ASM is different from stock Linux tools, and it’s much less of a general purpose solution. Being an Oracle product it is also subject to a different license model, which rules it out for most generic use cases, at least in my experience. If, however, you want to deploy an Oracle database in the cloud, it is well worth considering ASM. I don’t want to say it’s free of drawbacks (no piece of software is) but in my opinion its benefits outweigh the disadvantages when deploying a database.

For the sake of argument I’ll treat Oracle Restart and Grid Infrastructure as synonyms in this article. Oracle Restart is made up of ASM as well as a trimmed version of Oracle’s Clusterware as used in Real Application Clusters. Oracle Restart is installed into a separate Oracle Home; you usually install one database software home in addition. More on that later.

ASM vs LVM: a Question of Concepts

ASM has been around for quite some time and I like to think of it as a mature technology. In a way it is similar to LVM as you aggregate block devices (Physical Volumes in LVM) into Disk Groups (Volume Groups in LVM). Rather than creating another layer of abstraction on top of the ASM Disk Group as you do with LVM you simply point the database at a couple of Disk Groups and you are done. There is no need to maintain an equivalent of a Logical Volume or file system. A shorter code path to traverse tends to be less work. And it’s common knowledge that the fastest way to do something is not to do it in the first place. I should also point out that ASM does not perform I/O. It’s always the database session that does; otherwise ASM would never scale.

But what about protection from failure? Put very simply, in ASM you have a choice between striping and striping + mirroring. There are multiple so-called redundancy levels each with their own implications. If you are interested you can find the relevant details in Oracle’s Automatic Storage Management Administration Guide.

My Test Environment’s Setup

To keep things consistent with my previous posts I am installing Oracle Restart on my VM.Standard.E4.Flex VM in Oracle Cloud Infrastructure. Both Grid Infrastructure and database software are patched to 19.12.0, the current release at the time of writing. The underlying Linux version is 8.4 with kernel 5.4.17-2102.203.6.el8uek.x86_64. I decided to use UDEV rules for device name persistence and setting permissions rather than ASMLib or ASM Filter Driver. To keep things simple and also to follow the path I chose with my previous LVM/RAID posts I’m going to create the +DATA and +RECO Disk Groups with EXTERNAL redundancy. With external redundancy failure of a single block device in an ASM Disk Group will bring the entire Disk Group down, taking the database with it: game over. This is the same as with a RAID 0 configuration.

Again, and in line with the other posts about the topic, this article doesn’t concern itself with the durability of block devices in the cloud. External Redundancy should only be considered if approved in your organisation. You are most likely also required to put additional means in place to guarantee the database’s RTO and RPO. See my earlier comments and posts for details.

My +DATA disk group is currently made up of 2 block devices, +RECO consists of just 1 device. The database lives in +DATA with the Fast Recovery Area (FRA) located on +RECO.

SQL> select dg.name dg_name, dg.type, d.name disk_name, d.os_mb, d.path
  2   from v$asm_disk d join v$asm_diskgroup dg on (d.group_number = dg.group_number);

DG_NAME    TYPE   DISK_NAME       OS_MB PATH
---------- ------ ---------- ---------- ------------------------------
RECO       EXTERN RECO_0000      511998 /dev/oracleoci/oraclevde1
DATA       EXTERN DATA_0001      511998 /dev/oracleoci/oraclevdd1
DATA       EXTERN DATA_0000      511998 /dev/oracleoci/oraclevdc1

You can see from the volume sizes this is a lab/playground environment. The concepts however are independent of disk size. Just make sure the disks you use are of the same size and performance characteristics. Terraform is the most convenient way in the cloud to ensure they are.

Performance

Just as before I’ll start the familiar Swingbench workload. It isn’t meant to benchmark the system but to see which disks are in use. As in the previous examples I gave, Online Redo Logs aren’t multiplexed. This really is acceptable only in this scenario and shouldn’t be done with any serious deployments of the database. It helps me isolate I/O though, which is why I did it.

Before getting detailed I/O performance figures I need to check the current device mapping:

SQL> !ls -l /dev/oracleoci/oraclevd{c,d}1
lrwxrwxrwx. 1 root root 7 Sep  1 15:21 /dev/oracleoci/oraclevdc1 -> ../sdc1
lrwxrwxrwx. 1 root root 7 Sep  1 15:21 /dev/oracleoci/oraclevdd1 -> ../sdd1

Looking at the iostat output I can see both /dev/sdc and /dev/sdd actively used:

[oracle@oracle-19c-asm ~]$ iostat -xmz 5 3
Linux 5.4.17-2102.203.6.el8uek.x86_64 (oracle-19c-asm)  09/01/2021      _x86_64_        (16 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.19    0.00    0.26    0.12    0.01   98.43

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  ... %util
sda              1.12    1.03      0.04      0.03     0.01     0.54  ...  0.10
dm-0             1.03    0.95      0.03      0.03     0.00     0.00  ...  0.08
dm-1             0.02    0.60      0.00      0.01     0.00     0.00  ...  0.01
sdb              0.87    0.51      0.04      0.00     0.00     0.12  ...  0.09
dm-2             0.86    0.63      0.04      0.00     0.00     0.00  ...  0.09
sdc            291.58    4.87     54.15      0.05     3.51     0.01  ... 22.92
sdd            289.95    4.05     53.63      0.04     3.37     0.01  ... 19.01
sde              0.13    0.00      0.00      0.00     0.00     0.00  ...  0.01
sdf              0.10    0.72      0.00      0.01     0.00     0.00  ...  0.13

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           4.23    0.00    7.77   23.90    0.33   63.78

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  ... %util
sda              0.00    2.40      0.00      0.05     0.00     1.20  ...  0.12
dm-0             0.00    0.60      0.00      0.00     0.00     0.00  ...  0.08
dm-1             0.00    3.00      0.00      0.05     0.00     0.00  ...  0.04
sdb              0.00    0.40      0.00      0.00     0.00     0.00  ...  0.02
dm-2             0.00    0.40      0.00      0.00     0.00     0.00  ...  0.02
sdc           24786.60   67.40    211.80      0.57  2319.60     0.00 ... 100.00
sdd           24575.40   72.00    210.01      0.55  2302.80     0.00 ...  97.70
sdf              0.00    0.40      0.00      0.00     0.00     0.00  ...  0.06

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           4.74    0.00    7.65   24.38    0.31   62.93

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  ... %util
sda              0.00    1.80      0.00      0.02     0.00     0.20  ...  0.04
dm-0             0.00    1.20      0.00      0.02     0.00     0.00  ...  0.02
dm-1             0.00    0.80      0.00      0.01     0.00     0.00  ...  0.02
sdc           24684.20   61.60    215.14      0.50  2844.40     0.40 ... 100.00
sdd           24399.80   68.40    212.41      0.55  2787.20     0.60 ...  95.74
sdf              0.00    0.80      0.00      0.01     0.00     0.00  ...  0.10

This should demonstrate the fact that ASM stripes data across disks. Up to this point there isn’t any visible difference in the iostat output compared to my previous posts.

Extending Storage

The main difference between LVM/RAID and ASM is yet to come: what happens if I have to add storage to the +DATA disk group? Remember that with LVM you had to add as many additional devices as you had in use. In other words, if you used a RAID 0 consisting of 2 block devices, you need to add another 2. With ASM you don’t have the same restriction, as you will see in a minute.

I have added another block device to the VM, named /dev/oracleoci/oraclevdf, with the exact same size and performance characteristics as the existing 2 devices. After partitioning it and checking device permissions I can add it to the Disk Group. There are many ways to do so; I’m showing you the SQL interface.

[grid@oracle-19c-asm ~]$ sqlplus / as sysasm

SQL*Plus: Release 19.0.0.0.0 - Production on Thu Sep 2 06:21:08 2021
Version 19.12.0.0.0

Copyright (c) 1982, 2021, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.12.0.0.0

SQL> alter diskgroup data add disk '/dev/oracleoci/oraclevdf1' ; 

Diskgroup altered.

SQL>

The prompt returns immediately; however, an asynchronous operation is started in the background, a so-called rebalance task:

SQL> select dg.name, o.operation, o.state,o.sofar,o.est_work,o.est_minutes, o.error_code
  2   from v$asm_diskgroup dg join v$asm_operation o using (group_number)
  3  /

NAME                           OPERA STAT      SOFAR   EST_WORK EST_MINUTES ERROR_CODE
------------------------------ ----- ---- ---------- ---------- ----------- --------------------------------------------
DATA                           REBAL RUN       14608          0           0
DATA                           REBAL DONE          0          0           0
DATA                           REBAL DONE      33308      33308           0

Once completed, another disk has been added to the +DATA disk group:

SQL> select dg.name dg_name, dg.type, d.name disk_name, d.os_mb, d.path
  2   from v$asm_disk d join v$asm_diskgroup dg on (d.group_number = dg.group_number)
  3  where dg.name = 'DATA'
  4  /

DG_NAME    TYPE   DISK_NAME	  OS_MB PATH
---------- ------ ---------- ---------- ------------------------------
DATA	   EXTERN DATA_0002	 511998 /dev/oracleoci/oraclevdf1
DATA	   EXTERN DATA_0000	 511998 /dev/oracleoci/oraclevdc1
DATA	   EXTERN DATA_0001	 511998 /dev/oracleoci/oraclevdd1

SQL> 

The disk rebalance operation is an online operation by the way, with a few tunables such as the so-called power limit: you can trade off completion time vs the effect it has on ongoing I/O operations. For some time the maximum value of ASM’s power limit was 11 ;)
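Purely as an illustration (the power value chosen here is arbitrary), the power limit can be set explicitly when triggering a rebalance:

SQL> alter diskgroup data rebalance power 4;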

What does that mean for our Swingbench workload? Let’s have a look at iostat while the same workload is running. Please remember that /dev/oracleoci/oraclevd[cdf]1 are part of the ASM +DATA Disk Group:

[grid@oracle-19c-asm ~]$ ls -l /dev/oracleoci/oraclevd[cdf]1
lrwxrwxrwx. 1 root root 7 Sep  2 06:30 /dev/oracleoci/oraclevdc1 -> ../sdd1
lrwxrwxrwx. 1 root root 7 Sep  2 06:30 /dev/oracleoci/oraclevdd1 -> ../sdb1
lrwxrwxrwx. 1 root root 7 Sep  2 06:35 /dev/oracleoci/oraclevdf1 -> ../sdf1

Please bear this in mind when looking at the iostat output:

[grid@oracle-19c-asm ~]$ iostat -xmz 5 3
Linux 5.4.17-2102.203.6.el8uek.x86_64 (oracle-19c-asm) 	09/02/2021 	_x86_64_	(16 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.27    0.03    0.37    0.40    0.03   98.90

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   ...  %util
sda              4.92    1.21      0.14      0.08     0.03   ...   0.26
dm-0             4.53    0.68      0.13      0.07     0.00   ...   0.23
dm-1             0.12    0.75      0.00      0.01     0.00   ...   0.02
sdb            391.83    7.36     12.15      3.60    27.41   ...   6.90
sdc              0.15    0.71      0.00      0.01     0.00   ...   0.14
sdd            396.92    8.48     12.20      3.61    28.23   ...   6.85
sdf            383.58   13.97      3.22     10.71    27.53   ...   5.92
sde              3.74    0.85      0.19      0.01     0.00   ...   0.28
dm-2             3.75    1.02      0.19      0.01     0.00   ...   0.28

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           7.60    0.00   12.18   26.38    1.61   52.24

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   ...  %util
sda              0.00    0.40      0.00      0.00     0.00   ...   0.06
dm-0             0.00    0.40      0.00      0.00     0.00   ...   0.06
sdb           24375.60  176.80    203.25      1.39  1635.40  ...   97.62
sdc              0.00    0.80      0.00      0.01     0.00   ...   0.14
sdd           24654.60  172.40    205.89      1.45  1689.80  ...   99.96
sdf           24807.40  201.20    207.31      1.51  1718.20  ...   97.86
sde              0.00    1.00      0.00      0.01     0.00   ...   0.04
dm-2             0.00    1.20      0.00      0.01     0.00   ...   0.04

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           7.22    0.00   13.05   23.61    1.55   54.57

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   ...  %util
sda              0.00    0.60      0.00      0.00     0.00   ...   0.10
dm-0             0.00    0.40      0.00      0.00     0.00   ...   0.04
dm-1             0.00    0.20      0.00      0.00     0.00   ...   0.06
sdb           24783.40  145.40    212.17      1.15  2363.20  ...   97.48
sdc              0.00    0.60      0.00      0.00     0.00   ...   0.14
sdd           24795.40  113.60    213.19      1.00  2470.80  ...   99.90
sdf           24871.00  106.00    213.34      0.97  2426.00  ...   97.00
sde              0.00    2.40      0.00      0.02     0.00   ...   0.08
dm-2             0.00    2.60      0.00      0.02     0.00   ...   0.08

You can see that all 3 disks are more or less evenly used. This is the main difference from the use of LVM RAID. Thanks to the rebalance operation all data in the disk group is redistributed across the disks in the group.

Summary

When it comes to deploying an Oracle database in an Infrastructure as a Service (IaaS) scenario Oracle’s ASM offers lots of advantages over stock Linux tools. For example, it is possible to add storage to an ASM Disk Group as and when it’s needed without over-provisioning. ASM furthermore rebalances all data in the Disk Group across all disks as part of a configuration change, as you just saw. That way it is much harder to create the I/O hotspots I often see when ASM is not in use.

In addition to ASM you also get other amenities as a side effect. For example, Oracle Restart allows you to start databases and database services automatically when the system boots up. There is no need to write systemd unit files as it’s all done behind the covers. Should your database crash for some reason, provided it can, Oracle Restart automatically brings it up again without your intervention. It also works beautifully in conjunction with Oracle’s Universal Connection Pool (UCP) and Data Guard.

The use of ASM implies direct I/O. I said earlier that ASM doesn’t maintain a file system layer when used for the Oracle database (that’s not entirely correct but true for all the databases I saw) and as a result Linux can’t cache I/O. This is considered a good thing by most in the community. Oracle has its own buffer cache after all; as long as it’s sized appropriately for your workload, double-buffering isn’t the best use of precious DRAM.

So much for the plus side, but what about the implications of using Oracle Restart? First of all, it’s another Oracle software home you need to maintain. Given the high degree of automation possible these days that shouldn’t be an issue: an Ansible playbook patching all Oracle Restart components is easy enough to write.

If your organisation mandates a separation of duties between database and storage/Linux administration your respective administrator might need to learn a new technology.

I’m sure you can think of additional downsides to using ASM, and I admit I won’t delve deeper into the subject as I’m quite biased. ASM has been one of the truly outstanding innovations for running Oracle in my opinion. The human aspect of introducing a new technology, however, isn’t to be underestimated, and the best technology doesn’t always win the race.

Deploying I/O intensive workloads in the cloud: mdadm (aka Software) RAID

The final part of my “avoiding pitfalls with Linux Logical Volume Manager” (LVM) series considers software RAID on Oracle Linux 8 as the basis for your LVM’s Physical Volume (PV). It’s still the very same VM.Standard.E4.Flex running Oracle 19.12.0 on top of Oracle Linux 8.4 with UEK6 (5.4.17-2102.203.6.el8uek.x86_64) I used for creating the earlier posts.

Previous articles in this series cover the LVM and LVM RAID configurations this post builds on.

Storage Configuration

Rather than using LVM RAID as in the previous article, the plan this time is to create a software RAID (pseudo-)device and use it as a Physical Volume. This is exactly what I did before I learned about LVM RAID. Strictly speaking, it isn’t necessary to create a Volume Group on top of a RAID device as you can absolutely use such a device on its own. Having said that, growing a RAID 0 device doesn’t seem possible based on my limited time studying the documentation. Speaking of which: you can read more about software RAID in Red Hat Enterprise Linux 8 here.

In this post I’ll demonstrate how you could use a RAID 0 device for striping data across multiple disks. Please don’t implement the steps in this article unless software RAID is an approved solution in your organisation and you are aware of the implications. Kindly note this article does not concern itself with the durability of block devices in the cloud. In the cloud, you have a lot less control over the block devices you get, so make sure you have appropriate protection methods in place to guarantee your databases’ RTO and RPO. RAID 0 offers 0 protection from disk failure (it’s in the name ;), so as soon as you lose a disk from your software RAID, it’s game over.

Creating the RAID Device

The first step is to create the RAID device. For nostalgic reasons I named it /dev/md127; other sources name their devices /dev/md0. Not that it matters too much.

[opc@oracle-19c-fs ~]$ sudo mdadm --create /dev/md127 --level=0 \
> --raid-devices=2 /dev/oracleoci/oraclevdc1 /dev/oracleoci/oraclevdd1
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md127 started.
[opc@oracle-19c-fs ~]$ 

As you can see from the output above mdadm created the device for me. If you wondered what the funny device names imply, have a look at an earlier post I wrote about device name persistence in OCI.

You can always use mdadm --detail to get all the interesting details from a RAID device:

[opc@oracle-19c-fs ~]$ sudo mdadm --detail /dev/md127
/dev/md127:
           Version : 1.2
     Creation Time : Fri Aug  6 14:15:12 2021
        Raid Level : raid0
        Array Size : 524019712 (499.74 GiB 536.60 GB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent

       Update Time : Fri Aug  6 14:15:12 2021
             State : clean 
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 0

            Layout : -unknown-
        Chunk Size : 512K

Consistency Policy : none

              Name : oracle-19c-fs:127  (local to host oracle-19c-fs)
              UUID : 30dc8f99...
            Events : 0

    Number   Major   Minor   RaidDevice State
       0       8       33        0      active sync   /dev/sdc1
       1       8       49        1      active sync   /dev/sdd1
[opc@oracle-19c-fs ~]$  

This is looking good: both devices are available and no errors have occurred.
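One optional step not shown here is recording the array in /etc/mdadm.conf so it is assembled consistently across reboots; a hedged sketch using the configuration file location common to Oracle Linux and RHEL:

[opc@oracle-19c-fs ~]$ sudo mdadm --detail --scan | sudo tee -a /etc/mdadm.conf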

Creating oradata_vg

With the future PV available it’s time to create the Volume Group and the Logical Volumes (LV) for the database and Fast Recovery Area. I’m listing the steps here for later reference, although they are the same as in part 1 of this article.

[opc@oracle-19c-fs ~]$ #
[opc@oracle-19c-fs ~]$ # step 1) create the PV
[opc@oracle-19c-fs ~]$ sudo pvcreate /dev/md127
  Physical volume "/dev/md127" successfully created.

[opc@oracle-19c-fs ~]$ #
[opc@oracle-19c-fs ~]$ # step 2) create the VG
[opc@oracle-19c-fs ~]$ sudo vgcreate oradata_vg /dev/md127
  Volume group "oradata_vg" successfully created

[opc@oracle-19c-fs ~]$ #
[opc@oracle-19c-fs ~]$ # step 3) create the first LV
[opc@oracle-19c-fs ~]$ sudo lvcreate --extents 80%FREE --name oradata_lv oradata_vg 
  Logical volume "oradata_lv" created.

[opc@oracle-19c-fs ~]$ #
[opc@oracle-19c-fs ~]$ # step 4) create the second LV
[opc@oracle-19c-fs ~]$ sudo lvcreate --extents 100%FREE --name orareco_lv oradata_vg 
  Logical volume "orareco_lv" created.

The end result is two LVs in oradata_vg:

[opc@oracle-19c-fs ~]$ sudo lvs oradata_vg
  LV         VG         Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  oradata_lv oradata_vg -wi-a----- 399.79g                                                    
  orareco_lv oradata_vg -wi-a----- <99.95g   

That’s it! The LVs still require file systems before they can be mounted; I didn’t capture that step in my terminal output, but a quick sketch follows.
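
In case you’d like a starting point, here’s a minimal sketch of that step, assuming XFS and the /u01/oradata and /u01/fra mount points (the latter is an assumption; adjust the paths to your standards and don’t forget matching /etc/fstab entries):

[opc@oracle-19c-fs ~]$ sudo mkfs.xfs /dev/mapper/oradata_vg-oradata_lv
[opc@oracle-19c-fs ~]$ sudo mkfs.xfs /dev/mapper/oradata_vg-orareco_lv
[opc@oracle-19c-fs ~]$ sudo mkdir -p /u01/oradata /u01/fra
[opc@oracle-19c-fs ~]$ sudo mount /dev/mapper/oradata_vg-oradata_lv /u01/oradata
[opc@oracle-19c-fs ~]$ sudo mount /dev/mapper/oradata_vg-orareco_lv /u01/fra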

Trying it out

After the final touches had been applied I restored the database and started the familiar Swingbench workload to see which disks are in use. Right before I did that I made sure, for test purposes only, that I’m not multiplexing control files/online redo logs into the FRA. NOT multiplexing control file/online redo log members is probably a Bad Idea for serious Oracle deployments but is ok for this scenario.

I am expecting to see both block devices making up /dev/md127 used. And sure enough, they are:

[opc@oracle-19c-fs ~]$ iostat -xmz 5 3
Linux 5.4.17-2102.203.6.el8uek.x86_64 (oracle-19c-fs)   13/08/21        _x86_64_        (16 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.23    0.01    0.35    0.57    0.01   98.83

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  ...  %util
sda              2.99    0.96      0.08      0.04     0.03     0.26  ...   0.21
dm-0             2.78    0.62      0.07      0.03     0.00     0.00  ...   0.20
dm-1             0.06    0.58      0.00      0.01     0.00     0.00  ...   0.02
sdb              1.28    0.22      0.06      0.00     0.00     0.02  ...   0.13
dm-2             1.26    0.24      0.06      0.00     0.00     0.00  ...   0.13
sdc            753.52   26.38      8.37      5.64    30.91     0.29  ...   7.36
md127         1573.79   53.30     17.44     12.01     0.00     0.00  ...   0.00
sdd            758.09   26.57      8.42      5.64    31.29     0.05  ...   9.34
sde             20.53    0.00      5.11      0.00     0.00     0.00  ...   1.79
dm-3            20.51    0.00      5.11      0.00     0.00     0.00  ...   1.79
dm-4          1558.54   28.25     12.20      5.97     0.00     0.00  ...   6.56
dm-5             4.69    2.61      4.58      5.26     0.00     0.00  ...   4.15

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           4.08    0.00    5.32    9.48    0.13   80.99

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  ... %util
sda              0.00    3.40      0.00      0.03     0.00     0.60  ...  0.08
dm-0             0.00    2.60      0.00      0.02     0.00     0.00  ...  0.08
dm-1             0.00    1.40      0.00      0.01     0.00     0.00  ...  0.04
sdc           16865.80  284.60    140.04      2.39  1059.60     0.20 ...  92.60
md127         36008.00  564.20    281.33      4.76     0.00     0.00 ...   0.00
sdd           16978.80  279.40    141.11      2.34  1081.40     0.00 ...  99.96
dm-4          36007.80  563.00    281.33      4.73     0.00     0.00 ... 100.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           4.07    0.00    5.51   10.52    0.16   79.74

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  ... %util
sdb              0.00    0.80      0.00      0.01     0.00     0.20  ...  0.04
dm-2             0.00    1.00      0.00      0.01     0.00     0.00  ...  0.04
sdc           17709.80  317.80    142.87      2.51   577.40     0.00 ...  93.90
md127         36657.80  661.60    286.41      5.31     0.00     0.00 ...   0.00
sdd           17790.00  343.40    143.69      2.77   599.00     0.00 ...  99.94
dm-4          36657.80  660.20    286.41      5.28     0.00     0.00 ... 100.00

[opc@oracle-19c-fs ~]$ 

No surprises here! Except maybe that /dev/md127 is reported as barely utilised ;) I guess that’s an instrumentation bug/feature. /dev/dm-4 – showing 100% utilisation – belongs to oradata_lv:

[opc@oracle-19c-fs ~]$ ls -l /dev/mapper | egrep dm-4
lrwxrwxrwx. 1 root root       7 Aug 13 09:37 oradata_vg-oradata_lv -> ../dm-4

Extending oradata_vg

Just as with each previous example I’d like to see what happens when I run out of space and have to extend oradata_vg. For this to happen I need a couple more block devices. These have to match the existing ones in size and performance characteristics for the best result. No difference to LVM-RAID I covered in the earlier article.
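
Creating the second RAID device and its Physical Volume follows exactly the same pattern as before. A sketch, assuming the new partitions appeared as /dev/oracleoci/oraclevde1 and /dev/oracleoci/oraclevdf1 (yours will almost certainly differ):

[opc@oracle-19c-fs ~]$ sudo mdadm --create /dev/md128 --level=0 \
> --raid-devices=2 /dev/oracleoci/oraclevde1 /dev/oracleoci/oraclevdf1
[opc@oracle-19c-fs ~]$ sudo pvcreate /dev/md128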

With /dev/md128 created and registered as a Physical Volume, oradata_vg looked like this prior to its extension:

[opc@oracle-19c-fs ~]$ sudo vgs oradata_vg
  VG         #PV #LV #SN Attr   VSize   VFree
  oradata_vg   1   2   0 wz--n- 499.74g    0 

In the next step I extended the Volume Group, but only after ensuring I had a proven, working backup of everything. Don’t ever make changes to the storage layer without a backup and a known, tested, proven way to recover from unforeseen issues!

[opc@oracle-19c-fs ~]$ sudo vgextend oradata_vg /dev/md128
  Volume group "oradata_vg" successfully extended
[opc@oracle-19c-fs ~]$ sudo vgs oradata_vg
  VG         #PV #LV #SN Attr   VSize   VFree  
  oradata_vg   2   2   0 wz--n- 999.48g 499.74g

The VG now shows 2 PVs and plenty of free space. So let’s add 80% of the free space to oradata_lv.

[opc@oracle-19c-fs ~]$ sudo lvresize --extents +80%FREE --resizefs /dev/mapper/oradata_vg-oradata_lv
  Size of logical volume oradata_vg/oradata_lv changed from 399.79 GiB (102347 extents) to <799.59 GiB (204695 extents).
  Logical volume oradata_vg/oradata_lv successfully resized.
meta-data=/dev/mapper/oradata_vg-oradata_lv isize=512    agcount=16, agsize=6550144 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1
data     =                       bsize=4096   blocks=104802304, imaxpct=25
         =                       sunit=128    swidth=256 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=51173, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
data blocks changed from 104802304 to 209607680

The LV changes from its original size …

[opc@oracle-19c-fs ~]$ sudo lvs /dev/mapper/oradata_vg-oradata_lv
  LV         VG         Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert                                                         
  oradata_lv oradata_vg -wi-ao---- 399.79g

to its new size:

[opc@oracle-19c-fs ~]$ sudo lvs /dev/mapper/oradata_vg-oradata_lv
  LV         VG         Attr       LSize    Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  oradata_lv oradata_vg -wi-ao---- <799.59g                                                    

The same applies to the file system as well:

[opc@oracle-19c-fs ~]$ df -h /u01/oradata
Filesystem                         Size  Used Avail Use% Mounted on
/dev/mapper/oradata_vg-oradata_lv  800G   38G  762G   5% /u01/oradata

Does that change performance?

Based on my experience with LVM-RAID I did not expect a change in performance, as my database wasn’t yet at a stage where it required the extra space. My assumption was confirmed by iostat:

[opc@oracle-19c-fs ~]$ iostat -xmz 5 3
Linux 5.4.17-2102.203.6.el8uek.x86_64 (oracle-19c-fs)   13/08/21        _x86_64_        (16 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.98    0.01    1.44    2.35    0.03   95.18

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  ... %util
sda              2.32    0.99      0.06      0.03     0.02     0.27  ...  0.17
dm-0             2.16    0.61      0.06      0.03     0.00     0.00  ...  0.16
dm-1             0.05    0.62      0.00      0.01     0.00     0.00  ...  0.02
sdb              0.99    0.20      0.05      0.00     0.00     0.02  ...  0.11
dm-2             0.98    0.22      0.04      0.00     0.00     0.00  ...  0.11
sdc           4538.44   73.12     38.69      4.78   190.85     0.23  ... 26.27
md127         9485.50  147.14     78.09     10.13     0.00     0.00  ...  0.00
sdd           4562.89   73.73     38.90      4.79   193.25     0.04  ... 29.88
sde             15.87    0.00      3.95      0.00     0.00     0.00  ...  1.39
dm-3            15.86    0.00      3.95      0.00     0.00     0.00  ...  1.39
dm-4          9473.71  127.63     74.04      5.46     0.00     0.00  ... 27.74
dm-5             3.63    2.02      3.54      4.07     0.00     0.00  ...  3.21
sdf              0.07    0.00      0.00      0.00     0.00     0.01  ...  0.01
sdg              0.08    0.00      0.00      0.00     0.00     0.01  ...  0.00
md128            0.06    0.02      0.00      0.00     0.00     0.00  ...  0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.96    0.00    5.44    8.52    0.08   82.00

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  ... %util
sdc           17652.60  306.80    141.15      2.52   414.40     0.00 ...  88.78
md127         36265.40  608.00    283.35      5.01     0.00     0.00 ...   0.00
sdd           17783.60  301.20    142.17      2.43   411.60     0.00 ... 100.00
dm-4          36267.40  607.00    283.37      4.95     0.00     0.00 ... 100.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           4.20    0.00    5.45    8.82    0.14   81.38

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  ... %util
sda              0.00    1.20      0.00      0.01     0.00     0.00  ...  0.04
dm-0             0.00    1.00      0.00      0.01     0.00     0.00  ...  0.04
dm-1             0.00    0.20      0.00      0.00     0.00     0.00  ...  0.02
sdc           18145.40  332.20    143.99      2.55   284.40     0.00 ...  92.22
md127         36865.20  650.20    288.04      5.00     0.00     0.00 ...   0.00
sdd           18161.20  318.00    144.14      2.45   285.20     0.00 ...  99.98
dm-4          36863.20  649.00    288.02      4.99     0.00     0.00 ...  99.98

[opc@oracle-19c-fs ~]$ 

As long as there aren’t any database files in the “extended” part of the LV, there won’t be a change in performance. As soon as your database spills over to the “new” disks, you should see a benefit from the newly added /dev/md128.

Summary

Just as LVM-RAID does, using software RAID allows you to benefit from striping data across multiple devices. The iostat output is quite clear about the benefit: just look at the figures for /dev/sdc and /dev/sdd and how they accumulate in /dev/md127.

Using software RAID doesn’t come without risk: it’s entirely possible to lose a block device and thus the entire RAID device. It’s imperative you protect against this scenario in a way that matches your database’s RTO and RPO.

My main problem with the solution as detailed in this post is the lack of the re-balance feature you get with Oracle’s Automatic Storage Management (ASM). It’s still possible to have I/O hotspots after a storage space expansion.

Install the Oracle Cloud Infrastructure CLI on Ubuntu 20.04 LTS

This is a short post on how to install/configure the Oracle Cloud Infrastructure (OCI) Command Line Interface (CLI) on Ubuntu 20.04 LTS. On a couple of my machines I noticed the default Python 3 interpreter is 3.8.x, so I’ll stick with this version. I used the manual installation; users with higher security requirements might want to consider the offline installation.

Creating a virtual environment

The first step is to create a virtual environment to prevent the OCI CLI’s dependencies from messing up my python installation.

[martin@ubuntu: python]$ mkdir -p ~/development/python && cd ~/development/python
[martin@ubuntu: python]$ python3 -m venv oracle-cli

If this command throws an error you may have to install the venv module first via sudo apt install python3.8-venv.

With the venv in place you need to activate it. This is a crucial step, so don’t forget to run it:

[martin@ubuntu: python]$ source oracle-cli/bin/activate
(oracle-cli) [martin@ubuntu: python]$ 

As soon as the venv is activated you’ll notice its name has become a part of the prompt.

Downloading the OCI CLI

The next step is to download the latest OCI CLI release from GitHub. At the time of writing version 3.0.2 was the most current. Ensure you download the vanilla release, e.g. oci-cli-release.zip, not one of the distribution-specific ones; those are meant for the offline installation.

(oracle-cli) [martin@ubuntu: python]$ curl -L "https://github.com/oracle/oci-cli/releases/download/v3.0.2/oci-cli-3.0.2.zip" -o /tmp/oci-cli-3.0.2.zip
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   623  100   623    0     0   2806      0 --:--:-- --:--:-- --:--:--  2793
100 52.4M  100 52.4M    0     0  5929k      0  0:00:09  0:00:09 --:--:-- 6311k
(oracle-cli) [martin@ubuntu: python]$ 

Unzip the release in a temporary location and begin the installation by invoking pip with the “whl” file in the freshly unzipped directory. Just to be safe I always double-check I’m using the pip executable from the virtual environment before proceeding.
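
The unzip step itself isn’t shown in my terminal history; it boils down to something like the following. The target directory is an assumption chosen to match the pip command below, so double-check where the “whl” files actually end up:

(oracle-cli) [martin@ubuntu: python]$ unzip -q /tmp/oci-cli-3.0.2.zip -d /tmp/oci-cli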

(oracle-cli) [martin@ubuntu: python]$ which pip
/home/martin/development/python/oracle-cli/bin/pip
(oracle-cli) [martin@ubuntu: python]$ pip install /tmp/oci-cli/oci_cli-3.0.2-py3-none-any.whl 
Processing /tmp/oci-cli/oci_cli-3.0.2-py3-none-any.whl
Collecting arrow==0.17.0
  Downloading arrow-0.17.0-py2.py3-none-any.whl (50 kB)
     |████████████████████████████████| 50 kB 2.7 MB/s 
...

You’ll notice additional packages are pulled into the virtual environment by the setup routine. As always, exercise care when using external packages. An offline installation is available as well if your security requirements mandate it.

At the end of the process you have a working installation of the command line interface.
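
A quick smoke test never hurts: oci --version should report the release you just installed (3.0.2 in my case), confirming the CLI is picked up from the virtual environment.

(oracle-cli) [martin@ubuntu: python]$ oci --version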

Configuration

Before you can use the CLI you need to provide a configuration file. The default location is ~/.oci, which I’ll use as well.

(oracle-cli) [martin@ubuntu python]$ mkdir ~/.oci && cd ~/.oci

Inside this directory you need to create a config file; the example below is taken from the documentation and should provide a starting point.

[DEFAULT]
user=ocid1.user.oc1..<unique_ID>
fingerprint=<your_fingerprint>
key_file=~/.oci/oci_api_key.pem
tenancy=ocid1.tenancy.oc1..<unique_ID>
region=us-ashburn-1

Make sure to update the values accordingly. Should you be unsure about the user OCID and/or API signing key to use, have a look at the documentation for instructions. The next time you invoke the CLI the DEFAULT configuration will be used unless you specify otherwise. It is possible to add multiple configurations (profiles) using the old Windows 3.11 .ini file format.

[DEFAULT]
user=...

[ANOTHERUSER]
user=...
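
Selecting anything other than the DEFAULT profile is then a matter of passing --profile to the CLI. For example, fetching the object storage namespace is a cheap way to test a profile’s connectivity:

oci --profile ANOTHERUSER os ns get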

Note that it’s strongly discouraged to store a potential passphrase (used for the API key) in the configuration file!

Happy Automation!

Deploying I/O intensive workloads in the cloud: LVM RAID

I recently blogged about a potential pitfall when deploying the Oracle database on LVM (Logical Volume Manager) with its default allocation policy. I promised a few more posts detailing how to potentially mitigate the effect of linear allocation in LVM. This post uses the same setup as the previous article: an Oracle 19.12.0 database deployed on Oracle Linux 8.4 with UEK6 on a VM.Standard.E4.Flex cloud system.

If you found this article via a search engine, please note there are a few more posts about this topic on this blog.

LVM RAID

In this post I’ll demonstrate how you could use LVM RAID level 0. Please don’t implement the steps in this article unless software (LVM-)RAID is an approved solution in your organisation and you are aware of the implications. Please note this article does not concern itself with the durability of block devices in the cloud. In the cloud, you have a lot less control over the block devices you get, so make sure you have appropriate protection methods in place to guarantee your databases’ RTO and RPO.

I found a hint in the SUSE Linux Enterprise Server 15 documentation recommending the use of software RAID over LVM RAID. I’ll leave that here as I don’t have sufficient information to confirm or deny that statement. I didn’t find a comparable warning in the Red Hat 8 documentation.

Implementing LVM RAID 0

The basics of LVM RAID levels are described in lvmraid(7):

lvm(8) RAID is a way to create a Logical Volume (LV) that uses multiple physical devices to improve performance or tolerate device failures. In LVM, the physical devices are Physical Volumes (PVs) in a single Volume Group (VG).

man 7 lvmraid

This is interesting; I hadn’t really been aware of this (not exactly new) development. Previously I created a software RAID pseudo-device first and used it as a Physical Volume in my LVM configuration. So instead of using a block device’s partition as a PV, I used the device created by mdadm (/dev/md0 for example). Let’s try the new way!

There were no changes required to oradata_vg on my Oracle Linux 8.4 system. The Logical Volume, however, was created differently. After struggling with the exact syntax for a bit I ended up with this command:

[opc@oracle-19c-fs ~]$ sudo lvcreate --type raid0 --extents 511998 --name oradata_lv \
> --stripesize 1m oradata_vg

Note that RAID 0 offers exactly zero protection against disk failure. You need to ensure you have other means in place to guarantee your database’s RTO and RPO! It took me a little while to get the syntax for LVM RAID 0 right. The optional parameter --stripesize “specifies the Size of each stripe in kilobytes. This is the amount of data that is written to one device before moving to the next.” I’m unsure if 1 MB is the right value; I probably need to experiment with this a little more.

In the next step I created an XFS file system on top of oradata_lv and mounted the new file system in /u01/oradata for use with the database.
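
Those commands aren’t shown here; they boil down to something along these lines (accepting the mkfs.xfs defaults, plus an /etc/fstab entry if the mount should survive a reboot):

[opc@oracle-19c-fs ~]$ sudo mkfs.xfs /dev/mapper/oradata_vg-oradata_lv
[opc@oracle-19c-fs ~]$ sudo mkdir -p /u01/oradata
[opc@oracle-19c-fs ~]$ sudo mount /dev/mapper/oradata_vg-oradata_lv /u01/oradata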

The output of my lvs command changed quite a bit compared to what it was before:

[opc@oracle-19c-fs ~]$ sudo lvs --all --options name,copy_percent,devices,attr oradata_vg
  LV                    Cpy%Sync Devices                                       Attr      
  oradata_lv                     oradata_lv_rimage_0(0),oradata_lv_rimage_1(0) rwi-aor---
  [oradata_lv_rimage_0]          /dev/sdc1(0)                                  iwi-aor---
  [oradata_lv_rimage_1]          /dev/sde1(0)                                  iwi-aor---
[opc@oracle-19c-fs ~]$

The above output is specific to LVM RAID 0; higher RAID levels feature *_rmeta sub-volumes in addition to the *_rimage sub-volumes above. Since I’m not planning on converting from RAID 0 to a higher RAID level I don’t need to concern myself with a meta image in this configuration. See lvmraid(7) for a more thorough description of LVM Sub-Volumes.

Since RAID 0 doesn’t offer any protection from disk failure it doesn’t have to wait for any synchronisation to be completed before making the volume available.

Disk Performance LVM RAID 0

After I finished the restore of my database to the newly created LVM RAID 0 mount point I ran the same Swingbench workload as before, still using the ridiculously small SGA to force physical I/O. As in the previous article, the aim wasn’t to see what the configuration is capable of; I wanted to find out more about disk utilisation.

This time iostat showed multiple busy devices:

[opc@oracle-19c-fs ~]$ iostat -xmz 5 3
Linux 5.4.17-2102.203.6.el8uek.x86_64 (oracle-19c-fs)   06/08/21        _x86_64_        (16 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.74    0.00    1.14    5.43    0.01   90.68

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  ...  %util
sda              0.27    0.85      0.01      0.02     0.00     0.49  ...   0.05
dm-0             0.27    0.77      0.01      0.01     0.00     0.00  ...   0.04
dm-1             0.00    0.57      0.00      0.01     0.00     0.00  ...   0.01
sdb              0.11    0.11      0.00      0.00     0.00     0.02  ...   0.02
sdc            993.21   14.94     14.18      0.28     0.00     0.02  ...  15.28
dm-2             0.11    0.13      0.00      0.00     0.00     0.00  ...   0.02
sdd              0.25    4.95      0.24      0.35     0.00     0.01  ...   1.63
dm-3             0.25    4.95      0.24      0.35     0.00     0.00  ...   1.63
sde           1013.79  424.90     15.25      3.79     0.00     0.04  ...  25.97
dm-4           991.89   14.54     14.16      0.27     0.00     0.00  ...  15.12
dm-5           992.43   14.65     14.19      0.27     0.00     0.00  ...  15.13
dm-6          1984.31   29.19     28.35      0.54     0.00     0.00  ...  15.25

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.30    0.00    2.93   29.68    0.03   66.06

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  ... %util
sdc           7210.60  119.80     56.23      0.90     0.00     0.00  ... 99.60
sdd              0.00   24.60      0.00      0.10     0.00     0.00  ...  7.60
dm-3             0.00   24.60      0.00      0.10     0.00     0.00  ...  7.60
sde           7204.80  102.60     56.20      0.82     0.00     0.00  ... 99.74
dm-4          7209.20  119.60     56.22      0.90     0.00     0.00  ... 99.60
dm-5          7205.40  102.60     56.21      0.82     0.00     0.00  ... 99.76
dm-6          14414.60  222.20    112.43      1.72     0.00     0.00 ... 100.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.11    0.00    2.92   27.86    0.01   67.10

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  ... %util
sdc           6771.60  103.60     52.81      0.62     0.00     0.00  ... 99.82
sdd              0.00   61.80      0.00      0.22     0.00     0.00  ... 18.02
dm-3             0.00   62.00      0.00      0.22     0.00     0.00  ... 18.02
sde           6806.20   45.80     53.09      0.49     0.00     0.00  ... 99.94
dm-4          6771.40  103.60     52.80      0.62     0.00     0.00  ... 99.82
dm-5          6806.00   45.80     53.09      0.49     0.00     0.00  ... 99.94
dm-6          13577.40  149.40    105.89      1.10     0.00     0.00 ... 100.00

In the above output, /dev/sdc1 and /dev/sde1 are part of oradata_vg, hosting the database. I still didn’t multiplex control files and online redo logs to ensure all I/O is reported against oradata_vg. At the risk of repeating myself: not multiplexing control file/online redo log members might not be a good idea for serious Oracle deployments.

But what about /dev/dm-{4,5,6}? Why are there suddenly so many Device-Mapper devices in the above iostat output?

[opc@oracle-19c-fs ~]$ ls -l /dev/mapper | grep dm-[4-6]
lrwxrwxrwx. 1 root root       7 Aug  6 08:17 oradata_vg-oradata_lv -> ../dm-6
lrwxrwxrwx. 1 root root       7 Aug  6 08:15 oradata_vg-oradata_lv_rimage_0 -> ../dm-4
lrwxrwxrwx. 1 root root       7 Aug  6 08:15 oradata_vg-oradata_lv_rimage_1 -> ../dm-5
[opc@oracle-19c-fs ~]$ 

These match the previous output of the lvs command: Device-Mapper devices 4, 5 and 6 all belong to oradata_vg. Using the iostat output it should be apparent that more than one block device is used by the database; striping seems to be working fine.

What happens to performance when you extend the VG?

Assuming you run out of storage on your Volume Group, what next? With linear allocation it’s a no-brainer: ensure the presence of a backup, then add another Physical Volume to the Volume Group and resize the Logical Volume plus file system; capacity is increased immediately.

With LVM RAID 0 the story is a little different. According to the Red Hat 8 documentation it is possible to run lvresize on a striped LV provided the same number of stripes as originally present is added to the Volume Group. On my system I originally used 2 block devices = 2 stripes in oradata_vg. Adding a couple more of the same size and performance characteristics allows me to resize the Logical Volume after I ensured I had a proven and tested backup of all data depending on oradata_vg:

[opc@oracle-19c-fs ~]$ sudo lvresize --extents +461996 --resizefs /dev/mapper/oradata_vg-oradata_lv
  Using stripesize of last segment 1.00 MiB                                 
  Size of logical volume oradata_vg/oradata_lv changed from 2.14 TiB (561998 extents) to <3.91 TiB (1023994 extents)
  Logical volume oradata_vg/oradata_lv successfully resized.
meta-data=/dev/mapper/oradata_vg-oradata_lv isize=512    agcount=33, agsize=16382976 blks          
         =                       sectsz=4096  attr=2, projid32bit=1                                   
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1                                 
data     =                       bsize=4096   blocks=524285952, imaxpct=5                          
         =                       sunit=1024   swidth=2048 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1                              
log      =internal log           bsize=4096   blocks=255999, version=2                             
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
data blocks changed from 524285952 to 575485952
[opc@oracle-19c-fs ~]$
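
The vgextend call that provided the extra extents isn’t shown above. For completeness, it would have looked roughly like this; the partition names /dev/sdf1 and /dev/sdg1 are assumptions based on the two new devices showing up in the iostat output further down:

[opc@oracle-19c-fs ~]$ sudo pvcreate /dev/sdf1 /dev/sdg1
[opc@oracle-19c-fs ~]$ sudo vgextend oradata_vg /dev/sdf1 /dev/sdg1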

It really has to be the same number of additional PVs; otherwise you get the following error:

[opc@oracle-19c-fs ~]$ sudo vgdisplay oradata_vg | grep Free               
  Free  PE / Size       255999 / <1000.00 GiB

[opc@oracle-19c-fs ~]$ sudo lvresize --extents +255998 --resizefs /dev/mapper/oradata_vg-oradata_lv
  Using stripesize of last segment 1.00 MiB
  Insufficient suitable allocatable extents for logical volume oradata_lv: 255998 more required

Even though I was able to add additional space (see above), it doesn’t appear to make a difference in performance:

[opc@oracle-19c-fs ~]$ iostat -xmz 5 3
Linux 5.4.17-2102.203.6.el8uek.x86_64 (oracle-19c-fs)   06/08/21        _x86_64_        (16 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.72    0.00    1.19    6.03    0.01   90.05

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  ... %util
sda              0.27    0.85      0.01      0.02     0.00     0.48  ...  0.05
dm-0             0.26    0.76      0.01      0.01     0.00     0.00  ...  0.04
dm-1             0.00    0.57      0.00      0.01     0.00     0.00  ...  0.01
sdb              0.11    0.11      0.00      0.00     0.00     0.02  ...  0.02
sdc           1137.78   17.06     15.29      0.29     0.00     0.02  ... 17.33
dm-2             0.11    0.13      0.00      0.00     0.00     0.00  ...  0.02
sdd              0.24    5.62      0.24      0.34     0.00     0.01  ...  1.84
dm-3             0.24    5.63      0.24      0.34     0.00     0.00  ...  1.84
sde           1157.81  417.01     16.34      3.72     0.00     0.04  ... 27.76
dm-4          1136.49   16.67     15.27      0.28     0.00     0.00  ... 17.18
dm-5          1136.97   16.76     15.31      0.29     0.00     0.00  ... 17.19
dm-6          2273.46   33.42     30.58      0.57     0.00     0.00  ... 17.31
sdf              0.00    0.00      0.00      0.00     0.00     0.00  ...  0.00
sdg              0.00    0.00      0.00      0.00     0.00     0.00  ...  0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.11    0.00    3.18   31.19    0.01   64.51

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  ... %util
sda              0.00    2.40      0.00      0.02     0.00     0.40  ...  0.04
dm-0             0.00    2.60      0.00      0.01     0.00     0.00  ...  0.04
dm-1             0.00    0.20      0.00      0.00     0.00     0.00  ...  0.02
sdc           7545.40   32.40     58.83      0.28     0.00     0.00  ... 99.92
sdd              0.00   14.40      0.00      0.06     0.00     0.00  ...  4.16
dm-3             0.00   14.40      0.00      0.06     0.00     0.00  ...  4.16
sde           7519.80   52.60     58.65      0.47     0.00     0.00  ... 99.76
dm-4          7545.20   32.40     58.83      0.28     0.00     0.00  ... 99.90
dm-5          7519.80   52.60     58.65      0.47     0.00     0.00  ... 99.76
dm-6          15065.00   85.00    117.48      0.75     0.00     0.00 ... 100.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.62    0.00    3.06   30.02    0.01   65.29

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  ... %util
sdb              0.00    0.60      0.00      0.00     0.00     0.00  ...  0.02
sdc           7192.20  124.00     56.07      0.82     0.00     0.00  ... 99.78
dm-2             0.00    0.60      0.00      0.00     0.00     0.00  ...  0.02
sdd              0.00   46.60      0.00      0.17     0.00     0.00  ... 13.50
dm-3             0.00   46.60      0.00      0.17     0.00     0.00  ... 13.46
sde           7184.40   79.60     56.03      0.70     0.00     0.00  ... 99.78
dm-4          7193.60  124.00     56.08      0.82     0.00     0.00  ... 99.78
dm-5          7183.60   79.60     56.03      0.70     0.00     0.00  ... 99.78
dm-6          14377.20  203.60    112.11      1.51     0.00     0.00 ... 100.00

[opc@oracle-19c-fs ~]$ 

As you can see only those disks that were originally part of the volume group are in use. Unlike with Oracle’s Automatic Storage Management there is no automatic rebalancing of data.

Summary

LVM RAID 0 is an exciting feature offering striping in LVM in a different way than was previously possible. Compared to the linear allocation model demonstrated in the previous article it allows proper striping across the disks in the Logical Volume. It should be noted though that RAID 0 – striping – does not offer any data protection. Failure of a single device in the RAID means all data is lost, immediately. Alternatives need to be in place to ensure your database’s RTO and RPO can be met.

Extending the capacity of an LVM RAID 0 VG is possible provided you add the same number of devices (with the same size and performance characteristics) to the VG before executing the lvresize command.

The final article in this series cuts LVM out of the equation and focuses purely on software RAID 0 and how it can be used in Oracle Linux 8.x and earlier.

Oracle Cloud Infrastructure: using the CLI to manipulate Network Security Groups

I frequently need to update security rules in one of my Network Security Groups (NSG). Rather than logging into the console and clicking my way through the user interface to eventually change the rule, I decided to give it a go and automate the process using the Oracle Cloud Infrastructure (OCI) Command Line Interface (CLI). It took me slightly longer than I thought to get it right, so hopefully this post saves you 5 minutes. And me, later, when I’ve forgotten how I did it :)

In my defense I should point out this isn’t one of the terraform-controlled environments I use but rather a cloud playground with a single network, a few subnets, Network Security Groups (NSG) and security lists that have grown organically. If that sounds similar to what you are doing, read on. If not, please use terraform to control the state of your cloud infrastructure; it’s much better suited to the task, especially when working with others. The rule is: “once terraform, always terraform” when making changes to the infrastructure.

I have used Ubuntu 20.04 LTS as a host for version 3.0.0 of the CLI, the current version at the time of writing. It’s assumed you already set the CLI up and have the correct access policies granted to you to make changes to the NSG. I also defined a default compartment in ~/.oci/oci_cli_rc so I don’t have to add a --compartment-id to every call to the CLI.
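
For reference, the relevant entry in ~/.oci/oci_cli_rc is nothing fancy; it simply provides a default value for the --compartment-id parameter. Something along these lines should do, with the OCID being a placeholder of course:

[DEFAULT]
compartment-id = ocid1.compartment.oc1..aaaa...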

Listing Network Security Groups

The landing page for NSGs in OCI CLI was my starting point. The list and rules list/rules update verbs are exactly what I need.

Before I can list the security rules for a given NSG I need to find its Oracle Cloud ID (OCID) first:

(oracle-cli) [martin@ubuntu: ~]$ oci network nsg list \
> --query 'data[].{id:id,"display-name":"display-name" }' \
> --output table
+-----------------------+-------------------------------------------------...---+
| display-name          | id                                              ...   |
+-----------------------+-------------------------------------------------...---+
| NSG1                  | ocid1.networksecuritygroup.oc1.eu-frankfurt-1.aa...vq |
| NSG2                  | ocid1.networksecuritygroup.oc1.eu-frankfurt-1.aa...5q |
...
| NSG5                  | ocid1.networksecuritygroup.oc1.eu-frankfurt-1.aa...vq |
| NSG6                  | ocid1.networksecuritygroup.oc1.eu-frankfurt-1.aa...3a |
+-----------------------+-------------------------------------------------...---+
(oracle-cli) [martin@ubuntu: ~]$ 

The table provides me with a list of NSGs and their OCIDs.

Getting an NSG’s Security Rules

Now that I have the NSG’s OCID, I can list its security rules:

(oracle-cli) [martin@ubuntu: ~]$ oci network nsg rules list \
> --nsg-id ocid1.networksecuritygroup.oc1.eu-frankfurt-1.aa...

The result is a potentially looong JSON document, containing a data[] array with the rules and their metadata:

(oracle-cli) [martin@ubuntu: ~]$ oci network nsg rules list --nsg-id ocid1.networksecuritygroup.oc1.eu-frankfurt-1.aa...
{
  "data": [
    {
      "description": "my first rule",
...

Updating a Security Rule

As per the documentation, I need to pass the NSG OCID as well as security rules to oci network nsg rules update. Which makes sense when you think about it … There is only one small caveat: the security rules are considered a complex type (= JSON document). Rather than passing a string on the command line, the suggestion is to create a JSON document with the appropriate parameters, store it on the file system and pass it via the file://payload.json directive.

But what exactly do I have to provide as part of the update request? The first thing I did was to look at the JSON document produced by oci network nsg rules list to identify the rule and payload I need to update. The documentation wasn’t 100% clear on whether I can update just a single security rule, so I thought I’d just try it. The API documentation has details about the various properties as well as links to the TcpOptions and UdpOptions. Not all of these are always required; have a look at the documentation for details. Using all the available sources I ended up with the following in /tmp/payload.json:

[
    {
        "description": "my first SSH rule",
        "direction": "INGRESS",
        "id": "04ABEC",
        "protocol": "6",
        "source": "192.168.10.0/24",
        "source-type": "CIDR_BLOCK",
        "tcp-options": {
            "destination-port-range": {
                "max": 22,
                "min": 22
            }
        }
    }
]

The actual contents of the file vary from use case to use case; however, there are a couple of things worth pointing out:

  • Even though I intend to update a single rule, I need to provide a JSON array (containing a single object, the rule)
  • The security rule must be valid JSON
  • You absolutely NEED an id; otherwise OCI can’t match and update the existing rule (see the query sketch right after this list for one way to find it)
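
Speaking of the id: a --query filter saves you from scrolling through the full JSON output when looking it up. Something along these lines should work, with the description string obviously being specific to my rule:

(oracle-cli) [martin@ubuntu: ~]$ oci network nsg rules list \
> --nsg-id ocid1.networksecuritygroup.oc1.eu-frankfurt-1.aa... \
> --query "data[?description=='my first rule'].{id: id, description: description}" \
> --output table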

With these things in mind you can update the rule:

(oracle-cli) [martin@ubuntu: ~]$ oci network nsg rules update \
> --nsg-id ocid1.networksecuritygroup.oc1.eu-frankfurt-1.aaa... \
> --security-rules file:///tmp/payload.json 
{
  "data": {
    "security-rules": [
      {
        "description": "my first rule",
        "destination": null,
        "destination-type": null,
        "direction": "INGRESS",
        "icmp-options": null,
        "id": "04ABEC",
        "is-stateless": false,
        "is-valid": true,
        "protocol": "6",
        "source": "192.168.10.0/24",
        "source-type": "CIDR_BLOCK",
        "tcp-options": {
          "destination-port-range": {
            "max": 22,
            "min": 22
          },
          "source-port-range": null
        },
        "time-created": "2020-11-23T14:24:55.363000+00:00",
        "udp-options": null
      }
    ]
  }
}

In case of success you are presented with a JSON document listing the updated rule(s).