Category Archives: Linux

Installing Virtualbox Guest Additions for Oracle Linux 8.2

Since I can never remember how to install Virtualbox Guest Additions I thought I’d write it down. Maybe it’ll save you a few minutes; I know it will save me a lot of time ;)

For this post I used the latest versions at the time of writing:

  • Virtualbox 6.1.10 for Linux (my host is running Ubuntu 20.04 LTS)
  • Oracle Linux 8.2 (V996906-01.iso)

The VM was installed using the “minimal-environment” group and booted into UEK 6. I believe this change came with Oracle Linux (OL) 8.2; I seem to remember OL 8.1 used the Red Hat Compatible Kernel (RHCK) by default. There’s nothing wrong with that of course, it’s just an observation. I am going to stick with UEK 6 in my lab; the instructions differ from those for the RHCK.

As usual I run “dnf update -y” (in a tmux session, of course) to bring the system up to the latest and greatest software release. After this completes, it’s time for a reboot and the installation of the Guest Additions. I hope you’ll forgive me for not sharing the screen output from the system upgrade; I thought it would have been boring …
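
For reference, the upgrade and the subsequent reboot boil down to these two commands (output omitted):

[root@ol8base ~]# dnf update -y
[root@ol8base ~]# systemctl reboot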

Once the system comes back online after the reboot, you might notice a new kernel: 5.4.17-2011.3.2.1.el8uek.x86_64 is now the default, where previously it was 5.4.17-2011.1.2.el8.
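
A quick check confirms it:

[root@ol8base ~]# uname -r
5.4.17-2011.3.2.1.el8uek.x86_64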

Before you can install the Guest Additions, a few additional packages are required.

[root@ol8base ~]# dnf install perl make bzip2 gzip unzip kernel-uek-devel-$(uname -r) tar
Last metadata expiration check: 0:22:15 ago on Fri 29 May 2020 14:11:17 BST.
Package gzip-1.9-9.el8.x86_64 is already installed.
Dependencies resolved.
================================================================================
 Package                      Arch   Version            Repository         Size
================================================================================
Installing:
 bzip2                        x86_64 1.0.6-26.el8       ol8_baseos_latest  60 k
 kernel-uek-devel             x86_64 5.4.17-2011.3.2.1.el8uek
                                                        ol8_UEKR6          17 M
 make                         x86_64 1:4.2.1-10.el8     ol8_baseos_latest 498 k
 perl                         x86_64 4:5.26.3-416.el8   ol8_appstream      72 k
 tar                          x86_64 2:1.30-4.el8       ol8_baseos_latest 838 k
 unzip                        x86_64 6.0-43.el8         ol8_baseos_latest 196 k
Installing dependencies:
 annobin                      x86_64 8.90-1.el8         ol8_appstream     202 k
 binutils                     x86_64 2.30-73.0.1.el8    ol8_baseos_latest 5.8 M
 binutils-devel               x86_64 2.30-73.0.1.el8    ol8_appstream     1.1 M
 cpp                          x86_64 8.3.1-5.0.3.el8    ol8_appstream      10 M
 dtrace                       x86_64 2.0.0-1.0.el8      ol8_UEKR6         1.3 M

[...]

  qt5-srpm-macros-5.12.5-3.el8.noarch                                           
  redhat-rpm-config-122-1.0.1.el8.noarch                                        
  rust-srpm-macros-5-2.el8.noarch                                               
  tar-2:1.30-4.el8.x86_64                                                       
  unzip-6.0-43.el8.x86_64                                                       
  zip-3.0-23.el8.x86_64                                                         
  zlib-devel-1.2.11-13.el8.x86_64                                               

Complete! 

The next step is to mount the Guest Additions (virtual) CD and run the installer:

[root@ol8base ~]# mount /dev/sr0 /mnt
mount: /mnt: WARNING: device write-protected, mounted read-only.
[root@ol8base ~]# /mnt/VBoxLinuxAdditions.run 
Verifying archive integrity... All good.
Uncompressing VirtualBox 6.1.10 Guest Additions for Linux................
VirtualBox Guest Additions installer
Copying additional installer modules ...
Installing additional modules ...
VirtualBox Guest Additions: Starting.
VirtualBox Guest Additions: Building the VirtualBox Guest Additions kernel 
modules.  This may take a while.
VirtualBox Guest Additions: To build modules for other installed kernels, run
VirtualBox Guest Additions:   /sbin/rcvboxadd quicksetup 
VirtualBox Guest Additions: or
VirtualBox Guest Additions:   /sbin/rcvboxadd quicksetup all
VirtualBox Guest Additions: Building the modules for kernel 
5.4.17-2011.3.2.1.el8uek.x86_64. 

This seems to have gone well; let’s check if the necessary kernel modules are present:

[root@ol8base ~]# lsmod  | egrep -i 'Module|vbox'
Module                  Size  Used by
vboxsf                 81920  0
vboxguest             339968  2 vboxsf
vboxvideo              36864  0
drm_kms_helper        184320  2 vmwgfx,vboxvideo
ttm                   106496  2 vmwgfx,vboxvideo
drm                   516096  5 vmwgfx,drm_kms_helper,vboxvideo,ttm
[root@ol8base ~]#  
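
If you’d like to double-check that the module version matches the installer, modinfo can help. The version field should report 6.1.10, possibly with a build suffix:

[root@ol8base ~]# modinfo -F version vboxguest
6.1.10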

That’s it! The Guest Additions are ready and working with Oracle Linux 8.2/UEK 6.

Copying a SQL Plan Baseline from one database to another

Hopefully this post saves you a few minutes looking the procedure up. I know it’ll save me some time ;) In this rather lengthy article I’d like to cover how I copied a SQL Plan Baseline from one database to another. If you find this procedure useful, please ensure your system is appropriately licensed for it and test it first!

My Setup

My source database is named ORA19NCDB, patched to 19.7.0 and running on Oracle Linux 7.x/UEK 5. As I do so often, I’m using Dominic Giles’s Swingbench as the source for this experiment. This is the query in question:

SELECT oi.order_id,
    SUM(oi.unit_price * oi.quantity)
FROM
    soe.orders         o,
    soe.order_items    oi
WHERE
        o.order_id = oi.order_id
    and o.order_date = DATE '2012-04-01'
    and o.delivery_type = 'Collection'
GROUP BY
    oi.order_id
ORDER BY
    2 DESC; 

I would like to make sure the execution plan for this statement is available to my destination database. Like the source, it’s a 19.7.0 database running on Linux.

Create the SQL Plan Baseline

In the first step I need to create the SQL Plan Baseline, for which I need the SQL ID and plan hash value. One way to get these is to run the query and check the cursor cache:

SQL> @/tmp/query

  ORDER_ID SUM(OI.UNIT_PRICE*OI.QUANTITY)
---------- ------------------------------
   1138103                          30648

...

    244488                           3696

32 rows selected.

SQL> select * from table(dbms_xplan.display_cursor(format=>'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------
SQL_ID  cbynz8srbydr7, child number 0
-------------------------------------
SELECT oi.order_id,     SUM(oi.unit_price * oi.quantity) FROM
soe.orders         o,     soe.order_items    oi WHERE
o.order_id = oi.order_id     and o.order_date = DATE '2012-04-01'
and o.delivery_type = 'Collection' GROUP BY     oi.order_id ORDER BY
 2 DESC

Plan hash value: 1126116122

---------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name              | E-Rows |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |                   |        |       |       |          |
|   1 |  SORT ORDER BY                          |                   |      1 |  2048 |  2048 | 2048  (0)|
|   2 |   HASH GROUP BY                         |                   |      1 |  1160K|  1160K| 1429K (0)|
|   3 |    NESTED LOOPS                         |                   |      1 |       |       |          |
|   4 |     NESTED LOOPS                        |                   |      3 |       |       |          |
|*  5 |      TABLE ACCESS BY INDEX ROWID BATCHED| ORDERS            |      1 |       |       |          |
|*  6 |       INDEX RANGE SCAN                  | ORD_ORDER_DATE_IX |      1 |       |       |          |
|*  7 |      INDEX RANGE SCAN                   | ITEM_ORDER_IX     |      3 |       |       |          |
|   8 |     TABLE ACCESS BY INDEX ROWID         | ORDER_ITEMS       |      3 |       |       |          |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - filter("O"."DELIVERY_TYPE"='Collection')
   6 - access("O"."ORDER_DATE"=TO_DATE(' 2012-04-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
   7 - access("O"."ORDER_ID"="OI"."ORDER_ID")

Note
-----
   - Warning: basic plan statistics not available. These are only collected when:
       * hint 'gather_plan_statistics' is used for the statement or
       * parameter 'statistics_level' is set to 'ALL', at session or system level


37 rows selected. 
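
Alternatively, you can look both values up in the shared pool directly. A quick sketch, assuming the statement text is selective enough to identify the query:

SQL> select sql_id, plan_hash_value
  2  from v$sql
  3  where sql_text like 'SELECT oi.order_id,%';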

With the SQL ID and PHV identified I can create a SQL Plan Baseline for this query. I have a little script to do that for me:

SQL> get create_baseline
  1  pro Create a SQL Plan Baseline based on SQL ID and Plan Hash Value
  2  var count number
  3  exec :count := dbms_spm.load_plans_from_cursor_cache(sql_id=>'&sqlid',plan_hash_value=>&phv)
  4* select :count || ' SQL Plan Baselines captured' as result from dual;

SQL> start create_baseline
Create a SQL Plan Baseline based on SQL ID and Plan Hash Value
Enter value for sqlid: cbynz8srbydr7
Enter value for phv: 1126116122

PL/SQL procedure successfully completed.


RESULT
--------------------------------------------------------------------
1 SQL Plan Baselines captured

SQL>  

The baseline should have been captured, as per the “result” column. It’s simple enough to verify:

SQL> select sql_handle, plan_name, origin, enabled, accepted, fixed 
  2  from dba_sql_plan_baselines where created > systimestamp - interval '5' minute;

SQL_HANDLE                     PLAN_NAME                      ORIGIN                         ENA ACC FIX
------------------------------ ------------------------------ ------------------------------ --- --- ---
SQL_6f18f4fc0f67407e           SQL_PLAN_6y67nzh7qfh3ya116ef60 MANUAL-LOAD-FROM-CURSOR-CACHE  YES YES NO

A quick check should be enough to show the baseline is used:

SQL> @/tmp/query

  ORDER_ID SUM(OI.UNIT_PRICE*OI.QUANTITY)
---------- ------------------------------
   1138103                          30648

...

    244488                           3696

32 rows selected.

SQL> select * from table(dbms_xplan.display_cursor(format=>'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------
SQL_ID  cbynz8srbydr7, child number 0
-------------------------------------
SELECT oi.order_id,     SUM(oi.unit_price * oi.quantity) FROM
soe.orders         o,     soe.order_items    oi WHERE
o.order_id = oi.order_id     and o.order_date = DATE '2012-04-01'
and o.delivery_type = 'Collection' GROUP BY     oi.order_id ORDER BY
 2 DESC

Plan hash value: 1126116122

---------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name              | E-Rows |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |                   |        |       |       |          |
|   1 |  SORT ORDER BY                          |                   |      1 |  2048 |  2048 | 2048  (0)|
|   2 |   HASH GROUP BY                         |                   |      1 |  1160K|  1160K| 1425K (0)|
|   3 |    NESTED LOOPS                         |                   |      1 |       |       |          |
|   4 |     NESTED LOOPS                        |                   |      3 |       |       |          |
|*  5 |      TABLE ACCESS BY INDEX ROWID BATCHED| ORDERS            |      1 |       |       |          |
|*  6 |       INDEX RANGE SCAN                  | ORD_ORDER_DATE_IX |      1 |       |       |          |
|*  7 |      INDEX RANGE SCAN                   | ITEM_ORDER_IX     |      3 |       |       |          |
|   8 |     TABLE ACCESS BY INDEX ROWID         | ORDER_ITEMS       |      3 |       |       |          |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - filter("O"."DELIVERY_TYPE"='Collection')
   6 - access("O"."ORDER_DATE"=TO_DATE(' 2012-04-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
   7 - access("O"."ORDER_ID"="OI"."ORDER_ID")

Note
-----
   - SQL plan baseline SQL_PLAN_6y67nzh7qfh3ya116ef60 used for this statement
   - Warning: basic plan statistics not available. These are only collected when:
       * hint 'gather_plan_statistics' is used for the statement or
       * parameter 'statistics_level' is set to 'ALL', at session or system level


38 rows selected. 

This looks good! I can see the SQL Plan Baseline used as per the “Note” section.

Transferring the SQL Plan Baseline

Now the fun begins. I need to transfer the SQL Plan Baseline from my source system to the target. The procedure is well documented, and I really like the 11.2 documentation. There is of course more out there, but this is the most concise piece of information in my opinion.

To copy the SQL Plan Baseline I first need to put it into a staging table, export it from the source, and import it into the destination database. I have chosen to store the baseline in the SOE schema, but that’s of course up to you. This little script can help save some time:

set serveroutput on
prompt prepare a SQL Plan Baseline for transport
prompt
accept v_table_name  prompt 'enter the name of the staging table to be created: ' 
accept v_table_owner prompt 'enter the schema name where the staging table is to be created: ' 
accept v_sql_handle  prompt 'which SQL handle should be exported? ' 
accept v_plan_name prompt 'enter the corresponding plan name: ' 

declare
  v_packed_baselines number;
begin
  dbms_spm.create_stgtab_baseline( 
    table_name => '&v_table_name', 
    table_owner => '&v_table_owner');

  v_packed_baselines := dbms_spm.pack_stgtab_baseline(
    table_name => '&v_table_name',
    table_owner => '&v_table_owner',
    sql_handle => '&v_sql_handle',
    plan_name => '&v_plan_name');
  dbms_output.put_line(v_packed_baselines || ' baselines have been staged in &v_table_owner..&v_table_name');
end;
/ 

set serveroutput off

This procedure should confirm that a single SQL Plan Baseline has been staged:

SQL> start prep_transport
prepare a SQL Plan Baseline for transport

enter the name of the staging table to be created: blogpost
enter the schema name where the staging table is to be created: soe
which SQL handle should be exported? SQL_6f18f4fc0f67407e
enter the corresponding plan name: SQL_PLAN_6y67nzh7qfh3ya116ef60
1 baselines have been staged in soe.blogpost

PL/SQL procedure successfully completed. 

Next, export the table and import it into the destination database. Here is the output from my system:

$ expdp directory=the_export_dir tables=soe.blogpost dumpfile=soe_blogpost.dmp logfile=soe_blogpost.log

Export: Release 19.0.0.0.0 - Production on Tue Jun 2 17:54:13 2020
Version 19.7.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

Username: martin  
Password: 

Connected to: Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Starting "MARTIN"."SYS_EXPORT_TABLE_01":  martin/******** directory=the_export_dir tables=soe.blogpost dumpfile=soe_blogpost.dmp logfile=soe_blogpost.log 
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Processing object type TABLE_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
Processing object type TABLE_EXPORT/TABLE/STATISTICS/MARKER
Processing object type TABLE_EXPORT/TABLE/TABLE
. . exported "SOE"."BLOGPOST"                            51.74 KB       9 rows
Master table "MARTIN"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for MARTIN.SYS_EXPORT_TABLE_01 is:
  /u01/app/oracle/admin/ORA19NCDB/dpdump/soe_blogpost.dmp
Job "MARTIN"."SYS_EXPORT_TABLE_01" successfully completed at Tue Jun 2 17:54:39 2020 elapsed 0 00:00:22 

The import is equally simple:

$ impdp directory=the_import_dir logfile=soe_blogpost.log dumpfile=soe_blogpost.dmp

Import: Release 19.0.0.0.0 - Production on Tue Jun 2 16:59:39 2020
Version 19.7.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

Username: martin
Password: 

Connected to: Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Master table "MARTIN"."SYS_IMPORT_FULL_01" successfully loaded/unloaded
Starting "MARTIN"."SYS_IMPORT_FULL_01":  martin/******** directory=the_import_dir logfile=soe_blogpost.log dumpfile=soe_blogpost.dmp 
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
. . imported "SOE"."BLOGPOST"                            51.74 KB       9 rows
Processing object type TABLE_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
Processing object type TABLE_EXPORT/TABLE/STATISTICS/MARKER
Job "MARTIN"."SYS_IMPORT_FULL_01" successfully completed at Tue Jun 2 17:00:22 2020 elapsed 0 00:00:26 

Making the SQL Plan Baseline available and using it

After the import completed successfully it’s time to unpack the SQL Plan Baseline. The following code snippet did that for me:

var num_unpacked number

begin
  :num_unpacked := dbms_spm.unpack_stgtab_baseline(
    table_name => '&table_name',
    table_owner => '&table_owner');
end;
/

print :num_unpacked 

Afterwards you can see it in the database:

SQL> select sql_handle, plan_name, origin, enabled, accepted, fixed
  2   from dba_sql_plan_baselines;

SQL_HANDLE                     PLAN_NAME                      ORIGIN                        ENA ACC FIX
------------------------------ ------------------------------ ----------------------------- --- --- ---
SQL_6f18f4fc0f67407e           SQL_PLAN_6y67nzh7qfh3ya116ef60 MANUAL-LOAD-FROM-CURSOR-CACHE YES YES NO

SQL>  

Any execution of SQL ID cbynz8srbydr7 will now use the imported SQL Plan Baseline.

SQL> l
  1  SELECT oi.order_id,
  2     SUM(oi.unit_price * oi.quantity)
  3  FROM
  4     soe.orders     o,
  5     soe.order_items    oi
  6  WHERE
  7         o.order_id = oi.order_id
  8     and o.order_date = DATE '2012-04-01'
  9     and o.delivery_type = 'Collection'
 10  GROUP BY
 11     oi.order_id
 12  ORDER BY
 13*     2 DESC
SQL> /

[...]

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------
SQL_ID  cbynz8srbydr7, child number 0

SELECT oi.order_id,     SUM(oi.unit_price * oi.quantity) FROM
soe.orders         o,     soe.order_items    oi WHERE
o.order_id = oi.order_id     and o.order_date = DATE '2012-04-01'
and o.delivery_type = 'Collection' GROUP BY     oi.order_id ORDER BY
 2 DESC

Plan hash value: 1126116122

-------------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |                   |       |       |     3 (100)|          |
|   1 |  SORT ORDER BY                          |                   |     1 |    74 |     3  (67)| 00:00:01 |
|   2 |   HASH GROUP BY                         |                   |     1 |    74 |     3  (67)| 00:00:01 |
|   3 |    NESTED LOOPS                         |                   |     1 |    74 |     1   (0)| 00:00:01 |
|   4 |     NESTED LOOPS                        |                   |     1 |    74 |     1   (0)| 00:00:01 |
|*  5 |      TABLE ACCESS BY INDEX ROWID BATCHED| ORDERS            |     1 |    35 |     1   (0)| 00:00:01 |
|*  6 |       INDEX RANGE SCAN                  | ORD_ORDER_DATE_IX |     1 |       |     1   (0)| 00:00:01 |
|*  7 |      INDEX RANGE SCAN                   | ITEM_ORDER_IX     |     1 |       |     0   (0)|          |
|   8 |     TABLE ACCESS BY INDEX ROWID         | ORDER_ITEMS       |     1 |    39 |     0   (0)|          |
-------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - filter("O"."DELIVERY_TYPE"='Collection')
   6 - access("O"."ORDER_DATE"=TO_DATE(' 2012-04-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
   7 - access("O"."ORDER_ID"="OI"."ORDER_ID")

Note
-----
   - SQL plan baseline SQL_PLAN_6y67nzh7qfh3ya116ef60 used for this statement


35 rows selected.

This fact is also reflected in v$sql:

SQL> select sql_id, child_number, plan_hash_value, sql_plan_baseline, executions from v$sql where sql_id = 'cbynz8srbydr7';

SQL_ID        CHILD_NUMBER PLAN_HASH_VALUE SQL_PLAN_BASELINE              EXECUTIONS
------------- ------------ --------------- ------------------------------ ----------
cbynz8srbydr7            0      1126116122 SQL_PLAN_6y67nzh7qfh3ya116ef60          2 

Summary

It isn’t too hard to transfer a SQL Plan Baseline from one host to another, as I hope this post demonstrates.

Silent installation: Oracle Restart 19c, ASM Filter Driver, UEK 5 edition

As promised in an earlier post here are my notes about installing Oracle Restart with ASM Filter Driver (ASMFD) 19c on Oracle Linux 7 using UEK 5.

Since the approach you are about to read isn’t explicitly covered in the documentation I suggest you ask Oracle Support whether it is supported before using this outside a playground/lab environment.

I also forgot about this post waiting to be published in my drafts folder; it should have gone out in early April. Some of the components I used to put the post together aren’t the latest and greatest, so please adjust accordingly.

My environment

The environment hasn’t changed massively compared to the RHCK edition of this post, except of course for the kernel used:

  • My lab environment consists of a KVM VM using the virtio driver
  • Oracle Linux 7.7
  • Kernel UEK 5 (patched to 4.14.35-1902.300.11.el7uek.x86_64)
  • All other packages up to date as of April 1st 2020
  • The first partition of each of /dev/vd[c-f] is to be used as an ASM disk for +DATA
  • This is a fresh installation, no upgrade, no ASMLib has ever been in use

The challenge of using ASMFD with Oracle Restart and UEK5

The challenge when installing Oracle Restart 19c together with ASMFD is the lack of support in the base release:

[root@server5 bin]# ./acfsdriverstate supported
ACFS-9459: ADVM/ACFS is not supported on this OS version: '4.14.35-1902.300.11.el7uek.x86_64'
ACFS-9201: Not Supported
ACFS-9294: updating file /etc/sysconfig/oracledrivers.conf
[root@server5 bin]# uname -r
4.14.35-1902.300.11.el7uek.x86_64
[root@server5 bin]#  

This is easy to run into, since gridSetup.sh as shipped with 19.3 doesn’t validate this for you when running in silent mode. The GUI version of the installer protects you from the mistake though. Upgrading to the latest UEK 5 doesn’t change this message, as you just saw, but it was a necessity in my case anyway, as you’ll see later. As per the My Oracle Support (MOS) certification matrix, DocID 1369107.1, Oracle 19.4.0 is the first release to support the ASM Filter Driver (ASMFD). The base release, 19.3.0, does not support ASMFD (or ACFS for that matter) out of the box.

I’d like to reiterate that this post isn’t an endorsement of ASM Filter Driver, but since the documentation was a little unclear I thought I’d write up how I got to a working installation. It is up to you to ensure that ASMFD is a workable solution for your environment by following industry best practices.

So how do I get to a working Oracle Restart 19c/ASMFD installation when I’m using UEK 5?

As I said in the first part of the series, there are two options available for installing Oracle Restart 19c using ASMFD, at least in theory: the first is to use UDEV to prepare the ASM block devices, the second is to label the ASM disks using asmcmd afd_label.

UDEV

Huh, UDEV? That hasn’t really been blogged about at all in the context of ASMFD, or at least I didn’t find anyone who had. I’m inferring the possibility of using UDEV from “Configuring Oracle ASM Filter Driver During Installation” (link to documentation):

If you do not use udev on the system where the Oracle Grid Infrastructure is installed, then you can also complete the following procedure to provision disks for Oracle ASMFD before the installer is launched

The docs then continue with an explanation of using disk labeling. I was very interested in the first part of the statement quoted above (… not using UDEV …) as it implies using UDEV is a viable option.

So I went ahead and tried to use ASMFD with UDEV rules. I have previously used UDEV rules without ASMFD when installing Oracle products so that wasn’t too hard to do.
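
For reference, a minimal rules file might look like this (the file name and the device match are assumptions based on my KVM setup):

[root@server5 ~]# cat /etc/udev/rules.d/99-oracle-asm.rules
# first partitions of vdc-vdf belong to the grid owner
KERNEL=="vd[c-f]1", OWNER="grid", GROUP="asmadmin", MODE="0660"
[root@server5 ~]# udevadm control --reload-rules && udevadm trigger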

After changing the ownership of the relevant block devices to grid:asmadmin and the mode to 0660 via UDEV, here’s the result of the changes. Permissions on my block devices to be used for ASM are now correct:

[root@server5 ~]# ls -l /dev/vd[cd]*
brw-rw----. 1 root disk     251, 32 Apr  1 15:49 /dev/vdc
brw-rw----. 1 grid asmadmin 251, 33 Apr  1 15:49 /dev/vdc1
brw-rw----. 1 root disk     251, 48 Apr  1 15:49 /dev/vdd
brw-rw----. 1 grid asmadmin 251, 49 Apr  1 15:49 /dev/vdd1
[root@server5 ~]#  

Preparing for patching and installing

Due to not being able to install Oracle Restart 19c/ASMFD with the base release, I am using a slightly different approach this time. It’s rather subtle, but effective: there is only one difference in the call to gridSetup.sh compared to the last post. The goal is to merge the latest RU into the unzipped binaries prior to invoking the installer, creating a supported configuration.

Since the release of Oracle Grid Infrastructure 12.2 it has been possible to merge a Release Update (RU) into the unzipped installation image. This process has changed quite a bit over time, as reported by Ludovico Caldara for example. You can’t simply execute gridSetup.sh -applyRU to merge the RU into the installation image; you also need to pass the parameters for an installation, otherwise a silent installation ends in an error. In GUI mode, the graphical user interface starts after the patch has been merged into the binaries.

Combining the -applyRU flag and the installation options used in the previous post, I can both patch and install the software. My system has the Oracle 19.6.0 RU available and I’m going to apply it. In preparation I have to update OPatch as the grid owner. Once that’s done, I need to unzip the RU to a staging location, still logged in as the grid owner:

[grid@server5 ~]$ unzip -q /mnt/19.6.0/p6880880_190000_Linux-x86-64.zip -d /u01/app/grid/product/19.0.0/grid
replace /u01/app/grid/product/19.0.0/grid/OPatch/emdpatch.pl? [y]es, [n]o, [A]ll, [N]one, [r]ename: A
[grid@server5 ~]$
[grid@server5 ~]$ unzip -q /mnt/19.6.0/p30501910_190000_Linux-x86-64.zip -d /u01/stage
[grid@server5 ~]$ 

Patch and install

With all preparations in place, it’s time to call gridSetup.sh with the -applyRU flag necessary for UEK5:

[grid@server5 ~]$ cd /u01/app/grid/product/19.0.0/grid/
[grid@server5 grid]$ ./gridSetup.sh -silent \
> INVENTORY_LOCATION=/u01/app/oraInventory \
> SELECTED_LANGUAGES=en \
> ORACLE_BASE=/u01/app/grid \
> ORACLE_HOME_NAME=RESTART_ASMFD_UEK \
> -waitforcompletion -ignorePrereqFailure -lenientInstallMode \
> -applyRU /u01/stage/30501910 \
> oracle.install.option=HA_CONFIG \
> oracle.install.asm.OSDBA=asmdba \
> oracle.install.asm.OSASM=asmadmin \
> oracle.install.asm.diskGroup.name=DATA \
> oracle.install.asm.diskGroup.disks=/dev/vdc1,/dev/vdd1 \
> oracle.install.asm.diskGroup.diskDiscoveryString=/dev/vd* \
> oracle.install.asm.diskGroup.redundancy=EXTERNAL \
> oracle.install.asm.diskGroup.AUSize=4 \
> oracle.install.asm.configureAFD=true \
> oracle.install.crs.rootconfig.executeRootScript=false \
> oracle.install.asm.SYSASMPassword=thinkOfASuperSecretPassword \
> oracle.install.asm.monitorPassword=thinkOfASuperSecretPassword
Preparing the home to patch...
Applying the patch /u01/stage/30501910...
Successfully applied the patch.
The log can be found at: /tmp/GridSetupActions2020-04-01_04-03-20PM/installerPatchActions_2020-04-01_04-03-20PM.log
Launching Oracle Grid Infrastructure Setup Wizard...

The response file for this session can be found at:
 /u01/app/grid/product/19.0.0/grid/install/response/grid_2020-04-01_04-03-20PM.rsp

You can find the log of this install session at:
 /tmp/GridSetupActions2020-04-01_04-03-20PM/gridSetupActions2020-04-01_04-03-20PM.log

As a root user, execute the following script(s):
        1. /u01/app/oraInventory/orainstRoot.sh
        2. /u01/app/grid/product/19.0.0/grid/root.sh

Execute /u01/app/grid/product/19.0.0/grid/root.sh on the following nodes:
[server5]



Successfully Setup Software.
As install user, execute the following command to complete the configuration.
        /u01/app/grid/product/19.0.0/grid/gridSetup.sh -executeConfigTools -responseFile /u01/app/grid/product/19.0.0/grid/install/response/grid_2020-04-01_04-03-20PM.rsp [-silent]
Note: The required passwords need to be included in the response file.


Moved the install session logs to:
 /u01/app/oraInventory/logs/GridSetupActions2020-04-01_04-03-20PM
[grid@server5 grid]$ 

“Successfully Setup Software”, the message I was waiting for ;) From then on it’s the same as described in the previous blog post. Head over there to read more about the execution of orainstRoot.sh, root.sh and the configuration tools step to finish the installation.

Verification

As with the previous post I’d like to share some of the results of the installation. First, what does afddriverstate report?

[grid@server5 ~]$ afddriverstate installed
AFD-9203: AFD device driver installed status: 'true'
[grid@server5 ~]$ afddriverstate loaded
AFD-9205: AFD device driver loaded status: 'true'
[grid@server5 ~]$ afddriverstate version
AFD-9325:     Driver OS kernel version = 4.14.35-1902.0.9.el7uek.x86_64.
AFD-9326:     Driver build number = 191219.
AFD-9212:     Driver build version = 19.0.0.0.0.
AFD-9547:     Driver available build number = 191219.
AFD-9548:     Driver available build version = 19.0.0.0.0.
[grid@server5 ~]$  

This looks all right. What about the patch level?

[grid@server5 ~]$ $ORACLE_HOME/OPatch/opatch lspatches
30655595;TOMCAT RELEASE UPDATE 19.0.0.0.0 (30655595)
30557433;Database Release Update : 19.6.0.0.200114 (30557433)
30489632;ACFS RELEASE UPDATE 19.6.0.0.0 (30489632)
30489227;OCW RELEASE UPDATE 19.6.0.0.0 (30489227)

OPatch succeeded. 

And finally, what about ASM? Is it using ASM Filter Driver for its disks?

SQL> col name for a20
SQL> col path for a10
SQL> col library for a50
SQL> set lines 120
SQL> select name, path, library from v$asm_disk where group_number <> 0;

NAME                 PATH       LIBRARY
-------------------- ---------- --------------------------------------------------
DATA1                AFD:DATA1  AFD Library - Generic , version 3 (KABI_V3)
DATA2                AFD:DATA2  AFD Library - Generic , version 3 (KABI_V3)

SQL> 
SQL> show parameter asm_diskstring

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
asm_diskstring                       string      /dev/vd*, AFD:* 

So all in all the status isn’t different from the previous post, except that I have already patched my environment, saving me time.

Failed attempts

I am keeping these as a reminder to myself. Maybe they are of use to you as well when troubleshooting.

When trying to install Oracle Restart after labeling ASM disks as demonstrated in the previous post, the installation of the 19.3.0 (base release) fails:

[grid@server5 grid]$ ./gridSetup.sh -silent \
> INVENTORY_LOCATION=/u01/app/oraInventory \
> SELECTED_LANGUAGES=en \
> ORACLE_BASE=/u01/app/grid \
> ORACLE_HOME_NAME=RESTART_ASMFD_UEK \
> -waitforcompletion -ignorePrereqFailure -lenientInstallMode \
> -applyRU /u01/stage/30501910 \
> oracle.install.option=HA_CONFIG \
> oracle.install.asm.OSDBA=asmdba \
> oracle.install.asm.OSASM=asmadmin \
> oracle.install.asm.diskGroup.name=DATA \
> oracle.install.asm.diskGroup.disks=/dev/vdc1,/dev/vdd1 \
> oracle.install.asm.diskGroup.diskDiscoveryString=/dev/vd* \
> oracle.install.asm.diskGroup.redundancy=EXTERNAL \
> oracle.install.asm.diskGroup.AUSize=4 \
> oracle.install.asm.configureAFD=true \
> oracle.install.crs.rootconfig.executeRootScript=false \
> oracle.install.asm.SYSASMPassword=thinkOfASuperSecretPassword \
> oracle.install.asm.monitorPassword=thinkOfASuperSecretPassword
Preparing the home to patch...
Applying the patch /u01/stage/30501910...
Successfully applied the patch.
The log can be found at: /tmp/GridSetupActions2020-04-01_03-25-52PM/installerPatchActions_2020-04-01_03-25-52PM.log
Launching Oracle Grid Infrastructure Setup Wizard...

[FATAL] [INS-30508] Invalid ASM disks.
   CAUSE: The disks [/dev/vdd1, /dev/vdc1] were not valid.
   ACTION: Please choose or enter valid ASM disks.
[FATAL] [INS-30515] Insufficient space available in the selected disks.
   CAUSE: Insufficient space available in the selected Disks. At least, 32 MB of free space is required.
   ACTION: Choose additional disks such that the total size should be at least 32 MB.
Moved the install session logs to:
 /u01/app/oraInventory/logs/GridSetupActions2020-04-01_03-25-52PM
[grid@server5 grid]$  

Similarly, if you don’t run a very recent UEK 5 kernel patch level (4.14.35-1902.300.11.el7uek.x86_64 or later), the call to gridSetup.sh also fails, even when specifying -applyRU:

[grid@server5 grid]$ ./gridSetup.sh -silent \
> INVENTORY_LOCATION=/u01/app/oraInventory \
> SELECTED_LANGUAGES=en \
> ORACLE_BASE=/u01/app/grid \
> ORACLE_HOME_NAME=RESTART_ASMFD_UEK \
> -waitforcompletion -ignorePrereqFailure -lenientInstallMode \
> -applyRU /u01/stage/30501910 \
> oracle.install.option=HA_CONFIG \
> oracle.install.asm.OSDBA=asmdba \
> oracle.install.asm.OSASM=asmadmin \
> oracle.install.asm.diskGroup.name=DATA \
> oracle.install.asm.diskGroup.disks=/dev/vdc1,/dev/vdd1 \
> oracle.install.asm.diskGroup.diskDiscoveryString=/dev/vd* \
> oracle.install.asm.diskGroup.redundancy=EXTERNAL \
> oracle.install.asm.diskGroup.AUSize=4 \
> oracle.install.asm.configureAFD=true \
> oracle.install.crs.rootconfig.executeRootScript=false \
> oracle.install.asm.SYSASMPassword=thinkOfASuperSecretPassword \
> oracle.install.asm.monitorPassword=thinkOfASuperSecretPassword
Preparing the home to patch...
Applying the patch /u01/stage/30501910...
Successfully applied the patch.
The log can be found at: /tmp/GridSetupActions2020-04-01_03-55-32PM/installerPatchActions_2020-04-01_03-55-32PM.log
Launching Oracle Grid Infrastructure Setup Wizard...

[FATAL] [INS-41223] ASM Filter Driver is not supported on this platform.
   ACTION: To proceed, do not specify or select the Oracle ASM Filter Driver option.
*ADDITIONAL INFORMATION:*
 - AFD-620: AFD is not supported on this operating system version: 'EL7'
 - AFD-9999: Cannot change to Not valid path:
 - 

Moved the install session logs to:
 /u01/app/oraInventory/logs/GridSetupActions2020-04-01_03-55-32PM 

The kernel I used in this case was the stock Oracle Linux 7.7 kernel:

[root@server5 ~]# uname -r
4.14.35-1818.3.3.el7uek.x86_64 

The nice touch is that it’s telling me the use of ASMFD on my old kernel isn’t supported.

Happy installing!

Versioning for your local Vagrant boxes: handling updates

In my last post I summarised how to enable versioning for Vagrant boxes outside Vagrant’s cloud. In this part I’d like to share how to update a box.

My environment

The environment hasn’t changed compared to the first post. In summary I’m using

  • Ubuntu 20.04 LTS
  • Virtualbox 6.1.6
  • Vagrant 2.2.7

Updating a box

Let’s assume it’s time to update the base box for whatever reason. I most commonly update my boxes every so often after having run a “yum upgrade -y” to bring them up to the most current software. A new drop of the Guest Additions also triggers a rebuild, and so on.

Packaging

Once the changes are made, you need to package the box again. Continuing the previous example I save all my boxes and their JSON metadata in ~/vagrant/boxes. The box comes first:

[martin@host ~]$ vagrant package --base oraclelinux7base --output ~/vagrant/boxes/ol7_7.8.1.box

This creates a second box right next to the existing one. Note that I bumped the version number to 7.8.1 to avoid file naming problems:

[martin@host boxes]$ ls -1
ol7_7.8.0.box
ol7_7.8.1.box
ol7.json 

Updating metadata

The next step is to update the JSON document. At this point in time, it references version 7.8.0 of my box:

[martin@host boxes]$ cat ol7.json 
{
    "name": "ol7",
    "description": "Martins Oracle Linux 7",
    "versions": [
      {
        "version": "7.8.0",
        "providers": [
          {
            "name": "virtualbox",
            "url": "file:///home/martin/vagrant/boxes/ol7_7.8.0.box",
            "checksum": "db048c3d61c0b5a8ddf6b59ab189248a42bf9a5b51ded12b2153e0f9729dfaa4",
            "checksum_type": "sha256"
          }
        ]
      }
    ]
  } 

You probably suspected what’s next :) A new version is created by adding a new element to the versions array, like so:

{
  "name": "ol7",
  "description": "Martins Oracle Linux 7",
  "versions": [
    {
      "version": "7.8.0",
      "providers": [
        {
          "name": "virtualbox",
          "url": "file:///home/martin/vagrant/boxes/ol7_7.8.0.box",
          "checksum": "db048c3d61c0b5a8ddf6b59ab189248a42bf9a5b51ded12b2153e0f9729dfaa4",
          "checksum_type": "sha256"
        }
      ]
    },
    {
      "version": "7.8.1",
      "providers": [
        {
          "name": "virtualbox",
          "url": "file:///home/martin/vagrant/boxes/ol7_7.8.1.box",
          "checksum": "f9d74dbbe88eab2f6a76e96b2268086439d49cb776b407c91e4bd3b3dc4f3f49",
          "checksum_type": "sha256"
        }
      ]
    }
  ]
} 

Don’t forget to update the SHA256 checksum!
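
As in the previous post, sha256sum provides the value for the new box:

[martin@host boxes]$ sha256sum ol7_7.8.1.box
f9d74dbbe88eab2f6a76e96b2268086439d49cb776b407c91e4bd3b3dc4f3f49  ol7_7.8.1.box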

Check for box updates

Back in my VM directory I can now check if there is a new version of my box:

[martin@host versioning]$ vagrant box outdated
Checking if box 'ol7' version '7.8.0' is up to date...
A newer version of the box 'ol7' for provider 'virtualbox' is
available! You currently have version '7.8.0'. The latest is version
'7.8.1'. Run `vagrant box update` to update.
[martin@host versioning]$ 

And there is! Not entirely surprising though, so let’s update the box:

[martin@host versioning]$ vagrant box update
==> default: Checking for updates to 'ol7'
    default: Latest installed version: 7.8.0
    default: Version constraints: 
    default: Provider: virtualbox
==> default: Updating 'ol7' with provider 'virtualbox' from version
==> default: '7.8.0' to '7.8.1'...
==> default: Loading metadata for box 'file:///home/martin/vagrant/boxes/ol7.json'
==> default: Adding box 'ol7' (v7.8.1) for provider: virtualbox
    default: Unpacking necessary files from: file:///home/martin/vagrant/boxes/ol7_7.8.1.box
    default: Calculating and comparing box checksum...
==> default: Successfully added box 'ol7' (v7.8.1) for 'virtualbox'! 

At the end of this exercise both versions are available:

[martin@host versioning]$ vagrant box list | grep ^ol7
ol7               (virtualbox, 7.8.0)
ol7               (virtualbox, 7.8.1)
[martin@host versioning]$  

This is so much better than my previous approach!

What are the effects of box versioning?

Earlier you could read how I created a Vagrant VM based on version 7.8.0 of my box. This VM hasn’t been removed. What happens if I start it up now that there’s a newer version of the ol7 box available?

Bringing machine 'default' up with 'virtualbox' provider...
==> default: Checking if box 'ol7' version '7.8.0' is up to date...
==> default: A newer version of the box 'ol7' is available and already
==> default: installed, but your Vagrant machine is running against
==> default: version '7.8.0'. To update to version '7.8.1',
==> default: destroy and recreate your machine.
==> default: Clearing any previously set forwarded ports...
==> default: Fixed port collision for 22 => 2222. Now on port 2200.
==> default: Clearing any previously set network interfaces...
==> default: Preparing network interfaces based on configuration...
    default: Adapter 1: nat
==> default: Forwarding ports...
    default: 22 (guest) => 2200 (host) (adapter 1)
==> default: Running 'pre-boot' VM customizations...
==> default: Booting VM...
==> default: Waiting for machine to boot. This may take a few minutes...
    default: SSH address: 127.0.0.1:2200
    default: SSH username: vagrant
    default: SSH auth method: private key
==> default: Machine booted and ready!
==> default: Checking for guest additions in VM...
==> default: Setting hostname...
==> default: Mounting shared folders...
    default: /vagrant => /home/martin/vagrant/versioning
==> default: Machine already provisioned. Run `vagrant provision` or use the `--provision`
==> default: flag to force provisioning. Provisioners marked to run always will still run. 

Vagrant tells me that I’m using an old version of the box, and how to switch to the new one. I think I’ll do this eventually, but I can still work with the old version.

And what if I create a new VM? By default, Vagrant creates the new VM based on the latest version of my box, 7.8.1. You can see this here:

Bringing machine 'default' up with 'virtualbox' provider...
==> default: Importing base box 'ol7'...
==> default: Matching MAC address for NAT networking...
==> default: Checking if box 'ol7' version '7.8.1' is up to date...
==> default: Setting the name of the VM: versioning2_default_1588259041745_89693
==> default: Fixed port collision for 22 => 2222. Now on port 2201.
==> default: Clearing any previously set network interfaces...
==> default: Preparing network interfaces based on configuration...
    default: Adapter 1: nat
==> default: Forwarding ports...
    default: 22 (guest) => 2201 (host) (adapter 1)
==> default: Running 'pre-boot' VM customizations...
==> default: Booting VM...
==> default: Waiting for machine to boot. This may take a few minutes...
    default: SSH address: 127.0.0.1:2201
    default: SSH username: vagrant
    default: SSH auth method: private key
==> default: Machine booted and ready!
==> default: Checking for guest additions in VM...
==> default: Setting hostname...
==> default: Mounting shared folders...
    default: /vagrant => /home/martin/vagrant/versioning2 
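
Should a new environment need to stay on the older version, you can pin it in the Vagrantfile. A sketch based on the Vagrantfile from the previous post:

Vagrant.configure("2") do |config|
  config.vm.box = "ol7"
  config.vm.box_url = "file:///home/martin/vagrant/boxes/ol7.json"

  # pin this environment to a specific version of the box
  config.vm.box_version = "7.8.0"
end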

Cleaning up

As with every technology, housekeeping is essential to keep disk usage in check. Refer back to the official documentation for more details on housekeeping and local copies of Vagrant boxes.
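
For example, “vagrant box prune” removes old versions of installed boxes that are no longer in use by an environment (review its output carefully before confirming):

[martin@host versioning]$ vagrant box prune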

Summary

In the past I really struggled maintaining my local Vagrant boxes. Updating a box proved quite tricky and came with undesired side effects. Using versioning as demonstrated in this post is a great way out of this dilemma. And contrary to what I thought for a long time, uploading my boxes to Vagrant Cloud is not needed.

There is of course a lot more to say about versioning, as this feature can do so much more. Maybe I’ll write another post about the subject some other time; until then I kindly refer you to the documentation.

Versioning for your local Vagrant boxes: adding a new box

I have been using Vagrant for quite some time now and can’t tell you how much of a productivity boost it has been. All the VMs I have on my laptop are either powered by Vagrant, or feed into the Vagrant workflow.

One thing I hadn’t worked out though is how to use versioning outside of Vagrant’s cloud. I don’t think I have what it takes to publish a good OS image publicly, and I’d rather keep my boxes to myself to prevent others from injury.

My environment

While putting this post together I used the following software:

  • Ubuntu 20.04 LTS acts as my host operating system
  • Virtualbox 6.1.6
  • Vagrant 2.2.7

This is probably as current as it gets at the time of writing.

The need for box versioning

Vagrant saves you time by providing “gold images” you can spin up quickly. I prefer to always have the latest and greatest software available without having to spend ages updating kernels and/or other components. As a result, I update my “gold image” VM from time to time before packaging it up for Vagrant. Until quite recently I hadn’t figured out how to update a box other than to delete and recreate it. This isn’t the best idea though, as indicated by this error message:

$ vagrant box remove debianbase-slim
Box 'debianbase-slim' (v0) with provider 'virtualbox' appears
to still be in use by at least one Vagrant environment. Removing
the box could corrupt the environment. We recommend destroying
these environments first:

default (ID: ....)

Are you sure you want to remove this box? [y/N] n 

This week I finally sat down and worked out a better way of refreshing my Vagrant boxes.

As I understand it, box versioning allows me to update my base box without having to trash any environments. So instead of removing the box and replacing it with another, I can add a new version to the box. Environments using the old version can continue to do so until they are torn down; new environments can use the new version. This works remarkably well once you know how to set it up! I found a few good sources on the Internet and combined them into this article.

Box versioning for Oracle Linux 7

As an Oracle person I obviously run Oracle Linux a lot. Earlier I came up with a procedure to create my own base boxes. This article features “oraclelinux7base” as the source for my Vagrant boxes. It adheres to all the requirements for Vagrant base boxes to be used with the Virtualbox provider.

Packaging the base box

Once you are happy to release your Virtualbox VM to your host, you have to package it for use with Vagrant. All my Vagrant boxes go to ~/vagrant/boxes, so this command creates the package:

$ vagrant package --base oraclelinux7base --output ~/vagrant/boxes/ol7_7.8.0.box
==> oraclelinux7base: Attempting graceful shutdown of VM...
==> oraclelinux7base: Clearing any previously set forwarded ports...
==> oraclelinux7base: Exporting VM...
==> oraclelinux7base: Compressing package to: /home/martin/vagrant/boxes/ol7_7.8.0.box 

In plain English, this command instructs Vagrant to take Virtualbox’s oraclelinux7base VM and package it into ~/vagrant/boxes/ol7_7.8.0.box. I am creating this box as the first OL 7.8 system; the naming convention seems optional, yet I think it’s best to indicate the purpose and version in the package name.

At this stage, DO NOT “vagrant box add” the box!

Creating box metadata

The next step is to create a little metadata file describing the box. This time it’s not written in YAML, but JSON for a change. I found a few conflicting sources and couldn’t get them to work until I had a look at how Oracle solved the problem. If you navigate to yum.oracle.com/boxes, you can find the links to their metadata files. I really appreciate Oracle changing to using versioning for their boxes, too!

After a little trial and error I came up with this file. It’s probably just the bare minimum, but it works for me in my lab, so I’m happy to keep it the way it is. The file lives in ~/vagrant/boxes alongside the box file itself.

$ cat ol7.json
{
    "name": "ol7",
    "description": "Martins Oracle Linux 7",
    "versions": [
      {
        "version": "7.8.0",
        "providers": [
          {
            "name": "virtualbox",
            "url": "file:///home/martin/vagrant/boxes/ol7_7.8.0.box",
            "checksum": "db048c3d61c0b5a8ddf6b59ab189248a42bf9a5b51ded12b2153e0f9729dfaa4",
            "checksum_type": "sha256"
          }
        ]
      }
    ]
  } 

The file should be self-explanatory. The only noteworthy issue to run into is an insufficient number of forward slashes in the URL: the URI is composed of “file://” followed by the fully qualified path to the box file, three forward slashes in total.

I used “sha256sum /home/martin/vagrant/boxes/ol7_7.8.0.box” to calculate the checksum.
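
For reference:

$ sha256sum /home/martin/vagrant/boxes/ol7_7.8.0.box
db048c3d61c0b5a8ddf6b59ab189248a42bf9a5b51ded12b2153e0f9729dfaa4  /home/martin/vagrant/boxes/ol7_7.8.0.box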

Creating a VM

Finally it’s time to create the VM. I tend to create a directory per Vagrant environment; in this example I called it “versioning”. Within ~/vagrant/versioning I can create a Vagrantfile with the VM’s definition. At this stage, the base box is unknown to Vagrant.

$ nl Vagrantfile 
     1    # -*- mode: ruby -*-
     2    # vi: set ft=ruby :

     3    Vagrant.configure("2") do |config|
     4      config.vm.box = "ol7"
     5      config.vm.box_url = "file:///home/martin/vagrant/boxes/ol7.json"
     6      
     7      config.ssh.private_key_path = '/home/martin/.ssh/vagrantkey'

     8      config.vm.hostname = "server1"

     9      config.vm.provider "virtualbox" do |vb|
    10        vb.cpus = 2
    11        vb.memory = "4096"
    12      end

    13    end
 

The difference to my earlier post is the reference to the JSON file in line 5. The JSON file tells Vagrant where to find the Vagrant box. The remaining configuration isn’t different from using non-versioned Vagrant boxes.

Based on this configuration file I can finally spin up my VM:

$ vagrant up
Bringing machine 'default' up with 'virtualbox' provider...
==> default: Box 'ol7' could not be found. Attempting to find and install...
    default: Box Provider: virtualbox
    default: Box Version: >= 0
==> default: Loading metadata for box 'file:///home/martin/vagrant/boxes/ol7.json'
    default: URL: file:///home/martin/vagrant/boxes/ol7.json
==> default: Adding box 'ol7' (v7.8.0) for provider: virtualbox
    default: Unpacking necessary files from: file:///home/martin/vagrant/boxes/ol7_7.8.0.box
    default: Calculating and comparing box checksum...
==> default: Successfully added box 'ol7' (v7.8.0) for 'virtualbox'!
==> default: Importing base box 'ol7'...
==> default: Matching MAC address for NAT networking...
==> default: Checking if box 'ol7' version '7.8.0' is up to date...
==> default: Setting the name of the VM: versioning_default_1588251635800_49095
==> default: Fixed port collision for 22 => 2222. Now on port 2200.
==> default: Clearing any previously set network interfaces...
==> default: Preparing network interfaces based on configuration...
    default: Adapter 1: nat
==> default: Forwarding ports...
    default: 22 (guest) => 2200 (host) (adapter 1)
==> default: Running 'pre-boot' VM customizations...
==> default: Booting VM...
==> default: Waiting for machine to boot. This may take a few minutes...
    default: SSH address: 127.0.0.1:2200
    default: SSH username: vagrant
    default: SSH auth method: private key
==> default: Machine booted and ready!
==> default: Checking for guest additions in VM...
==> default: Setting hostname...
==> default: Mounting shared folders...
    default: /vagrant => /home/martin/vagrant/versioning 

Right at the beginning you can see that Vagrant loads “metadata for box ‘file:///home/martin/vagrant/boxes/ol7.json'” and then loads the box from the location specified in the JSON file.

Once the machine is started, I can also see it available for future use:

$ vagrant box list | grep ^ol7
ol7               (virtualbox, 7.8.0) 

The box is registered as ol7, using the Virtualbox provider in version 7.8.0.

Summary

In this post I summarised (mainly for my own later use ;) how to use box versioning on my development laptop. It really isn’t that much of a difference compared to the way I worked previously, and the benefit will become apparent once you update the box. I’m going to cover upgrading my “ol7” box in another post.

Fresh install: Autonomous Health Framework for single instance Oracle 19c

I have been deploying Trace File Analyzer (TFA) to every system I’m looking after, whenever possible. Going back in the history of this weblog you’ll find quite a few posts about how you can do so, too. When I wrote these posts back in 2018, TFA was available on its own.

In the meantime TFA has been merged into another, even more powerful tool: Autonomous Health Framework (AHF). Its documentation appears a little scattered over Oracle’s website, and since this post is about Oracle 19c I decided to use the documentation embedded in the 19c database documentation bookshelf. If I missed a more current version, please let me know via my Twitter handle @MartinDBA.

This post covers a fresh installation of AHF 20.1.2 – the current version at the time of writing – on Oracle Linux 7.8/UEK 5 with LANG set to en_US.UTF-8. I have a single instance 19c database, patched to 19.7.0, running on the same host. This is purely an RDBMS deployment; the host does not contain any trace of Grid Infrastructure. I’ll write two more posts covering the installation on Oracle Restart and Real Application Clusters. I’m only covering Oracle 19c as it’s the long-term support release of the 12c Release 2 family.

Warning

AHF, or rather its subcomponents, can easily consume plenty of space. I create my systems with a separate logical volume for /opt, where I install AHF. This way I can’t run out of space on my root file system, a situation to be avoided at all costs. That’s as bad as running out of space on the logical volumes containing the Oracle binaries and the ADR… So please make sure you have enough space for AHF and its components, and separate them from other critical system components!
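
A quick look at the available space before installing doesn’t hurt:

[root@server4 ~]# df -h /opt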

Installation

Before you can install AHF, you need to first download it from My Oracle Support (MOS). AHF requires perl-Data-Dumper and perl-Digest-MD5 to be present on your system.

Downloading AHF

This is pretty straightforward: simply navigate to MOS note 2550798.1 and download the version for your platform. At the time of writing, 20.1.2 was the current release for Linux.

Avoiding problems during the installation

Before you start the installation, you absolutely need to make sure that perl’s Data::Dumper and Digest::MD5 are installed, or you’ll run into nasty trouble during the installation.

[root@server5 ~]# yum install -y perl-Digest-MD5 perl-Data-Dumper
Loaded plugins: ulninfo
Resolving Dependencies
--> Running transaction check
---> Package perl-Data-Dumper.x86_64 0:2.145-3.el7 will be installed
---> Package perl-Digest-MD5.x86_64 0:2.52-3.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

=================================================================================================================
 Package                        Arch                 Version                      Repository                Size
=================================================================================================================
Installing:
 perl-Data-Dumper               x86_64               2.145-3.el7                  ol7_latest                47 k
 perl-Digest-MD5                x86_64               2.52-3.el7                   ol7_latest                29 k

Transaction Summary
=================================================================================================================
Install  2 Packages

Total download size: 76 k
Installed size: 151 k
Downloading packages:
(1/2): perl-Data-Dumper-2.145-3.el7.x86_64.rpm                                            |  47 kB  00:00:00     
(2/2): perl-Digest-MD5-2.52-3.el7.x86_64.rpm                                              |  29 kB  00:00:00     
-----------------------------------------------------------------------------------------------------------------
Total                                                                            426 kB/s |  76 kB  00:00:00     
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Warning: RPMDB altered outside of yum.
  Installing : perl-Digest-MD5-2.52-3.el7.x86_64                                                             1/2 
  Installing : perl-Data-Dumper-2.145-3.el7.x86_64                                                           2/2 
  Verifying  : perl-Data-Dumper-2.145-3.el7.x86_64                                                           1/2 
  Verifying  : perl-Digest-MD5-2.52-3.el7.x86_64                                                             2/2 

Installed:
  perl-Data-Dumper.x86_64 0:2.145-3.el7                    perl-Digest-MD5.x86_64 0:2.52-3.el7                   

Complete!
[root@server5 ~]# 

Install AHF

Once you have finished downloading AHF, copy the zip file to the machine where you want to install it. After unzipping it, you will find the installer (ahf_setup) and a README file. Please make sure you check the README before invoking the installer.
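For illustration, the copy and extract steps might look like this; the zip file name is an assumption based on the 20.1.2 release, and the staging directory is arbitrary:

scp AHF-LINUX_v20.1.2.zip root@server5:/tmp    # file name is an assumption
ssh root@server5
mkdir /tmp/ahf && cd /tmp/ahf
unzip /tmp/AHF-LINUX_v20.1.2.zip               # yields ahf_setup and the README
less README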

The installation can be automated, using Ansible for example. This is the list of options the installer offers:

[root@server5 ahf]# ./ahf_setup -h

AHF Installer for Platform Linux Architecture x86_64

   Usage for ./ahf_setup

   ./ahf_setup [[-ahf_loc ][-data_dir ][-nodes ][-extract[orachk|exachk|-notfasetup] [-force]][-local][-silent][-tmp_loc ][-debug [-level <1-6>]]]

        -ahf_loc          -    Install into the directory supplied. (Default /opt/oracle.ahf)
        -data_dir         -    AHF Data Directory where all the collections, metadata, etc. will be stored 
        -nodes            -    Comma separated Remote Node List 
        -extract          -    Extract only files from Installer. (Default for non-root users)
        -notfasetup       -    Do not Configure TFA when used with -extract
        -force            -    Force the mentioned compliance type with -extract
        -local            -    Only install on the local node
        -silent           -    Do not ask any install questions 
        -tmp_loc          -    Temporary location directory for AHF to extract the install archive to (must exist) (Default /tmp)
        -perlhome         -    Custom location of perl binaries
        -debug            -    Debug AHF Install Script 
        -level            -    AHF Instal Debug Level 1-6 (Default 4 with option -debug) 
                               [FATAL - 1, ERROR - 2, WARNING - 3, INFO - 4, DEBUG - 5, TRACE - 6] 

   Note : 
   1. Without parameters AHF will take you through an interview process for installation 
   2. If -silent option is used, then ensure that you use the -data_dir option, otherwise, the installer script will fail 
   3. The -force option is applicable only when -extract orachk|exachk option is passed else it is ignored. 

Alternatively you can install AHF interactively; I’ll leave that as an exercise for the reader.

I tend to install AHF in /opt; please see the warning above about space usage! All my systems use LVM for /opt, a configuration I’m enforcing via the kickstart file.
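For the curious, the relevant part of such a kickstart file might look like this sketch; the volume group name and sizes are made up for illustration:

# dedicate a logical volume to /opt so AHF can't fill the root file system
part pv.02 --size=10240
volgroup optvg pv.02
logvol /opt --vgname=optvg --name=optlv --fstype=xfs --size=10200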

Oracle recommends installing AHF as root, and I’ll comply with the suggestion:

[root@server4 ahf]# ./ahf_setup -ahf_loc /opt/oracle.ahf -data_dir /opt -silent

AHF Installer for Platform Linux Architecture x86_64

AHF Installation Log : /tmp/ahf_install_24137_2020_04_24-10_04_50.log

Starting Autonomous Health Framework (AHF) Installation

AHF Version: 20.1.2 Build Date: 202004031134

AHF Location : /opt/oracle.ahf

AHF Data Directory : /opt/oracle.ahf/data

Extracting AHF to /opt/oracle.ahf

Configuring TFA Services

Discovering Nodes and Oracle Resources
Successfully generated certificates. 

Starting TFA Services
Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.

.-----------------------------------------------------------------------------.
| Host    | Status of TFA | PID   | Port  | Version    | Build ID             |
+---------+---------------+-------+-------+------------+----------------------+
| server4 | RUNNING       | 25035 | 11631 | 20.1.2.0.0 | 20120020200403113404 |
'---------+---------------+-------+-------+------------+----------------------'

Running TFA Inventory...

Adding default users to TFA Access list...

.-----------------------------------------------------.
|             Summary of AHF Configuration            |
+-----------------+-----------------------------------+
| Parameter       | Value                             |
+-----------------+-----------------------------------+
| AHF Location    | /opt/oracle.ahf                   |
| TFA Location    | /opt/oracle.ahf/tfa               |
| Orachk Location | /opt/oracle.ahf/orachk            |
| Data Directory  | /opt/oracle.ahf/data              |
| Repository      | /opt/oracle.ahf/data/repository   |
| Diag Directory  | /opt/oracle.ahf/data/server4/diag |
'-----------------+-----------------------------------'


Starting orachk daemon from AHF ...

AHF binaries are available in /opt/oracle.ahf/bin

AHF is successfully installed

Moving /tmp/ahf_install_24137_2020_04_24-10_04_50.log to /opt/oracle.ahf/data/server4/diag/ahf/ 

AHF can send notification emails to an address you provide when running the interactive installer. I haven’t seen an option to provide an email address as part of the silent installation, but that’s not a problem as you can always update the configuration later.
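If memory serves, the email address is stored in the notificationAddress property, so something along these lines should do the trick after the installation; please verify against the AHF documentation as I haven’t confirmed the exact syntax here:

tfactl set notificationAddress=dba-team@example.com    # address is an example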

If you are running AHF 20.1.2, OSWatcher won’t start automatically and you need to fix your configuration manually, as described in a separate post.

Once this has completed, you should check if TFA is running.

[oracle@server4 ~]$ tfactl print status

.-----------------------------------------------------------------------------------------------.
| Host    | Status of TFA | PID  | Port  | Version    | Build ID             | Inventory Status |
+---------+---------------+------+-------+------------+----------------------+------------------+
| server4 | RUNNING       | 1986 | 21511 | 20.1.2.0.0 | 20120020200403113404 | COMPLETE         |
'---------+---------------+------+-------+------------+----------------------+------------------'
[oracle@server4 ~]$ 

It looks as if TFA is indeed ready for business. And I have all the support tools at my disposal:

[oracle@server4 ~]$ tfactl toolstatus

.------------------------------------------------------------------.
|                   TOOLS STATUS - HOST : server4                  |
+----------------------+--------------+--------------+-------------+
| Tool Type            | Tool         | Version      | Status      |
+----------------------+--------------+--------------+-------------+
| Development Tools    | orachk       |   19.3.0.0.0 | DEPLOYED    |
|                      | oratop       |       14.1.2 | DEPLOYED    |
+----------------------+--------------+--------------+-------------+
| Support Tools Bundle | darda        | 2.10.0.R6036 | DEPLOYED    |
|                      | oswbb        |        8.3.2 | RUNNING     |
|                      | prw          | 12.1.13.11.4 | NOT RUNNING |
+----------------------+--------------+--------------+-------------+
| TFA Utilities        | alertsummary |   19.3.0.0.0 | DEPLOYED    |
|                      | calog        |   19.3.0.0.0 | DEPLOYED    |
|                      | dbcheck      |   18.3.0.0.0 | DEPLOYED    |
|                      | dbglevel     |   19.3.0.0.0 | DEPLOYED    |
|                      | grep         |   19.3.0.0.0 | DEPLOYED    |
|                      | history      |   19.3.0.0.0 | DEPLOYED    |
|                      | ls           |   19.3.0.0.0 | DEPLOYED    |
|                      | managelogs   |   19.3.0.0.0 | DEPLOYED    |
|                      | menu         |   19.3.0.0.0 | DEPLOYED    |
|                      | param        |   19.3.0.0.0 | DEPLOYED    |
|                      | ps           |   19.3.0.0.0 | DEPLOYED    |
|                      | pstack       |   19.3.0.0.0 | DEPLOYED    |
|                      | summary      |   19.3.0.0.0 | DEPLOYED    |
|                      | tail         |   19.3.0.0.0 | DEPLOYED    |
|                      | triage       |   19.3.0.0.0 | DEPLOYED    |
|                      | vi           |   19.3.0.0.0 | DEPLOYED    |
'----------------------+--------------+--------------+-------------'

Note :-
  DEPLOYED    : Installed and Available - To be configured or run interactively.
  NOT RUNNING : Configured and Available - Currently turned off interactively.
  RUNNING     : Configured and Available. 

I guess that’s it for single instance deployments.

Didn’t you say TFA was present in 12.2 and later?

I did, and it was, at least with 12.2.0.1 single instance. So I was a little bit surprised to see a change in behaviour during the database installation. After invoking runInstaller, you need to run two scripts as root: orainstRoot.sh sets the permissions on the inventory, while root.sh performs database-specific tasks. TFA can potentially be installed when invoking root.sh:

[oracle@server5 ~]$ cat /u01/app/oracle/product/19.0.0/dbhome_1/root.sh
#!/bin/sh
unset WAS_ROOTMACRO_CALL_MADE
. /u01/app/oracle/product/19.0.0/dbhome_1/install/utl/rootmacro.sh "$@"
. /u01/app/oracle/product/19.0.0/dbhome_1/install/utl/rootinstall.sh
/u01/app/oracle/product/19.0.0/dbhome_1/suptools/tfa/release/tfa_home/install/roottfa.sh
/u01/app/oracle/product/19.0.0/dbhome_1/install/root_schagent.sh

#
# Root Actions related to network
#
/u01/app/oracle/product/19.0.0/dbhome_1/network/install/sqlnet/setowner.sh 

#
# Invoke standalone rootadd_rdbms.sh
#
/u01/app/oracle/product/19.0.0/dbhome_1/rdbms/install/rootadd_rdbms.sh

/u01/app/oracle/product/19.0.0/dbhome_1/rdbms/install/rootadd_filemap.sh 

I didn’t build this environment, Ansible did, and all my Ansible code invokes the Oracle tools in silent mode. Invoking root.sh in silent mode merely emits a message at the end, asking you to check the logfile for more information. So I did, and this is what it read:

Performing root user operation.

The following environment variables are set as:
    ORACLE_OWNER= oracle
    ORACLE_HOME=  /u01/app/oracle/product/19.0.0/dbhome_1
   Copying dbhome to /usr/local/bin ...
   Copying oraenv to /usr/local/bin ...
   Copying coraenv to /usr/local/bin ...


Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Oracle Trace File Analyzer (TFA) is available at : /u01/app/oracle/product/19.0.0/dbhome_1/bin/tfactl  

It looks like TFA is available after the database software is deployed. I wanted to see if it could interfere with AHF (at this point AHF wasn’t deployed):

[oracle@server5 ~]$ /u01/app/oracle/product/19.0.0/dbhome_1/bin/tfactl status
TFA-00519 Oracle Trace File Analyzer (TFA) is not installed. 

So despite the message printed by root.sh, TFA doesn’t appear to be operational in an Oracle 19c single instance installation. It only becomes usable once AHF has been deployed.

OSWatcher as included in AHF 20.1.2 fails to start for single instance Oracle

I am about to update my posts detailing the use of Tracefile Analyzer (TFA) now that Oracle has merged it into its Autonomous Health Framework (AHF), and I came across an interesting observation worth blogging about upfront.

After completing a fresh installation of AHF 20.1.2, the current version at the time of writing, I noticed OSWatcher didn’t start on my VM. I am operating a single instance Oracle 19.7.0 database, running on Oracle Linux 7.8/UEK 5. The system does not contain any traces of Grid Infrastructure.

I fully expect this problem to be transient, but until Oracle provides a fix I wanted to share my workaround. I didn’t find this problem covered in the usual sources, including My Oracle Support.

For the record, this is the version I can confirm to be affected:

[root@server4 ~]# /opt/oracle.ahf/bin/tfactl status

.-----------------------------------------------------------------------------------------------.
| Host    | Status of TFA | PID  | Port  | Version    | Build ID             | Inventory Status |
+---------+---------------+------+-------+------------+----------------------+------------------+
| server4 | RUNNING       | 9808 | 10905 | 20.1.2.0.0 | 20120020200403113404 | COMPLETE         |
'---------+---------------+------+-------+------------+----------------------+------------------'
[root@server4 ~]#  

The Problem

Right after ahf_setup completes, I can see that OSWatcher (oswbb) isn’t running:

[oracle@server4 ~]$ tfactl toolstatus

.------------------------------------------------------------------.
|                   TOOLS STATUS - HOST : server4                  |
+----------------------+--------------+--------------+-------------+
| Tool Type            | Tool         | Version      | Status      |
+----------------------+--------------+--------------+-------------+
| Development Tools    | orachk       |   19.3.0.0.0 | DEPLOYED    |
|                      | oratop       |       14.1.2 | DEPLOYED    |
+----------------------+--------------+--------------+-------------+
| Support Tools Bundle | darda        | 2.10.0.R6036 | DEPLOYED    |
|                      | oswbb        |        8.3.2 | NOT RUNNING |
|                      | prw          | 12.1.13.11.4 | NOT RUNNING |
+----------------------+--------------+--------------+-------------+
| TFA Utilities        | alertsummary |   19.3.0.0.0 | DEPLOYED    |
|                      | calog        |   19.3.0.0.0 | DEPLOYED    |
|                      | dbcheck      |   18.3.0.0.0 | DEPLOYED    |
|                      | dbglevel     |   19.3.0.0.0 | DEPLOYED    |
|                      | grep         |   19.3.0.0.0 | DEPLOYED    |
|                      | history      |   19.3.0.0.0 | DEPLOYED    |
|                      | ls           |   19.3.0.0.0 | DEPLOYED    |
|                      | managelogs   |   19.3.0.0.0 | DEPLOYED    |
|                      | menu         |   19.3.0.0.0 | DEPLOYED    |
|                      | param        |   19.3.0.0.0 | DEPLOYED    |
|                      | ps           |   19.3.0.0.0 | DEPLOYED    |
|                      | pstack       |   19.3.0.0.0 | DEPLOYED    |
|                      | summary      |   19.3.0.0.0 | DEPLOYED    |
|                      | tail         |   19.3.0.0.0 | DEPLOYED    |
|                      | triage       |   19.3.0.0.0 | DEPLOYED    |
|                      | vi           |   19.3.0.0.0 | DEPLOYED    |
'----------------------+--------------+--------------+-------------'

Note :-
  DEPLOYED    : Installed and Available - To be configured or run interactively.
  NOT RUNNING : Configured and Available - Currently turned off interactively.
  RUNNING     : Configured and Available. 

[oracle@server4 ~]$ ps -ef | grep -i osw
oracle   14567 13566  0 19:34 pts/3    00:00:00 grep --color=auto -i osw 

Please refrain from simply starting OSWatcher as root, or in any other ad-hoc way for that matter. Doing so can have undesired consequences, as outlined in the Autonomous Health Framework documentation: it can mess up the file system permissions in the OSWatcher archive directory, which is really hard to recover from, and it can also change your configuration.

Troubleshooting

The reason for OSWatcher not starting can be found in the system’s journal:

[root@server4 ~]# systemctl status oracle-tfa
● oracle-tfa.service - Oracle Trace File Analyzer
   Loaded: loaded (/etc/systemd/system/oracle-tfa.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2020-04-23 19:30:29 BST; 8min ago
 Main PID: 9697 (init.tfa)

...

Apr 23 19:36:14 server4 init.tfa[9697]: . . .
Apr 23 19:36:14 server4 init.tfa[9697]: Successfully stopped TFA..
Apr 23 19:36:14 server4 init.tfa[9697]: Starting TFA..
Apr 23 19:36:15 server4 init.tfa[9697]: Starting TFA out of init, Should be running in 10 seconds
Apr 23 19:36:15 server4 init.tfa[9697]: Successfully updated jvmXmx to 128 in TFA...
Apr 23 19:36:22 server4 init.tfa[9697]: OSWatcher is already deployed at /opt/oracle.ahf/tfa/ext/oswbb
Apr 23 19:36:22 server4 init.tfa[9697]: Cannot find valid Non root user to run OSWatcher 

The last line wasn’t expected: “cannot find valid Non root user to run OSWatcher”.

Digging around a bit I noticed that OSWatcher doesn’t recognise the database home properly before firing the error message you just read. To cut a long story short, TFA consults a file named tfa_setup.txt at startup, containing information about the system it’s running on. In my case the file could be found in $TFA_DATA_DIR/$(hostname)/tfa/tfa_setup.txt.
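If you can’t remember which data directory you passed to ahf_setup, a quick find locates the file; the search root below matches my installation:

find /opt/oracle.ahf/data -name tfa_setup.txt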

This is what it looks like right now; where exactly you find it depends on the AHF data directory you specified during the installation.

[root@server4 ~]# cat /opt/oracle.ahf/data/server4/tfa/tfa_setup.txt
INSTALL_TYPE=TYPICAL
PERL=/bin/perl
TFA_BASE=/tmp
JAVA_HOME=/opt/oracle.ahf/jre
DAEMON_OWNER=root
ENV_TYPE=TYPICAL
SIM_ENV_TYPE=none
SIMATP=FALSE
NODE_TYPE=TYPICAL
CRS_ACTIVE_VERSION=
[root@server4 ~]#  

I compared the file with a RAC system where OSWatcher just worked, and noticed the absence of an RDBMS home. I added the RDBMS home entry to the file:

[root@server4 ~]# cat /opt/oracle.ahf/data/server4/tfa/tfa_setup.txt
INSTALL_TYPE=TYPICAL
PERL=/bin/perl
TFA_BASE=/tmp
JAVA_HOME=/opt/oracle.ahf/jre
DAEMON_OWNER=root
ENV_TYPE=TYPICAL
SIM_ENV_TYPE=none
SIMATP=FALSE
NODE_TYPE=TYPICAL
CRS_ACTIVE_VERSION=
RDBMS_ORACLE_HOME=/u01/app/oracle/product/19.0.0/dbhome_1||
[root@server4 ~]#  

After restarting the TFA service, voila: I now have a working OSWatcher configuration!
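Restarting is straightforward thanks to the oracle-tfa unit the installer registered; running tfactl stop followed by tfactl start as root should achieve the same result:

systemctl restart oracle-tfa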

[oracle@server4 ~]$ tfactl toolstatus

.------------------------------------------------------------------.
|                   TOOLS STATUS - HOST : server4                  |
+----------------------+--------------+--------------+-------------+
| Tool Type            | Tool         | Version      | Status      |
+----------------------+--------------+--------------+-------------+
| Development Tools    | orachk       |   19.3.0.0.0 | DEPLOYED    |
|                      | oratop       |       14.1.2 | DEPLOYED    |
+----------------------+--------------+--------------+-------------+
| Support Tools Bundle | darda        | 2.10.0.R6036 | DEPLOYED    |
|                      | oswbb        |        8.3.2 | RUNNING     |
|                      | prw          | 12.1.13.11.4 | NOT RUNNING |
+----------------------+--------------+--------------+-------------+
| TFA Utilities        | alertsummary |   19.3.0.0.0 | DEPLOYED    |
|                      | calog        |   19.3.0.0.0 | DEPLOYED    |
|                      | dbcheck      |   18.3.0.0.0 | DEPLOYED    |
|                      | dbglevel     |   19.3.0.0.0 | DEPLOYED    |
|                      | grep         |   19.3.0.0.0 | DEPLOYED    |
|                      | history      |   19.3.0.0.0 | DEPLOYED    |
|                      | ls           |   19.3.0.0.0 | DEPLOYED    |
|                      | managelogs   |   19.3.0.0.0 | DEPLOYED    |
|                      | menu         |   19.3.0.0.0 | DEPLOYED    |
|                      | param        |   19.3.0.0.0 | DEPLOYED    |
|                      | ps           |   19.3.0.0.0 | DEPLOYED    |
|                      | pstack       |   19.3.0.0.0 | DEPLOYED    |
|                      | summary      |   19.3.0.0.0 | DEPLOYED    |
|                      | tail         |   19.3.0.0.0 | DEPLOYED    |
|                      | triage       |   19.3.0.0.0 | DEPLOYED    |
|                      | vi           |   19.3.0.0.0 | DEPLOYED    |
'----------------------+--------------+--------------+-------------'

Note :-
  DEPLOYED    : Installed and Available - To be configured or run interactively.
  NOT RUNNING : Configured and Available - Currently turned off interactively.
  RUNNING     : Configured and Available.

[oracle@server4 ~]$ ps -ef | grep -i osw
oracle   19770     1  0 19:43 ?        00:00:00 /bin/sh ./OSWatcher.sh 30 48 NONE /opt/oracle.ahf/data/repository/suptools/server4/oswbb/oracle/archive
oracle   19944 19770  0 19:44 ?        00:00:00 /bin/sh ./OSWatcherFM.sh 48 /opt/oracle.ahf/data/repository/suptools/server4/oswbb/oracle/archive
oracle   22672 19770  0 19:47 ?        00:00:00 /bin/sh ./vmsub.sh /opt/oracle.ahf/data/repository/suptools/server4/oswbb/oracle/archive/oswvmstat/server4_vmstat_20.04.23.1900.dat vmstat 1 3 1
oracle   22674 19770  0 19:47 ?        00:00:00 /bin/sh ./mpsub.sh /opt/oracle.ahf/data/repository/suptools/server4/oswbb/oracle/archive/oswmpstat/server4_mpstat_20.04.23.1900.dat mpstat -P ALL 1 2 1
oracle   22682 19770  0 19:47 ?        00:00:00 /bin/sh ./iosub.sh /opt/oracle.ahf/data/repository/suptools/server4/oswbb/oracle/archive/oswiostat/server4_iostat_20.04.23.1900.dat iostat -xk 1 3 1
oracle   22686 19770  0 19:47 ?        00:00:00 /bin/sh ./piddsub.sh /opt/oracle.ahf/data/repository/suptools/server4/oswbb/oracle/archive/oswpidstatd/server4_pidstatd_20.04.23.1900.dat  1
oracle   22692 19770  0 19:47 ?        00:00:00 /bin/sh ./pidsub.sh /opt/oracle.ahf/data/repository/suptools/server4/oswbb/oracle/archive/oswpidstat/server4_pidstat_20.04.23.1900.dat pidstat 1
oracle   22695 19770  0 19:47 ?        00:00:00 /bin/sh ./nfssub.sh /opt/oracle.ahf/data/repository/suptools/server4/oswbb/oracle/archive/oswnfsiostat/server4_nfsiostat_20.04.23.1900.dat nfsiostat 1 3 1
oracle   22705 19770  0 19:47 ?        00:00:00 /bin/sh ./xtop.sh /opt/oracle.ahf/data/repository/suptools/server4/oswbb/oracle/archive/oswtop/server4_top_20.04.23.1900.dat 1
oracle   22730 13566  0 19:47 pts/3    00:00:00 grep --color=auto -i osw
[oracle@server4 ~]$  

I don’t know what went wrong with my environment, but I do know this happens consistently with my single instance deployments. I’ll raise a service request with support to see if they can get the problem fixed.

Happy Troubleshooting!

Creating a new disk group for use with ASM Filter Driver on the command line in Oracle 19c

In my previous post I shared my surprise when I learned that calling gridSetup.sh 19c for use with Oracle ASM Filter Driver (ASMFD) required me to specify the names of the native block devices. This is definitely different from installing ASM with ASMLib where you pass ASM disks as “ORCL:diskname” to the installer.

Um, that’s great, but why did I write this post? Well, once the installation/configuration steps are completed you most likely need to create at least a second disk group. In my case that’s going to be RECO, for use with the Fast Recovery Area (FRA). This post details the necessary steps to get there, as they are different compared to the initial call to gridSetup.sh.

And while I might sound like a broken record, I would like to remind you that I’m not endorsing ASM Filter Driver. I merely found the documentation unclear in some places, and this post hopefully clarifies certain aspects around the use of ASMFD. Please remember that ASMFD is new-ish technology and it’s up to every user to apply industry best-known methods to ensure everything works as expected.

My environment

The lab environment hasn’t changed: I’m still using the same Oracle Linux 7.7 KVM VM I prepared for the last post. Storage is still made accessible via the virtio driver, and the VM boots into the Red Hat kernel.

Previously I installed the base release, Oracle Restart 19.3.0. Quite a few issues have been addressed in the Release Updates (RUs) made available since. To keep up with the latest fixes, my system has been patched to 19.6.0, the current RU at the time of writing.

Creating a new disk group

Since I’m using Ansible for most things these days I had to come up with a straightforward method of creating a disk group. ASM has shipped with the ASM Configuration Assistant (asmca) for quite a while now, and it can be used to create a disk group in a simple, elegant call (link to documentation). I could of course have created the disk group in sqlplus, but that would have required a lot more typing, and I’m inherently lazy.

Unlike the initial call to gridSetup.sh, where you pass native block devices along with a request to configure ASMFD, creating an additional disk group requires you to label the disks beforehand. This is pretty trivial and, more importantly, easy to automate with Ansible, as the sketch below shows.
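To give you an idea of what that automation might look like, here is a minimal Ansible task sketch for the labeling step shown manually below. Paths, labels and devices are from my lab, and the task isn’t idempotent as written:

- name: label ASM disks for ASMFD (sketch, adjust labels and devices)
  become: yes
  environment:
    ORACLE_HOME: /u01/app/grid/product/19.0.0/grid
    ORACLE_BASE: /u01/app/grid
  command: /u01/app/grid/product/19.0.0/grid/bin/asmcmd afd_label "{{ item.label }}" "{{ item.device }}"
  loop:
    - { label: RECO1, device: /dev/vde1 }
    - { label: RECO2, device: /dev/vdf1 }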

Labeling the disks

As per my earlier post, I’m planning on using /dev/vde1 and /dev/vdf1 for RECO. The first step is to label the disks. The call is similar to the one you read about earlier:

[root@server4 ~]# . oraenv
ORACLE_SID = [+ASM] ? +ASM
The Oracle base remains unchanged with value /u01/app/grid
[root@server4 ~]# asmcmd afd_label RECO1 /dev/vde1
[root@server4 ~]# asmcmd afd_label RECO2 /dev/vdf1
[root@server4 ~]# asmcmd afd_lslbl
--------------------------------------------------------------------------------
Label                     Duplicate  Path
================================================================================
DATA1                                 /dev/vdc1
DATA2                                 /dev/vdd1
RECO1                                 /dev/vde1
RECO2                                 /dev/vdf1
[root@server4 ~]#  

Note the absence of the "--init" flag when invoking asmcmd afd_label. The way I understand it, this flag is used only during the initial installation.
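If you would like to double-check the state of the filter driver before creating the disk group, asmcmd can report it as well; with the grid environment sourced, afd_state should tell you whether the driver is loaded and filtering:

asmcmd afd_state    # expect a message stating AFD is loaded and filtering is enabled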

Creating the disk group

Once the disks are labeled, you can create the disk group. Using the documentation reference I shared earlier I ended up with this call to asmca:

[grid@server4 ~]$ asmca -silent \
> -createDiskGroup -diskGroupName RECO \
> -disk 'AFD:RECO*' -redundancy EXTERNAL \
> -au_size 4 -compatible.asm 19.0.0 -compatible.rdbms 19.0.0

[DBT-30001] Disk groups created successfully. Check /u01/app/grid/cfgtoollogs/asmca/asmca-200402AM115509.log for details.

[grid@server4 ~]$  

Thanks to ASMFD I don’t have to specify individual disks: I can simply tell it to use all disks whose label matches RECO* – RECO1 and RECO2 in this example. The actual number of ASM disks doesn’t matter with this call, which again helps me automate the process.

Since this environment is used exclusively for Oracle 19c, I can safely set the compatibility to 19c for both ASM and the database. Refer to the ASM documentation for further information about the disk group compatibility properties.

Verification

The output of the command indicates success, so let’s have a look at the ASM configuration:

SQL> select d.name, d.path, d.library, dg.name
  2  from v$asm_disk d left join v$asm_diskgroup dg on (dg.group_number = d.group_number)
  3  where dg.name = 'RECO';

NAME       PATH            LIBRARY                                                      NAME
---------- --------------- ------------------------------------------------------------ ----------
RECO1      AFD:RECO1       AFD Library - Generic , version 3 (KABI_V3)                  RECO
RECO2      AFD:RECO2       AFD Library - Generic , version 3 (KABI_V3)                  RECO

SQL> select d.name, dg.name, dg.compatibility, dg.database_compatibility
  2  from v$asm_disk d left join v$asm_diskgroup dg on (dg.group_number = d.group_number)
  3  where dg.name = 'RECO';

NAME       NAME       COMPATIBILITY        DATABASE_COMPATIBILI
---------- ---------- -------------------- --------------------
RECO1      RECO       19.0.0.0.0           19.0.0.0.0
RECO2      RECO       19.0.0.0.0           19.0.0.0.0 

This seems to have worked. I can also see the disk groups registered in Clusterware:

[grid@server4 ~]$ crsctl stat res -t -w "TYPE == ora.diskgroup.type"
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       server4                  STABLE
ora.RECO.dg
               ONLINE  ONLINE       server4                  STABLE
--------------------------------------------------------------------------------
[grid@server4 ~]$ 

Silent installation: Oracle Restart 19c, ASM Filter Driver, RHCK edition

As promised in the earlier post, here are my notes about installing Oracle Restart 19c on Oracle Linux 7.7 using the Red Hat Compatible Kernel (RHCK). Please consult the ACFS/ASMFD compatibility matrix (My Oracle Support DocID 1369107.1) for the latest information about ASMFD compatibility with various kernels.

Why am I starting the series with a seemingly “odd” kernel, at least from the point of view of Oracle Linux? If you try to install the Oracle Restart base release with UEK 5, you get strange error messages back from gridSetup telling you about invalid ASM disks. While that’s probably true, it’s a secondary error. The main cause of the problem is this:

[root@server5 bin]# ./afddriverstate supported
AFD-620: AFD is not supported on this operating system version: '4.14.35-1902.300.11.el7uek.x86_64'
AFD-9201: Not Supported
AFD-9294: updating file /etc/sysconfig/oracledrivers.conf 

This mistake is easy to make since gridSetup.sh doesn’t validate kernel support for you when running in silent mode; the GUI version of the installer protects you from it, though. Upgrading to the latest UEK 5 doesn’t change the message either: you need to check the certification matrix to learn that Oracle Restart 19.4.0 or later is required if you’d like to use ASMFD (or ACFS for that matter) on UEK 5. This scenario will be covered in a later post.

Using the Red Hat Compatible Kernel alleviates this problem for me. Just be aware of the usual caveats when using the Red Hat kernel on Oracle Linux, such as YUM changing the default kernel during yum upgrade (see the quick check below). I’d also like to reiterate that this post isn’t an endorsement of ASM Filter Driver, but since the documentation was a little unclear I thought I’d write up how I got to a working installation. It is up to you to ensure that ASMFD is a workable solution for your environment by following industry best-known practices.
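A quick check worth remembering: /etc/sysconfig/kernel controls which kernel becomes the default when a new one is installed. With DEFAULTKERNEL set to kernel (rather than kernel-uek), RHCK should remain the default after an upgrade; the expected content is sketched in the comments:

cat /etc/sysconfig/kernel
# UPDATEDEFAULT=yes
# DEFAULTKERNEL=kernel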

Configuration Options

In the post introducing this series I claimed to have identified 2 options for installing Oracle Restart 19c using ASMFD: the first one is to use UDEV to prepare ASM block devices, the second one is to label the ASM disks using asmcmd afd_label.

Huh, UDEV? That hasn’t really been blogged about at all in the context of ASMFD, or at least I didn’t find anyone who has. I’m inferring the possibility of using UDEV from “Configuring Oracle ASM Filter Driver During Installation” (link to documentation):

“If you do not use udev on the system where the Oracle Grid Infrastructure is installed, then you can also complete the following procedure to provision disks for Oracle ASMFD before the installer is launched”
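In case you want to explore the UDEV route, a rule along these lines is the usual shape. This is a sketch I haven’t validated in this environment; device names and ownership are taken from my lab setup:

# /etc/udev/rules.d/99-oracle-asmdevices.rules (sketch)
KERNEL=="vdc1", OWNER="grid", GROUP="asmadmin", MODE="0660"
KERNEL=="vdd1", OWNER="grid", GROUP="asmadmin", MODE="0660"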

You actually only have to choose one of them. Let’s start with the more frequently covered approach of labelling disks using asmcmd.

My environment

I have applied all patches available up to March 26th to my lab environment. The Oracle Linux release I’m using is 7.7:

[root@server4 ~]# cat /etc/oracle-release
Oracle Linux Server release 7.7 

The KVM VM I’m using for this blog post uses the latest Red Hat Compatible Kernel at the time of writing (kernel-3.10.0-1062.18.1.el7.x86_64). You will notice that I’m using the virtio driver, leading to “strange” device names: instead of /dev/sd* they show up as /dev/vd*. My first two block devices are reserved for the O/S and Oracle, the remaining ones are going to be used for ASM. As you might notice, I have an old (bad?) habit of partitioning block devices for ASM. Most of the Oracle setup is done by the 19c preinstall RPM, which I used.

I created a grid owner – grid – to own the Oracle Restart installation. Quite a few blog posts I came across referenced the grid user’s group memberships, and I’d like to use the same:

[root@server4 ~]# id -a grid 
uid=54322(grid) gid=54321(oinstall) groups=54321(oinstall),54322(dba),54328(asmadmin),54327(asmdba) 

The block devices I’m intending to use for ASM are /dev/vdc to /dev/vdf – the first 2 are intended for +DATA, the other 2 will become part of +RECO. As you can see they are partitioned:

[root@server4 ~]# lsblk --ascii
NAME                  MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
vdf                   251:80   0   10G  0 disk 
`-vdf1                251:81   0   10G  0 part 
vdd                   251:48   0   10G  0 disk 
`-vdd1                251:49   0   10G  0 part 
vdb                   251:16   0   50G  0 disk 
`-vdb1                251:17   0   50G  0 part 
  `-oraclevg-orabinlv 252:2    0   50G  0 lvm  /u01
sr0                    11:0    1 1024M  0 rom  
vde                   251:64   0   10G  0 disk 
`-vde1                251:65   0   10G  0 part 
vdc                   251:32   0   10G  0 disk 
`-vdc1                251:33   0   10G  0 part 
vda                   251:0    0   12G  0 disk 
|-vda2                251:2    0 11.5G  0 part 
| |-rootvg-swaplv     252:1    0  768M  0 lvm  [SWAP]
| `-rootvg-rootlv     252:0    0 10.8G  0 lvm  /
`-vda1                251:1    0  500M  0 part /boot  
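Should you want to replicate the setup, creating a single partition spanning each ASM disk is easily scripted; a minimal sketch using parted (device names obviously specific to my VM):

for dev in /dev/vdc /dev/vdd /dev/vde /dev/vdf; do
    parted -s "$dev" mklabel msdos
    parted -s "$dev" mkpart primary 1MiB 100%
done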

With all that out of the way it is time to cover the installation.

Labeling disks

I’m following the procedure documented in the 19c Administrator’s Guide chapter 20, section “Configuring Oracle ASM Filter Driver During Installation”. I have prepared my environment up to the step where I’d have to launch gridSetup.sh. This is a fairly well known process, and I won’t repeat it here.

Once the 19c install image has been extracted to my future Grid Home, the first step is to check if my system is supported:

[root@server4 ~]# cd /u01/app/grid/product/19.0.0/grid/bin
[root@server4 bin]# ./afddriverstate supported
AFD-9200: Supported 
[root@server4 bin]# uname -r
3.10.0-1062.18.1.el7.x86_64 

“AFD-9200: Supported” tells me that I can start labeling disks. This requires me to be root, and I have to set ORACLE_HOME and ORACLE_BASE. For some reason, the documentation suggests using /tmp as ORACLE_BASE, which I’ll use as well:

[root@server4 bin]# pwd
/u01/app/grid/product/19.0.0/grid/bin
[root@server4 bin]# export ORACLE_BASE=/tmp
[root@server4 bin]# export ORACLE_HOME=/u01/app/grid/product/19.0.0/grid
[root@server4 bin]# ./asmcmd afd_label DATA1 /dev/vdc1 --init
[root@server4 bin]# ./asmcmd afd_label DATA2 /dev/vdd1 --init 

[root@server4 bin]# ./asmcmd afd_lslbl /dev/vdc1
--------------------------------------------------------------------------------
Label                     Duplicate  Path
================================================================================
DATA1                                 /dev/vdc1

[root@server4 bin]# ./asmcmd afd_lslbl /dev/vdd1
--------------------------------------------------------------------------------
Label                     Duplicate  Path
================================================================================
DATA2                                 /dev/vdd1  

Note the use of the --init flag. This is only needed if Grid Infrastructure isn’t installed yet.

Labeling the disks did not have an effect on the block devices’ permissions. Right after finishing the 2 calls to label my 2 block devices, this is the output from my file system:

[root@server4 bin]# ls -l /dev/vd[c-d]*
brw-rw----. 1 root disk 252, 32 Mar 27 09:46 /dev/vdc
brw-rw----. 1 root disk 252, 33 Mar 27 12:55 /dev/vdc1
brw-rw----. 1 root disk 252, 48 Mar 27 09:46 /dev/vdd
brw-rw----. 1 root disk 252, 49 Mar 27 12:58 /dev/vdd1
[root@server4 bin]#  

The output of afd_lslbl indicated that both of my disks are ready to become part of an ASM disk group, so let’s start the installer.

Call gridSetup.sh

I wasn’t able to make sense of the options in the response file until I started the installer in GUI mode and created a response file based on my choices. To cut a long story short, here is my call to gridSetup.sh:

[grid@server4 ~]$ /u01/app/grid/product/19.0.0/grid/gridSetup.sh -silent \
> INVENTORY_LOCATION=/u01/app/oraInventory \
> SELECTED_LANGUAGES=en \
> ORACLE_BASE=/u01/app/grid \
> ORACLE_HOME_NAME=ASMFD_RHCK \
> -waitforcompletion -ignorePrereqFailure -lenientInstallMode \
> oracle.install.option=HA_CONFIG \
> oracle.install.asm.OSDBA=asmdba \
> oracle.install.asm.OSASM=asmadmin \
> oracle.install.asm.diskGroup.name=DATA \
> oracle.install.asm.diskGroup.disks=/dev/vdc1,/dev/vdd1 \
> oracle.install.asm.diskGroup.diskDiscoveryString=/dev/vd* \
> oracle.install.asm.diskGroup.redundancy=EXTERNAL \
> oracle.install.asm.diskGroup.AUSize=4 \
> oracle.install.asm.configureAFD=true \
> oracle.install.crs.rootconfig.executeRootScript=false \
> oracle.install.asm.SYSASMPassword=thinkOfASuperSecretPassword \
> oracle.install.asm.monitorPassword=thinkOfASuperSecretPassword
Launching Oracle Grid Infrastructure Setup Wizard...

The response file for this session can be found at:
 /u01/app/grid/product/19.0.0/grid/install/response/grid_2020-03-27_01-06-14PM.rsp

You can find the log of this install session at:
 /tmp/GridSetupActions2020-03-27_01-06-14PM/gridSetupActions2020-03-27_01-06-14PM.log

As a root user, execute the following script(s):
        1. /u01/app/oraInventory/orainstRoot.sh
        2. /u01/app/grid/product/19.0.0/grid/root.sh

Execute /u01/app/grid/product/19.0.0/grid/root.sh on the following nodes:
[server4]

Successfully Setup Software.
As install user, execute the following command to complete the configuration.
/u01/app/grid/product/19.0.0/grid/gridSetup.sh -executeConfigTools -responseFile /u01/app/grid/product/19.0.0/grid/install/response/grid_2020-03-27_01-06-14PM.rsp [-silent]
Note: The required passwords need to be included in the response file.
Moved the install session logs to:
/u01/app/oraInventory/logs/GridSetupActions2020-03-27_01-06-14PM
[grid@server4 ~]$

It took a little while to work out that, despite labeling the disks for ASMFD, I didn’t have to put any reference to AFD into the call to gridSetup.sh. Have a look at the ASM disk string and the block devices: they are what I’d use if I were relying on UDEV rules for device name persistence. The syntax might appear counter-intuitive; the key is the configureAFD flag, which you need to set to true.

Since this is a lab environment I’m ok with external redundancy. Make sure you pick a redundancy level appropriate for your use case.

Running the configuration tools

The remaining steps are identical to a non-ASMFD setup: first you run orainstRoot.sh, followed by root.sh. The output of the latter indicated success for me:

[root@server4 ~]# /u01/app/grid/product/19.0.0/grid/root.sh
Check /u01/app/grid/product/19.0.0/grid/install/root_server4_2020-03-27_13-11-05-865019723.log for the output of root script

[root@server4 ~]#
[root@server4 ~]# cat /u01/app/grid/product/19.0.0/grid/install/root_server4_2020-03-27_13-11-05-865019723.log
Performing root user operation.

The following environment variables are set as:
    ORACLE_OWNER= grid
    ORACLE_HOME=  /u01/app/grid/product/19.0.0/grid
   Copying dbhome to /usr/local/bin ...
   Copying oraenv to /usr/local/bin ...
   Copying coraenv to /usr/local/bin ...


Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/grid/product/19.0.0/grid/crs/install/crsconfig_params
The log of current session can be found at:
  /u01/app/grid/crsdata/server4/crsconfig/roothas_2020-03-27_01-11-06PM.log
2020/03/27 13:11:13 CLSRSC-363: User ignored prerequisites during installation
LOCAL ADD MODE
Creating OCR keys for user 'grid', privgrp 'oinstall'..
Operation successful.
LOCAL ONLY MODE
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
CRS-4664: Node server4 successfully pinned.
2020/03/27 13:13:55 CLSRSC-330: Adding Clusterware entries to file 'oracle-ohasd.service'

server4     2020/03/27 13:16:59     /u01/app/grid/crsdata/server4/olr/backup_20200327_131659.olr     724960844
2020/03/27 13:17:54 CLSRSC-327: Successfully configured Oracle Restart for a standalone server
[root@server4 ~]# 

Well, that looks ok. Now on to the final step: configuration! As indicated in the output, you need to update the response file (/u01/app/grid/product/19.0.0/grid/install/response/grid_2020-03-27_01-06-14PM.rsp) with the required passwords. For me that meant oracle.install.asm.monitorPassword and oracle.install.asm.SYSASMPassword. Once the response file was updated, I called gridSetup.sh once again:

[grid@server4 ~]$ /u01/app/grid/product/19.0.0/grid/gridSetup.sh -executeConfigTools -responseFile /u01/app/grid/product/19.0.0/grid/install/response/grid_2020-03-27_01-06-14PM.rsp -silent
Launching Oracle Grid Infrastructure Setup Wizard...

You can find the logs of this session at:
/u01/app/oraInventory/logs/GridSetupActions2020-03-27_01-20-47PM

You can find the log of this install session at:
 /u01/app/oraInventory/logs/UpdateNodeList2020-03-27_01-20-47PM.log
Successfully Configured Software. 

And that’s it! The software has been configured successfully. Don’t forget to remove the passwords from the response file!
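A quick sed takes care of blanking the password entries again; please double-check the result afterwards:

sed -i -E 's/^(oracle\.install\.asm\.(SYSASM|monitor)Password)=.*/\1=/' \
    /u01/app/grid/product/19.0.0/grid/install/response/grid_2020-03-27_01-06-14PM.rsp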

Verification

After a little while I have been able to configure Oracle Restart 19c/ASMFD on Oracle Linux 7.7/RHCK. Let’s check what this implies.

I’ll first look at the status of ASM Filter Driver:

[grid@server4 ~]$ . oraenv
ORACLE_SID = [grid] ? +ASM
The Oracle base has been set to /u01/app/grid
[grid@server4 ~]$ afddriverstate installed
AFD-9203: AFD device driver installed status: 'true'
[grid@server4 ~]$ afddriverstate loaded
AFD-9205: AFD device driver loaded status: 'true'
[grid@server4 ~]$ afddriverstate version
AFD-9325:     Driver OS kernel version = 3.10.0-862.el7.x86_64.
AFD-9326:     Driver build number = 190222.
AFD-9212:     Driver build version = 19.0.0.0.0.
AFD-9547:     Driver available build number = 190222.
AFD-9548:     Driver available build version = 19.0.0.0.0.
[grid@server4 ~]$  

That’s encouraging: ASMFD is loaded and works on top of kernel-3.10 (RHCK).

I am indeed using the base release (and will have to patch now!):

[grid@server4 ~]$ $ORACLE_HOME/OPatch/opatch lspatches
29585399;OCW RELEASE UPDATE 19.3.0.0.0 (29585399)
29517247;ACFS RELEASE UPDATE 19.3.0.0.0 (29517247)
29517242;Database Release Update : 19.3.0.0.190416 (29517242)
29401763;TOMCAT RELEASE UPDATE 19.0.0.0.0 (29401763)

OPatch succeeded. 

And … I’m also using ASMFD:

SQL> col name for a20
SQL> col path for a10
SQL> col library for a50
SQL> set lines 120
SQL> select name, path, library from v$asm_disk where group_number <> 0;

NAME                 PATH       LIBRARY
-------------------- ---------- --------------------------------------------------
DATA1                AFD:DATA1  AFD Library - Generic , version 3 (KABI_V3)
DATA2                AFD:DATA2  AFD Library - Generic , version 3 (KABI_V3)

SQL> show parameter asm_diskstring

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
asm_diskstring                       string      /dev/vd*, AFD:*
SQL>  

This concludes the setup of my lab environment.

Vagrant tips’n’tricks: changing /etc/hosts automatically for Oracle Universal Installer

Oracle Universal Installer, or OUI for short, doesn’t at all like it if the hostname resolves to an IP address in the 127.0.0.0/8 range. At best it complains; at worst it starts installing and configuring software, only to abort and bury the real cause deep in the logs.
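A quick way to check what the installer will see is to resolve the hostname before you start:

getent hosts "$(hostname)"    # an address in 127.0.0.0/8 here spells trouble for OUI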

I am a great fan of HashiCorp’s Vagrant as you might have guessed reading some of the previous articles, and as such wanted a scripted solution to changing the hostname to something more sensible before I begin provisioning software. I should probably add that I’m using my own base boxes; the techniques in this post should equally apply to other boxes as well.

Each of the Vagrant VMs I’m creating is given a private network for communication with its peers, mainly to save me from having to deal with port forwarding on the NAT device. If you haven’t used Vagrant before you might not know that, by default, each Vagrant VM comes up with a single NIC that uses NAT. The end goal for this post is to ensure that the VM’s hostname maps to the private network’s IP address, not 127.0.0.1 as it normally would.

Setting the scene

By default, Vagrant doesn’t seem to mess with the hostname of the VM. This can be changed by using a configuration variable. Let’s start with the Vagrantfile for my Oracle Linux 7 box:

# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant.configure("2") do |config|
  config.vm.define "ol7guest" do |u|
    # this is a base box I created and stored locally
    u.vm.box = "oracleLinux7Base"

    u.ssh.private_key_path = "/path/to/key"

    u.vm.hostname = "ol7guest"
    u.vm.network "private_network", ip: "192.168.56.204"

    u.vm.provider "virtualbox" do |v|
      v.memory = 2048
      v.name = "ol7guest"
      v.cpus = 1
    end
  end
end 

Please ignore the fact that my Vagrantfile is slightly more complex than it needs to be. I do like having speaking names for my VMs, rather than “default” showing up in vagrant status. Using this terminology in the Vagrantfile also makes it easier to add more VMs to the configuration should the need arise.

Apart from what you just read, the only remarkable thing about this file is this line:

    u.vm.hostname = "ol7guest"

As per the Vagrant documentation, I can use this directive to set the hostname of the VM. And indeed, it does:

$ vagrant ssh ol7guest
Last login: Thu Jan 09 21:14:59 2020 from 10.0.2.2
[vagrant@ol7guest ~]$  

The hostname is set; however, it resolves to 127.0.0.1 as per /etc/hosts:

[vagrant@ol7guest ~]$ cat /etc/hosts
127.0.0.1    ol7guest    ol7guest
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6 

Not quite what I had in mind, but apparently expected behaviour. So the next step is to change the first line in /etc/hosts to match the private IP address I assigned to the second NIC. As an Ansible fan I naturally lean towards using a playbook, but I also understand that not everyone has Ansible installed on the host, and the ansible_local provisioner might take longer than necessary unless your box has Ansible pre-installed.

The remainder of this post deals with an Ansible solution and the least common denominator, the shell provisioner.

Using an Ansible playbook

Many times I’m using Ansible playbooks to deploy software to Vagrant VMs anyway, so embedding a little piece of code into my playbooks to change /etc/hosts isn’t a lot of work. The first step is to amend the Vagrantfile to reference the Ansible provisioner. One possible way to do this in the context of my example is this:

# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant.configure("2") do |config|
  config.vm.define "ol7guest" do |u|
    # this is a base box I created and stored locally
    u.vm.box = "oracleLinux7Base"

    u.ssh.private_key_path = "/path/to/key"

    u.vm.hostname = "ol7guest"
    u.vm.network "private_network", ip: "192.168.56.204"

    u.vm.provision "ansible" do |ansible|
      ansible.playbook = "change_etc_hosts.yml"
      ansible.verbose = "v"
    end

    u.vm.provider "virtualbox" do |v|
      v.memory = 2048
      v.name = "ol7guest"
      v.cpus = 1
    end
  end
end  

It is mostly the same file with the addition of the call to Ansible. As you can imagine the playbook is rather simple:

---
- hosts: ol7guest
  become: yes
  tasks:
  - name: change /etc/hosts
    lineinfile:
      path: '/etc/hosts'
      regexp: '.*ol7guest.*' 
      line: '192.168.56.204   ol7guest.example.com   ol7guest' 
      backup: yes

It uses the lineinfile module to find the line containing ol7guest and replace it with the “correct” IP address. The resulting hosts file is exactly what I need:

[vagrant@ol7guest ~]$ cat /etc/hosts
192.168.56.204   ol7guest.example.com   ol7guest
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
[vagrant@ol7guest ~]$ 

The first line of the original file has been replaced with the private IP which should enable OUI to progress past this potential stumbling block.

Using the shell provisioner

The second solution involves the shell provisioner, which – unlike Ansible – isn’t distribution agnostic and needs to be tailored to the target platform. On Oracle Linux, the following worked for me:

# -*- mode: ruby -*-
# vi: set ft=ruby :

$script = <<-SCRIPT
/usr/bin/cp /etc/hosts /root && \
/usr/bin/sed -ie '/ol7guest/d' /etc/hosts && \
/usr/bin/echo '192.168.56.204 ol7guest.example.com ol7guest' >> /etc/hosts
SCRIPT

Vagrant.configure("2") do |config|
  config.vm.define "ol7guest" do |u|
    # this is a base box I created and stored locally
    u.vm.box = "oracleLinux7Base"

    u.ssh.private_key_path = "/path/to/key"

    u.vm.hostname = "ol7guest"
    u.vm.network "private_network", ip: "192.168.56.204"

    u.vm.provision "shell", inline: $script

    u.vm.provider "virtualbox" do |v|
      v.memory = 2048
      v.name = "ol7guest"
      v.cpus = 1
    end
  end
end 

The script takes a backup copy of /etc/hosts in root’s home directory, removes the line referencing ol7guest, and appends the correct mapping. At the end, the file is in exactly the shape I need it to be in.

Summary

Whether you go with the shell provisioner or embed the change to the hostname in an (existing) Ansible playbook doesn’t matter much. I would definitely argue in favour of embedding the code in a playbook if that’s what provisions additional software anyway. If installing Ansible on the host isn’t an option, using the shell as a fallback mechanism is perfectly fine, too. Happy hacking!