I have been assessing the “best” storage option for Oracle databases deployed on Infrastructure as a Service (IaaS) VMs quite a few times now. As part of the discussion DBAs often pose the question whether using Linux’s Logical Volume Manager (LVM) is a good choice for data files.
It can be, depending on the circumstances, but there are a few things to consider. This short series of posts tries to highlight some of them. Should I find the time I’ll write another post about using Oracle Automatic Storage Management (ASM) and how it’s different from LVM.
- Deploying I/O intensive workloads in the cloud: don’t fall for the LVM trap
- Deploying I/O intensive workloads in the cloud: LVM RAID
- Deploying I/O intensive workloads in the cloud: mdadm (aka Software) RAID
Please note this article does not concern itself with the durability of block devices in the cloud. I’ll leave it at this: in the cloud, you have a lot less control over the block devices you get, so make sure you have appropriate protection methods in place to guarantee your databases’ RTO and RPO.
LVM basics: a quick refresher
The main point of LVM as I see it is to dynamically grow file systems. Instead of creating file systems on partitioned block devices, with LVM you create them on top of Logical Volumes. To work with LVM you need to aggregate one or more so-called Physical Volumes (usually partitions on a block device) into a Volume Group. Once that’s done you can carve out Logical Volumes from the Volume Group. File systems such as ext4 and xfs are created on top of the Logical Volume (LV) just as you would on block devices:
+-------------------------------------------------------------+
|        LV1         |        LV2         |       free        |
+-------------------------------------------------------------+
|                            VG1                              |
+-------------------------------------------------------------+
|        PV1         |        PV2         |        PV3        |
+-------------------------------------------------------------+
Logical Volumes can be resized as long as there is free space in the Volume Group. Should the Volume Group run out of space you can extend it, too, by adding a new Physical Volume to it.
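As a minimal sketch of that workflow (device names /dev/sdb1 and /dev/sdc1 and the demo_* names are purely illustrative, not taken from my lab system), the commands look roughly like this:

# create Physical Volumes on existing partitions (device names are examples)
sudo pvcreate /dev/sdb1 /dev/sdc1

# aggregate them into a Volume Group
sudo vgcreate demo_vg /dev/sdb1 /dev/sdc1

# carve out a Logical Volume and put a file system on top
sudo lvcreate -n demo_lv -L 500G demo_vg
sudo mkfs.xfs /dev/demo_vg/demo_lv

# later: grow the LV as long as there is free space in the VG
sudo lvextend -L +100G /dev/demo_vg/demo_lv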
That’s really just a high-level overview; there is plenty of LVM-specific documentation out there that I recommend you have a look at if you want to learn more.
LVM is perfect for installing software
LVM is nothing short of perfect for installing software: running out of space is almost impossible when using LVM. I always use Logical Volumes (LVs) to install the Oracle binaries. Should I run out of space I can simply grow the LV and the file system – job done. Nowadays it is possible to resize the file system online, provided the necessary precautions (BACKUP!) are in place.
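As an example, assuming an XFS file system on a Logical Volume named orabin_lv in orabin_vg (names merely illustrative), growing the binary mount point might look like this:

# extend the Logical Volume by 20 GiB and resize the file system in one go;
# -r (--resizefs) grows the mounted file system after extending the LV
sudo lvextend -r -L +20G /dev/orabin_vg/orabin_lv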
LVM for Oracle Data Files
The situation changes when it comes to storing Oracle database data files or other I/O intensive workloads. The basic idea for using LVM with Oracle data files is the same: being able to dynamically grow a mount point if the existing storage allocation is insufficient. This is mostly a convenience, though: there is nothing wrong with creating a new block device and using it as the source for data files in tablespaces.
If you create a Volume Group (VG) on top of two block devices (for the sake of argument) and expect to benefit from a SAME (stripe and mirror everything) approach, you might be disappointed to hear that you won’t, at least not if you use the defaults.
Consider the following example of my lab VM. I installed Oracle 19.12.0 on an OCI VM.Standard.E4.Flex instance running Oracle Linux 8.4/UEK6 with all patches up to August 03, 2021. Please note this is just an example; what I’m demonstrating next is true for systems running on-premises as well as in the cloud.
[opc@oracle-19c-fs ~]$ sudo vgs -o +vg_attr | egrep 'ora|VG'
  VG         #PV #LV #SN Attr   VSize      VFree Attr
  orabin_vg    1   1   0 wz--n-   <100.00g     0  wz--n-
  oradata_vg   2   1   0 wz--n-      1.95t     0  wz--n-
  orareco_vg   1   1   0 wz--n- <1000.00g      0  wz--n-
As you can see I have 3 VGs:
- orabin_vg holds the (single instance) binaries
- oradata_vg will be used for data files
- orareco_vg … well you got it.
You may have noticed that oradata_vg is the only VG with 2 PVs, which is confirmed in the following output:
[opc@oracle-19c-fs ~]$ sudo pvs -o +pv_attr | egrep 'PV|oradata'
  PV         VG         Fmt  Attr PSize      PFree Attr
  /dev/sdb1  oradata_vg lvm2 a--  <1000.00g     0  a--
  /dev/sdd1  oradata_vg lvm2 a--  <1000.00g     0  a--
[opc@oracle-19c-fs ~]$
As soon as I start running Swingbench against a database with a vastly undersized SGA, forcing physical I/O, iostat draws a rather unpleasant picture:
[opc@oracle-19c-fs ~]$ iostat -xmz 5
Linux 5.4.17-2102.203.6.el8uek.x86_64 (oracle-19c-fs)   04/08/21   _x86_64_   (16 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          14.47    0.01    4.26    6.11    0.05   75.10

Device        r/s      w/s    rMB/s    wMB/s   rrqm/s   wrqm/s  ...  %util
sda          1.63     1.12     0.05     0.03     0.03     0.42  ...   0.15
dm-0         1.59     0.95     0.04     0.02     0.00     0.00  ...   0.13
dm-1         0.02     0.58     0.00     0.01     0.00     0.00  ...   0.02
sdb         85.70  2903.19     0.94    25.58     0.00     0.22  ...  71.99
sdc          0.02     0.00     0.00     0.00     0.00     0.00  ...   0.00
sdd          5.67   287.03     0.02     2.27     0.00     0.84  ...  33.25
sde          0.50     0.12     0.02     0.00     0.00     0.03  ...   0.04
dm-2         0.49     0.15     0.02     0.00     0.00     0.00  ...   0.04
dm-3         0.01     0.00     0.00     0.00     0.00     0.00  ...   0.00
dm-4        91.35  3191.47     0.97    27.85     0.00     0.00  ...  72.02

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          20.25    0.00    6.01    8.44    0.10   65.20

Device        r/s      w/s    rMB/s    wMB/s   rrqm/s   wrqm/s  ...  %util
sda          0.00     0.20     0.00     0.00     0.00     0.00  ...   0.04
dm-0         0.00     0.20     0.00     0.00     0.00     0.00  ...   0.04
sdb          9.40  4108.80     0.08    37.29     0.00     0.20  ...  99.70
sdd          0.00     0.40     0.00     0.01     0.00     0.00  ...   0.08
dm-4         9.40  4109.80     0.08    37.29     0.00     0.00  ...  99.70

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          20.85    0.00    6.00    5.75    0.06   67.34

Device        r/s      w/s    rMB/s    wMB/s   rrqm/s   wrqm/s  ...  %util
sda          0.00     1.40     0.00     0.01     0.00     0.20  ...   0.06
dm-0         0.00     0.20     0.00     0.00     0.00     0.00  ...   0.02
dm-1         0.00     1.40     0.00     0.01     0.00     0.00  ...   0.04
sdb          9.80  4098.40     0.10    36.41     0.00     1.20  ...  99.86
sdd          0.00     1.20     0.00     0.02     0.00     0.00  ...   0.20
dm-4         9.80  4100.60     0.10    36.43     0.00     0.00  ...  99.84
^C
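As an aside, if you only care about a handful of devices, iostat also accepts a list of device names on the command line; something along these lines should limit the output to the two PVs and the LV’s dm device (device names taken from the output above):

# extended statistics, in MB, skipping idle devices, every 5 seconds
iostat -xmz sdb sdd dm-4 5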
This is of course nowhere near peak performance for the block devices; I’m only running a low-key workload pumping a few I/O requests here :) You may notice that the only block device really in use is /dev/sdb, the first block device in the Volume Group. /dev/dm-4 is the device mapper representation of the Logical Volume:
[opc@oracle-19c-fs ~]$ ls -l /dev/oradata_vg/
total 0
lrwxrwxrwx. 1 root root 7 Aug  4 07:47 oradata_lv -> ../dm-4
[opc@oracle-19c-fs ~]$
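Another way to see which physical devices sit underneath that dm device is lsblk; a sketch of such a check (no output shown, as it obviously depends on your system) could be:

# the oradata_lv dm device should show up as a child of both
# /dev/sdb1 and /dev/sdd1 in the device tree
lsblk /dev/sdb /dev/sdd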
Long story short, my database performs all of its I/O requests on a single block device. From a performance point of view this is usually undesirable. I should add that I did not multiplex online redo logs or the control file into the Fast Recovery Area, creating the perfect storm. Data placement on the Logical Volume also depends on other factors, but it is entirely possible to see a negative performance impact if you use LVM for data files and stick to the defaults. You certainly won’t see striping across multiple block devices, which is often needed for better performance. For reference, the oradata_vg was created using the following Ansible tasks:
- name: hostconfig | partition block devices (data files)
  parted:
    device: "{{ item }}"
    number: 1
    state: present
    align: optimal
    label: gpt
  loop: "{{ database_storage_block_devs }}"

- name: hostconfig | create an "oradata" volume group
  lvg:
    vg: oradata_vg
    pvs: "{{ database_storage_block_devs | join('1,') }}1"
    state: present

- name: hostconfig | create logical volume on top of oradata_vg
  lvol:
    vg: oradata_vg
    lv: oradata_lv
    size: 100%VG
    state: present
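If you want to verify for yourself that a Logical Volume created this way is a plain linear concatenation rather than a striped layout, querying the segment type and stripe count via the lvs reporting fields is one way to do it; a linear LV should report segtype "linear" with a single stripe per segment:

# show layout details for all LVs in oradata_vg
sudo lvs -o lv_name,segtype,stripes,devices oradata_vg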
What’s next?
In the next couple of articles I’ll describe possible workarounds to the issue of LVM usage and how to get more performance out of the system. Stay tuned!