Using colplot to visualise performance data

Back in 2011 I wrote a blog post about colplot, but at the time I focused on running the plot engine backed by a web server. However, some people might not want to take this approach, and from a security point of view it might not be the best idea in the world anyway. A port that isn’t opened can’t be scanned for vulnerabilities…

So what is colplot anyway? And why this follow-up to a 7-year-old post?

Some background

Back in the day I learned about collectl: a small, light-weight performance monitoring tool that’s easily portable to many platforms. Collectl is very versatile and has capabilities to record a lot of detail, especially on Linux. Collectl comes with a companion, colplot, which I introduced in said post in 2011.

A typical use case, at least for me, is to start collectl recording, do some work, stop recording, and plot the results. This can be handy if you don’t get permission to install TFA (think “OSWatcher”) at short notice, and still need performance data!

In this blog post I’d like to share how you could go about doing this. I am using Debian as my workstation O/S and Oracle Linux to capture data.

The workload

I wanted a really simple workload, so for lack of ingenuity I came up with the idea of running fio to drive storage and a tiny bit of CPU.

I am running a very basic random read I/O test and recording it with collectl. It is important to understand that it’s your responsibility to decide what to capture, based on the plots you want to generate. I’ll explain this in a bit, but for now it’ll have to suffice that I need detail data for CPU and disk.

Before kicking off the load generator I start collectl to record data in (P)lot mode, saving data to my data subdirectory:

[oracle@server1 ~]$ collectl -sCD -f /home/oracle/data -P

(Another, probably cleverer option is to record the data without the -P switch, then replay the recording and write the output to a file in plot format. That way you have a lot more control over the process and, as an added advantage, you keep the raw data.)
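
To make the record-first, replay-later idea a bit more concrete, here is a minimal sketch of how I would expect it to look. The function names and the arguments you pass in are my own placeholders, not something collectl ships with. The -p switch is collectl's playback option, and I am assuming playback picks up the subsystems that were recorded, so please check the man page for your version before relying on it.

```shell
# Sketch only: record in raw format first, convert to plot format later.
# record_raw and replay_to_plot are hypothetical helper names.

record_raw() {
    # $1: directory for the recording
    # No -P here, so collectl writes its raw format instead of plot files.
    collectl -sCD -f "$1"
}

replay_to_plot() {
    # $1: raw file to play back
    # $2: directory that receives the plot-format output
    collectl -p "$1" -P -f "$2"
}
```

You would call record_raw with a directory of your choice, stop it once the workload has finished, and then pass the raw file collectl created to replay_to_plot together with the directory colplot should read from.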

Once the workload has completed, I can transfer the data captured to my workstation for analysis with colplot. For each detail switch you select (C and D in the example) you’ll get a (compressed) file:

$ ls -l data
total 84
-rw-r--r-- 1 oracle oinstall 21790 Oct 30 08:26 server1-20181030.cpu
-rw-r--r-- 1 oracle oinstall 59357 Oct 30 08:26 server1-20181030.dsk

Plotting details

I imagine most people use the web-frontend to colplot, but that doesn’t mean there aren’t other ways of generating visualisations of your performance data. But first of all, what plots can you create? This depends on the version of colplot in use; for me it’s this:

$ colplot --version
colplot: V5.1.0, gnuplot: V:5.0[png,x11]

Copyright 2004-2014 Hewlett-Packard Development Company, L.P.
colplot may be copied only under the terms of either the Artistic License
or the GNU General Public License, which may be found in the source kit

Getting back to the list of plots supported in my version, it’s super impressive!

$ colplot -showplots
cpu         cpu     s  Cpu Utilization. Other
cpu20       cpu     s  Cpu Utilization, 0-20%
cpu80       cpu     s  Cpu Utilization, 80-100%
cpudold     cpu     s  Old format ony reports individual fields
cpumid      cpu     s  Cpu Utilization, 40-60%
cpumore     cpu     s  Additional types of use
cpuold      cpu     s  Old format ony reports individual fields
loadavg     cpu     s  Load Averages for 1,5,15 min
disk        disk    s  Disk Summary
diskio      disk    s  Disk I/O
disksize    disk    s  Bandwidth and transfer sizes
elan        elan    s  Quadrics ELAN Bandwidth
elanio      elan    s  Quadrics ELAN Packet Rates
ib          ib      s  Infiniband Bandwidth
ibio        ib      s  Infiniband Packet Rates
ibsize      ib      s  Bandwidth and transfer sizes
nvidia      import  s  nvidia GPU stats
inode       inode   s  Inode Summary
luscltio    lustre  s  Lustre Client Summary, I/O only
cltmeta     lustre  s  Lustre Client Meta Summary
cltreada    lustre  s  Lustre Client Read-Ahead Summary
lusmds      lustre  s  lustre Lustre MDS Summary
lusoss      lustre  s  Lustre OSS Data Rates
ossio       lustre  s  Lustre OSS I/Os
faults      mem     s  Page Faults
mem         mem     s  Memory
memanon     mem     s  Anon Memory
membuf      mem     s  Buffered Memory
memcache    mem     s  Cached Memory
memdirty    mem     s  Dirty Memory
memmap      mem     s  Mapped Memory
memslab     mem     s  Slab Memory
paging      mem     s  Paging
swap        mem     s  Swap Utilization
misc1       misc    s  Miscellaneous ctrs from '--import misc'
misc2       misc    s  CPU Frequency from '--import misc'
net         net     s  Network Summary
netpkt      net     s  Network packets
netsize     net     s  Bandwidth and transfer sizes
nfsV2c      nfs     s  older NFS V2 Client Summary
nfsV2s      nfs     s  older NFS V2 Server Summary
nfsV3c      nfs     s  older NFS V3 Client Summary
nfsV3s      nfs     s  older NFS V3 Server Summary
nfsV4c      nfs     s  older NFS V4 Client Summary
nfsV4s      nfs     s  older NFS V4 Server Summary
nfsmeta     nfs     s   NFS Metadata and Commits
nfsrpc      nfs     s  NFS RPC Summary
nfssum      nfs     s   NFS Aggregate Summary Data
ctxint      proc    s  Context and Interruputs
proc        proc    s  Processes
sock        sock    s  Socket Usage
accaudt     swift   s  Account Auditor
accreap     swift   s  Account Reaper
accrepl     swift   s  Account Replicator
accsrvr     swift   s  Account Server
conaudt     swift   s  Container Auditor
conrepl     swift   s  Container Replicator
consrvr     swift   s  Container Server
consync     swift   s  Container Sync
conupdt     swift   s  Container Updater
objaudt     swift   s  Object Auditor
objexpr     swift   s  Object Expirer
objrepl     swift   s  Object Replicator
objsrv2     swift   s  Object Server2
objsrvr     swift   s  Object Server
objupdt     swift   s  Object Updater
prxyacc     swift   s  Proxy Account
prxycon     swift   s  Proxy Container
prxyobj     swift   s  Proxy Object
tcp         tcp     s  TCP errors count summary
tcpold      tcp     s  old TCP acks & packet failures
cpudet      cpu     d  Cpu Details, Other
cpuint      cpu     d  Interrupts by Cpu
cpumored    cpu     d  Additional types of use
diskdet     disk    d  Disk Details
diskdsize   disk    d  Disk I/O Size Details
diskque     disk    d  Disk request queue depth
disktimes   disk    d  Disk wait/service times
diskutil    disk    d  Disk utilization
fans        env     d  Fan Speeds
power       env     d  Power Sensor
temps       env     d  Temperature Sensors
ibdet       inter   d  IB interconnect detail data
ibdsize     inter   d  IB packet size detail
elandio     inter   d  Elan interconnect IOs (get/put/comp)
elandmb     inter   d  Elan interconnect MBs (get/put/comp)
cltdet      lustre  d  Lustre Client FS I/O Detail
cltdetL     lustre  d  Lustre Client OST I/O Detail
ossdet      lustre  d  Lustre OSS Detail
netdet      net     d  Network Details
netdsize    net     d  Network Packet Size Details
nfsV2cd     nfs     d  NFS Version 2 Client Detail
nfsV2sd     nfs     d  NFS Version 2 Server Detail
nfsV3cd     nfs     d  NFS Version 3 Client Detail
nfsV3sd     nfs     d  NFS Version 3 Server Detail
nfsV4cd     nfs     d  NFS Version 4 Client Detail
nfsV4sd     nfs     d  NFS Version 4 Server Detail
cltbrw      macro      Lustre Client BRW stats
cltbrwD     macro      Lustre Client BRW detail stats
detall      macro      All detail plots except nfs and lustre
detlus      macro      Lustre detail plots (there can be a LOT of these!)
detnfs      macro      NFS detail plots, colplot only
inter       macro      Interconnect summary plots
interdet    macro      Interconnect detail plots
lusblkDR    macro      Lustre Block I/O read detail stats (there can be LOTS of these!)
lusblkDW    macro      Lustre Block I/O write detail stats (there can be LOTS of these!)
lusblkR     macro      Lustre Block I/O read summary stats
lusblkW     macro      Lustre Block I/O write summary stats
misc        macro      All misc counters from '--import misc'
ossbrw      macro      Lustre OSS BRW stats
ossbrwD     macro      Lustre OSS BRW detail stats
sumall      macro      All summary plots, except nfs client/server stats
sumlus      macro      Lustre summary plots for clients, mds and oss
summin      macro      Minimal set of summary plots (cpu, disk, mem and disk
sumnfs      macro      NFS summary plots, colplot only

I don’t think I promised too much!
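
With a list this long it can help to filter it. The following is a small sketch, not part of colplot: detail_plots is a hypothetical helper that reads the -showplots output on stdin and relies on the column layout shown above (plot name, type, then s for summary or d for detail).

```shell
# Hypothetical helper: print the names of all detail plots of one type.
# Expects `colplot -showplots` output on stdin.
detail_plots() {
    # $1: value of the second column, for example cpu or disk
    awk -v type="$1" '$2 == type && $3 == "d" { print $1 }'
}
```

Used as `colplot -showplots | detail_plots disk`, this should print diskdet, diskdsize, diskque, disktimes and diskutil for the listing above, one name per line.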

One thing to keep in mind though: you cannot plot charts for which you don’t have data. For example, if I wanted to plot CPU summary data, colplot would tell me that it can’t:

$ colplot -dir /home/vagrant/data -plots cpu
No plottable files match your selection criteria.
Are your dir, selection dates and/or file protections right?

I am certain I have set the permissions correctly, but I also know that I didn’t capture CPU summary information (this would be lower-case “c”, as opposed to upper case “C” for the detailed recording). I suggest you run a few tests until you are comfortable with collectl’s command line switches to avoid later disappointment when trying to plot :)
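
A quick check before plotting can save you that disappointment. This is only a sketch: has_plot_data is a hypothetical helper, and it assumes the plot files end in the extensions seen in my listing (.cpu and .dsk), optionally followed by .gz if they were compressed.

```shell
# Hypothetical helper: succeed if the directory holds at least one plot file
# with the given extension, compressed or not.
has_plot_data() {
    # $1: directory with the plot files
    # $2: extension without the dot, for example cpu or dsk
    for f in "$1"/*."$2" "$1"/*."$2".gz; do
        # An unmatched glob stays literal, so test that the file really exists.
        [ -e "$f" ] && return 0
    done
    return 1
}
```

For example, `has_plot_data /home/vagrant/data cpu || echo "no CPU detail data"` would stay silent for my recording, while the same check for an extension I never captured would print the warning.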

With the collected performance data transferred to ~/data, I can now plot some CPU and disk details:

$ colplot -dir /home/vagrant/data -plots cpudet,diskdet \
> -filedir /home/vagrant/data --time 08:15:00-08:25:00
Your Plot(s) have been written to /home/vagrant/data/5376-colplot.pdf

The resulting file created by this particular command is a PDF. I like this format simply because it’s easy to store for later reference. I also wanted to limit the plots to a specific time window, otherwise my little 5-minute test would hardly have been noticeable.
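
If you find yourself plotting several time windows from the same recording, a tiny wrapper saves some typing. This is just a sketch built from the switches used in the command above; plot_window is a hypothetical name, and the plot list is hard-coded to the two detail plots from my example.

```shell
# Hypothetical helper: write CPU and disk detail plots for one time window
# to a PDF in the same directory as the data.
plot_window() {
    # $1: directory holding the plot files, also used for the output file
    # $2: time window in HH:MM:SS-HH:MM:SS form
    colplot -dir "$1" -plots cpudet,diskdet -filedir "$1" --time "$2"
}
```

Calling `plot_window /home/vagrant/data 08:15:00-08:25:00` runs the same colplot command as before.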

This is what it looks like. Please don’t try to read anything into the charts: they are included for illustration purposes only and were taken from a lab VM without any resemblance to a real-world system.

[Image: colplot example]

It’s also possible to plot interactively by omitting the -filedir switch in colplot. Output is generated in your X-session and you can export it in different formats.

There is of course more to colplot than I could show in a single post, but I hope I have managed to give you a first impression.

Happy benchmarking!