Back in 2011 I wrote a blog post about colplot, but at the time I focused on running the plot engine behind a web server. Some people might not want to take this approach, and from a security point of view it might not be the best idea anyway: a port that isn’t open can’t be scanned for vulnerabilities…
So what is colplot anyway? And why this follow-up to a seven-year-old post?
Some background
Back in the day I learned about collectl: a small, lightweight performance monitoring tool that is easily portable to many platforms. Collectl is very versatile and can record a lot of detail, especially on Linux. It comes with a companion, colplot, which I introduced in said post in 2011.
A typical use case – at least for me – is to start a collectl recording, do some work, stop the recording, and plot the results. This can be handy if you can’t get permission to install TFA (think “OSWatcher”) at short notice and still need performance data!
In this blog post I’d like to share how you could go about doing this. I am using Debian as my workstation O/S and Oracle Linux to capture data.
The workload
I wanted a really simple workload, so for lack of ingenuity I came up with the idea of running fio to drive storage and a tiny bit of CPU.
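An invocation along the following lines can drive such a workload. This is only a sketch: the job name and all parameter values (block size, file size, runtime, queue depth) are illustrative, not the exact command I ran.

```shell
# Hypothetical fio run: small-block random reads against a test file.
# Every parameter value here is illustrative – adjust to your environment.
fio --name=randread \
    --rw=randread \
    --ioengine=libaio \
    --direct=1 \
    --bs=8k \
    --size=1g \
    --runtime=300 --time_based \
    --iodepth=8
```

The --direct=1 flag bypasses the page cache so the I/O actually hits the storage you want to measure.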
I am running a very basic random read I/O test and recording it with collectl. It is important to understand that it is your responsibility to capture the data needed for the plots you want to generate later. I’ll explain this in a bit; for now it suffices to say that I need detail data for CPU and disk.
Before kicking off the load generator I start collectl to record data in (P)lot mode, saving data to my data subdirectory:
[oracle@server1 ~]$ collectl -sCD -f /home/oracle/data -P
(Another, probably cleverer option is to record the data without the -P switch, then replay the recording and write the output to a file in plot format. That way you have a lot more control over the process and, as an added advantage, keep the raw data.)
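If you go down that route, the two steps might look roughly like this. The raw file name is an example – collectl names its output after the host and date – so treat this as a sketch rather than a recipe:

```shell
# Record in raw mode (no -P); collectl creates a raw file such as
# server1-20181030.raw.gz in the target directory (example name)
collectl -sCD -f /home/oracle/data

# Later: play the raw recording back (-p) and write plot-format files
collectl -p /home/oracle/data/server1-20181030.raw.gz -P -f /home/oracle/data/plot
```

The playback step can be repeated with different switches, which is exactly the extra control mentioned above.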
Once the workload has completed, I can transfer the data captured to my workstation for analysis with colplot. For each detail switch you select (C and D in the example) you’ll get a (compressed) file:
$ ls -l data
total 84
-rw-r--r-- 1 oracle oinstall 21790 Oct 30 08:26 server1-20181030.cpu
-rw-r--r-- 1 oracle oinstall 59357 Oct 30 08:26 server1-20181030.dsk
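These plot files are plain, whitespace-separated text with a commented header line, which makes them easy to inspect with standard tools before handing them to colplot. A quick sketch with made-up sample values – the real column set depends on the switches you recorded with:

```shell
# Synthetic sample in collectl's plot format: a '#' header line followed
# by whitespace-separated values (real files have many more columns)
cat > /tmp/sample.cpu <<'EOF'
#Date Time User Sys Wait Idle
20181030 08:20:01 12 3 5 80
20181030 08:20:02 48 7 9 36
20181030 08:20:03 30 5 6 59
EOF

# Average the User column with awk, skipping the header line
awk '!/^#/ { sum += $3; n++ } END { printf "avg user: %.0f%%\n", sum/n }' /tmp/sample.cpu
```

Being plain text, the files also compress well, which helps when shipping them from a server to your workstation.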
Plotting details
I imagine most people use the web frontend to colplot, but that doesn’t mean there aren’t other ways of visualising your performance data. But first of all, what plots can you create? This depends on the version of colplot in use; for me it’s this:
$ colplot --version
colplot: V5.1.0, gnuplot: V:5.0[png,x11]
Copyright 2004-2014 Hewlett-Packard Development Company, L.P.
colplot may be copied only under the terms of either the Artistic License
or the GNU General Public License, which may be found in the source kit
Getting back to the list of plots supported in my version, it’s super impressive!
$ colplot -showplots
cpu       cpu     s      Cpu Utilization. Other
cpu20     cpu     s      Cpu Utilization, 0-20%
cpu80     cpu     s      Cpu Utilization, 80-100%
cpudold   cpu     s      Old format ony reports individual fields
cpumid    cpu     s      Cpu Utilization, 40-60%
cpumore   cpu     s      Additional types of use
cpuold    cpu     s      Old format ony reports individual fields
loadavg   cpu     s      Load Averages for 1,5,15 min
disk      disk    s      Disk Summary
diskio    disk    s      Disk I/O
disksize  disk    s      Bandwidth and transfer sizes
elan      elan    s      Quadrics ELAN Bandwidth
elanio    elan    s      Quadrics ELAN Packet Rates
ib        ib      s      Infiniband Bandwidth
ibio      ib      s      Infiniband Packet Rates
ibsize    ib      s      Bandwidth and transfer sizes
nvidia    import  s      nvidia GPU stats
inode     inode   s      Inode Summary
luscltio  lustre  s      Lustre Client Summary, I/O only
cltmeta   lustre  s      Lustre Client Meta Summary
cltreada  lustre  s      Lustre Client Read-Ahead Summary
lusmds    lustre  s      lustre Lustre MDS Summary
lusoss    lustre  s      Lustre OSS Data Rates
ossio     lustre  s      Lustre OSS I/Os
faults    mem     s      Page Faults
mem       mem     s      Memory
memanon   mem     s      Anon Memory
membuf    mem     s      Buffered Memory
memcache  mem     s      Cached Memory
memdirty  mem     s      Dirty Memory
memmap    mem     s      Mapped Memory
memslab   mem     s      Slab Memory
paging    mem     s      Paging
swap      mem     s      Swap Utilization
misc1     misc    s      Miscellaneous ctrs from '--import misc'
misc2     misc    s      CPU Frequency from '--import misc'
net       net     s      Network Summary
netpkt    net     s      Network packets
netsize   net     s      Bandwidth and transfer sizes
nfsV2c    nfs     s      older NFS V2 Client Summary
nfsV2s    nfs     s      older NFS V2 Server Summary
nfsV3c    nfs     s      older NFS V3 Client Summary
nfsV3s    nfs     s      older NFS V3 Server Summary
nfsV4c    nfs     s      older NFS V4 Client Summary
nfsV4s    nfs     s      older NFS V4 Server Summary
nfsmeta   nfs     s      NFS Metadata and Commits
nfsrpc    nfs     s      NFS RPC Summary
nfssum    nfs     s      NFS Aggregate Summary Data
ctxint    proc    s      Context and Interruputs
proc      proc    s      Processes
sock      sock    s      Socket Usage
accaudt   swift   s      Account Auditor
accreap   swift   s      Account Reaper
accrepl   swift   s      Account Replicator
accsrvr   swift   s      Account Server
conaudt   swift   s      Container Auditor
conrepl   swift   s      Container Replicator
consrvr   swift   s      Container Server
consync   swift   s      Container Sync
conupdt   swift   s      Container Updater
objaudt   swift   s      Object Auditor
objexpr   swift   s      Object Expirer
objrepl   swift   s      Object Replicator
objsrv2   swift   s      Object Server2
objsrvr   swift   s      Object Server
objupdt   swift   s      Object Updater
prxyacc   swift   s      Proxy Account
prxycon   swift   s      Proxy Container
prxyobj   swift   s      Proxy Object
tcp       tcp     s      TCP errors count summary
tcpold    tcp     s      old TCP acks & packet failures
cpudet    cpu     d      Cpu Details, Other
cpuint    cpu     d      Interrupts by Cpu
cpumored  cpu     d      Additional types of use
diskdet   disk    d      Disk Details
diskdsize disk    d      Disk I/O Size Details
diskque   disk    d      Disk request queue depth
disktimes disk    d      Disk wait/service times
diskutil  disk    d      Disk utilization
fans      env     d      Fan Speeds
power     env     d      Power Sensor
temps     env     d      Temperature Sensors
ibdet     inter   d      IB interconnect detail data
ibdsize   inter   d      IB packet size detail
elandio   inter   d      Elan interconnect IOs (get/put/comp)
elandmb   inter   d      Elan interconnect MBs (get/put/comp)
cltdet    lustre  d      Lustre Client FS I/O Detail
cltdetL   lustre  d      Lustre Client OST I/O Detail
ossdet    lustre  d      Lustre OSS Detail
netdet    net     d      Network Details
netdsize  net     d      Network Packet Size Details
nfsV2cd   nfs     d      NFS Version 2 Client Detail
nfsV2sd   nfs     d      NFS Version 2 Server Detail
nfsV3cd   nfs     d      NFS Version 3 Client Detail
nfsV3sd   nfs     d      NFS Version 3 Server Detail
nfsV4cd   nfs     d      NFS Version 4 Client Detail
nfsV4sd   nfs     d      NFS Version 4 Server Detail
cltbrw    macro          Lustre Client BRW stats
cltbrwD   macro          Lustre Client BRW detail stats
detall    macro          All detail plots except nfs and lustre
detlus    macro          Lustre detail plots (there can be a LOT of these!)
detnfs    macro          NFS detail plots, colplot only
inter     macro          Interconnect summary plots
interdet  macro          Interconnect detail plots
lusblkDR  macro          Lustre Block I/O read detail stats (there can be LOTS of these!)
lusblkDW  macro          Lustre Block I/O write detail stats (there can be LOTS of these!)
lusblkR   macro          Lustre Block I/O read summary stats
lusblkW   macro          Lustre Block I/O write summary stats
misc      macro          All misc counters from '--import misc'
ossbrw    macro          Lustre OSS BRW stats
ossbrwD   macro          Lustre OSS BRW detail stats
sumall    macro          All summary plots, except nfs client/server stats
sumlus    macro          Lustre summary plots for clients, mds and oss
summin    macro          Minimal set of summary plots (cpu, disk, mem and disk
sumnfs    macro          NFS summary plots, colplot only
I don’t think I promised too much!
One thing to keep in mind though: you cannot plot charts for which you don’t have data. For example, if I wanted to plot CPU summary data, colplot would tell me that it can’t:
$ colplot -dir /home/vagrant/data -plots cpu
No plottable files match your selection criteria.
Are your dir, selection dates and/or file protections right?
I am certain I have set the permissions correctly, but I also know that I didn’t capture CPU summary information (this would be lower-case “c”, as opposed to upper case “C” for the detailed recording). I suggest you run a few tests until you are comfortable with collectl’s command line switches to avoid later disappointment when trying to plot :)
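One way to hedge against this is to record both summary and detail data for the subsystems you care about in a single run: lower-case letters select the summaries, upper-case letters the details. A sketch of what that could look like for my CPU and disk example:

```shell
# Record CPU and disk data, both summary (c, d) and detail (C, D),
# writing plot-format files to the data directory
collectl -scdCD -P -f /home/oracle/data
```

The recording grows a little, but you keep the option of plotting either summary or detail charts afterwards.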
With the performance data transferred to ~/data, I can now plot some CPU and disk details:
$ colplot -dir /home/vagrant/data -plots cpudet,diskdet \
> -filedir /home/vagrant/data --time 08:15:00-08:25:00
Your Plot(s) have been written to /home/vagrant/data/5376-colplot.pdf
This particular command creates a PDF, a format I like simply because it’s easy to store for later reference. I also limited the plots to a specific time window; otherwise my little 5-minute test would hardly have been noticeable.
This is what it looks like. Please don’t try to read anything into the charts – they are included for illustration purposes only and were taken from a lab VM bearing no resemblance to a real-world system.
It’s also possible to plot interactively by omitting the -filedir switch. Output is then rendered in your X session, and you can export it in different formats.
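For example, dropping -filedir from the earlier command should open the charts in an X11 window instead of writing a PDF – assuming your gnuplot was built with x11 support, as the version string shown earlier suggests:

```shell
# No -filedir: gnuplot renders the charts to the X display
colplot -dir /home/vagrant/data -plots cpudet,diskdet --time 08:15:00-08:25:00
```

This is handy for a quick look before committing a time range to a PDF.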
There is of course more to colplot than I could show in a single post, but I hope I have managed to give you a first impression.
Happy benchmarking!