Monthly Archives: June 2013

Troubleshooting high load average on Linux

This morning I was quite surprised to see one of my lab servers suffering from a high load average. To be fair, I haven't looked at it for a while; it's just one of those demo systems ticking along. However, this turned out to be an interesting case!

High load averages are usually first felt as a sluggish response from the terminal. I got a very sluggish response but attributed it to the SSH connection. Nope, it turned out that the other VMs, which I know are on the same host, were a lot more responsive.

A few options exist in Linux to see what is going on. The uptime command is a pretty good first indicator.

# uptime
 09:14:05 up 28 days, 11:29,  1 user,  load average: 19.13, 19.15, 19.13

OK so this has been going on for a little while (hey, remember it's my lab server!). But what's contributing? My next stop was top. Here's the abridged output:

# top -b -n 1 | head
top - 03:11:40 up 28 days, 11:32,  1 user,  load average: 19.08, 19.15, 19.13
Tasks: 217 total,   1 running, 215 sleeping,   0 stopped,   1 zombie
Cpu0  :  5.5%us,  2.9%sy,  0.0%ni, 91.0%id,  0.1%wa,  0.0%hi,  0.5%si,  0.0%st
Cpu1  :  1.5%us,  1.2%sy,  0.0%ni, 97.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  1.1%us,  0.9%sy,  0.0%ni, 98.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.8%us,  0.7%sy,  0.0%ni, 98.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :  1.5%us,  1.1%sy,  0.0%ni, 97.3%id,  0.2%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :  1.1%us,  0.9%sy,  0.0%ni, 98.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  0.8%us,  0.7%sy,  0.0%ni, 98.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  0.6%us,  0.6%sy,  0.0%ni, 98.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu8  :  2.9%us,  1.8%sy,  0.0%ni, 95.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
...
Swap: 4194304k total,        0k used, 4194304k free, 4194302k cached

The interesting bit here is that the CPUs are mostly idle. No single process consumed more than a few percent of CPU, and most consumed a lot less. Only one zombie process is listed, so that's not an issue either. In 2.4 kernels zombie processes could cause problems because they a) sometimes couldn't be killed ("super zombies") and b) could easily hog a CPU. No swap is in use, which, combined with the absence of kswapd activity, allowed me to rule out a memory shortage.

I have seen high load averages in the past as the result of database activity (lots of sessions with lots of nested loop joins hammering the CPUs with access requests to the same set of hot blocks) or swapping. Both could be ruled out here: swapping would most likely show up as high system time (%sy), and poorly performing Oracle queries would show up burning user cycles (%us) and would list high in the output of top.
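
A quick cross-check for the swapping theory, and one I tend to run anyway, is vmstat: the si and so columns show pages swapped in and out per second, and a steady 0 0 across a few samples rules swapping out. This is simply the command I would reach for, not output captured at the time:

# vmstat 5 3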

So it had to be something else. If the CPUs are idle, then maybe what's contributing to the load is a "hung" process with a high priority, or something generating a lot of I/O.

Two approaches came to my mind:

  • using iotop
  • using top in batch mode and filtering processes in suspicious states
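
For the first option, iotop in batch mode gives a one-shot view of which processes are actually doing I/O. The flags below are my usual choice rather than anything taken from the original session: -b for batch mode, -n 1 for a single iteration and -o to show only processes currently doing I/O.

# iotop -b -n 1 -o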

The output of iotop did not reveal any further clues, so I carried on with a suggestion I found using my favorite search engine: filter the output of top for processes in state "D". According to the man page of top, the following states are defined:

       w: S  --  Process Status
          The status of the task which can be one of:
             ’D’ = uninterruptible sleep
             ’R’ = running
             ’S’ = sleeping
             ’T’ = traced or stopped
             ’Z’ = zombie

So a process in state D is in uninterruptible sleep, typically waiting on I/O. That does not sound too good, but not for the reason I first assumed: a task in uninterruptible sleep does not hog the CPU at all. On Linux, however, tasks in state D count towards the load average just like runnable tasks, which is exactly how you end up with a load of 19 on a machine whose CPUs are mostly idle.
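
Had I not already had a suspect, a simple way to list processes in state D is to use ps rather than top and filter on the state column. This is a sketch of the approach rather than a command from my session:

# ps -eo state,pid,user,comm | awk '$1 == "D"'

A quick glance at the full top output had already pointed at updatedb though, so I filtered on the process name instead. Running the following on the server revealed the culprits very quickly: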

> top -n 1 -b | egrep "PID|updatedb"
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1535 root      39  19  3868  796  524 D  0.0  0.0   0:00.03 /usr/bin/updatedb -f ...
 7138 root      39  19  3868  788  524 D  0.0  0.0   0:00.02 /usr/bin/updatedb -f ...
 8246 root      39  19  3868  768  508 D  0.0  0.0   0:00.57 /usr/bin/updatedb -f ...
 9279 root      39  19  3868  788  524 D  0.0  0.0   0:00.03 /usr/bin/updatedb -f ...
12125 root      39  19  3868  772  508 D  0.0  0.0   0:00.02 /usr/bin/updatedb -f ...
12860 root      39  19  3868  772  508 D  0.0  0.0   0:00.03 /usr/bin/updatedb -f ...
13480 root      39  19  3868  788  524 D  0.0  0.0   0:00.03 /usr/bin/updatedb -f ...
14632 root      39  19  3868  796  524 D  0.0  0.0   0:00.02 /usr/bin/updatedb -f ...
16745 root      39  19  3868  788  524 D  0.0  0.0   0:00.03 /usr/bin/updatedb -f ...
18962 root      20   0 61192  716  624 S  0.0  0.0   0:00.00 egrep PID|updatedb
19052 root      39  19  3868  788  524 D  0.0  0.0   0:00.00 /usr/bin/updatedb -f ...
19988 root      39  19  3868  792  524 D  0.0  0.0   0:00.01 /usr/bin/updatedb -f ...
21520 root      39  19  3868  796  524 D  0.0  0.0   0:00.03 /usr/bin/updatedb -f ...
26830 root      39  19  3868  788  524 D  0.0  0.0   0:00.02 /usr/bin/updatedb -f ...
27192 root      39  19  3868  768  508 D  0.0  0.0   0:00.02 /usr/bin/updatedb -f ...
28528 root      39  19  3868  796  524 D  0.0  0.0   0:00.02 /usr/bin/updatedb -f ...
29751 root      39  19  3868  768  508 D  0.0  0.0   0:00.02 /usr/bin/updatedb -f ...
32080 root      39  19  3868  772  508 D  0.0  0.0   0:00.02 /usr/bin/updatedb -f ...
32386 root      39  19  3868  792  524 D  0.0  0.0   0:00.02 /usr/bin/updatedb -f ...
32649 root      39  19  3868  792  524 D  0.0  0.0   0:00.02 /usr/bin/updatedb -f ...

There are 19 of these processes on the machine. Although they aren't actually doing anything, their process state is what worries me: each task in uninterruptible sleep counts towards the load average, so 19 of them account for the load of roughly 19 almost exactly, far more than this box has CPU threads. I decided on a kill -9 approach to remove them from the system; at worst it would leave me with a stale index for locate.
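
The kill itself was nothing sophisticated, something along these lines (reconstructed rather than copied from the session, and relying on the stuck processes eventually acting on the signal once they leave their uninterruptible sleep):

# ps -C updatedb -o pid= | xargs -r kill -9

And sure enough the load average came down: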

# uptime
 09:18:22 up 28 days, 13:03,  2 users,  load average: 0.53, 3.72, 11.54

RESULT!

Oh, I should have said that updatedb is part of the findutils-locate package. updatedb indexes your disks, allowing you to run "locate -i" to find a file without running a find across "/". The call to updatedb is scheduled in /etc/cron.daily/mlocate.cron. I didn't spend the time to find out why the processes got stuck.
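
Had I wanted to find out before killing them, the wait channel is usually the first hint as to where in the kernel a D-state process is blocked. For example, using one of the PIDs from the top output above (the trailing echo just adds the newline the wchan file lacks):

# cat /proc/1535/wchan; echo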

An introduction to Policy Managed Databases in 11.2 RAC

I just realised this week that I haven't really detailed anything about policy managed RAC databases. I remember having done some research about server pools way back when 11.2.0.1 came out, and I promised to spend some time looking at the new type of database that comes with server pools, the policy managed database, but somehow never got around to it. Since I'm lazy I'll refer to these databases as PMDs from now on, as it saves a fair bit of typing.

So how are PMDs different from Administrator Managed Databases?

First of all, PMDs are available with RAC only, i.e. in a multi-instance active/active configuration. Before 11.2, every RAC database was administrator managed: you had to tie each Oracle instance to a specific cluster node, which is why you see instance prefixes in a RAC spfile. Here is an example from my lab 11.2.0.3.6 cluster:

DEMO1.__db_cache_size=1073741824
[...]
DEMO2.__streams_pool_size=0
*.audit_file_dest='/u01/app/oracle/admin/DEMO/adump'
*.audit_trail='db'
*.cluster_database=true
*.compatible='11.2.0.0.0'
*.control_files='...'
*.db_block_size=8192
*.db_create_file_dest='+DATA'
*.db_domain=''
*.db_name='DEMO'
*.db_recovery_file_dest='+RECO'
*.db_recovery_file_dest_size=4558159872
*.diagnostic_dest='/u01/app/oracle'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=DEMOXDB)'
DEMO2.instance_number=2
DEMO1.instance_number=1
*.nls_language='ENGLISH'
*.nls_territory='UNITED KINGDOM'
*.open_cursors=300
*.pga_aggregate_target=310378496
*.processes=300
*.remote_listener='rac11gr2scan.example.com:1521'
*.remote_login_passwordfile='exclusive'
*.sessions=335
*.sga_target=1610612736
DEMO1.thread=1
DEMO2.thread=2
DEMO1.undo_tablespace='UNDOTBS2'
DEMO2.undo_tablespace='UNDOTBS1'

Note that instance_number, thread and the undo tablespace are manually (i.e. administrator) managed. If these aren't set, or are configured incorrectly, you will run into all sorts of fun. Continue reading