Kernel UEK 2 on Oracle Linux 6.2 fixed lab server memory loss

A few days ago I wrote about my new lab server and my misfortune with kernel UEK (aka 2.6.32 plus backports). It simply wouldn't recognise all of the memory in the server:

# free -m
             total       used       free     shared    buffers     cached
Mem:          3385        426       2958          0          9        233
-/+ buffers/cache:        184       3200
Swap:          511          0        511
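In case anyone wants to reproduce the diagnosis, the quickest check is to compare what the kernel has actually mapped against what the firmware reports for the installed DIMMs. A minimal sketch (the dmidecode step needs root and real hardware, so it is only shown as a comment):

```shell
# What the kernel has actually mapped (should be ~32 GB here, not 3.3):
awk '/^MemTotal:/ {printf "kernel sees: %.1f GB\n", $2 / 1024 / 1024}' /proc/meminfo

# What the DIMMs report via SMBIOS -- run as root:
# dmidecode -t memory | grep -i 'Size:'
```

If the two disagree, the hardware is fine and the kernel is the prime suspect.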

Ouch. Today I gave it another go, especially since my new M4 SSD had arrived. My first idea was to upgrade to UEK2, and indeed, following the instructions on Wim Coekaerts's blog (see references), it worked:

[root@ol62 ~]# uname -a
Linux ol62.localdomain 2.6.39-100.7.1.el6uek.x86_64 #1 SMP Wed May 16 04:04:37 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
[root@ol62 ~]# free -m
             total       used       free     shared    buffers     cached
Mem:         32221        495      31725          0          5         34
-/+ buffers/cache:        456      31764
Swap:          511          0        511

Note the 2.6.39-100.7.1! UEK2 is actually based on a mainline 3.x kernel, but to preserve compatibility with the many tools that parse the kernel version as a three-part number, Oracle decided to stick with 2.6.39. Then again, the big distributions don't really follow the mainline kernel numbering anyway.
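The three-part concern is easy to illustrate: a script that splits the release string on dots still gets a sensible major.minor.patch out of 2.6.39, which it would not from a short 3.0-style string. A quick sketch, using the release string from the output above rather than calling `uname -r` live:

```shell
# Split a kernel release string into the classic three-part tuple.
release="2.6.39-100.7.1.el6uek.x86_64"   # in practice: release="$(uname -r)"

IFS=. read -r major minor patch <<EOF
$release
EOF
patch=${patch%%-*}    # drop the "-100.7.1.el6uek..." build suffix

echo "$major.$minor.$patch"    # prints 2.6.39
```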

Now, can anyone tell me whether UEK2 is out of beta? I know it isn't supported for the database yet, but it's a cool kernel release, and I can finally play around with the perf utility that Kevin Closson and Frits Hoogland have written so much about recently.
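For reference, the upgrade itself boiled down to enabling a UEK2 yum channel and installing the new kernel. The stanza below is only a sketch: the repo id `ol6_UEK_latest` and the public-yum URL are my assumptions based on the public-yum-ol6.repo layout of the time, so treat Wim's post as authoritative:

```
[ol6_UEK_latest]
name=Latest Unbreakable Enterprise Kernel for Oracle Linux 6 ($basearch)
baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL6/UEK/latest/$basearch/
gpgkey=http://public-yum.oracle.com/RPM-GPG-KEY-oracle-ol6
gpgcheck=1
enabled=1
```

With the channel enabled, a yum install of kernel-uek plus a reboot should land you on the 2.6.39-100 kernel shown above.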

This is a lot more like it:

top - 13:57:56 up  1:14,  5 users,  load average: 0.22, 0.55, 0.96
Tasks: 240 total,   2 running, 238 sleeping,   0 stopped,   0 zombie
Cpu0  : 31.8%us,  5.3%sy,  0.0%ni, 62.3%id,  0.3%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu1  :  1.8%us,  0.6%sy,  0.0%ni, 97.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.0%us,  0.3%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  6.7%us,  4.8%sy,  0.0%ni, 88.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :  0.6%us,  0.0%sy,  0.0%ni, 99.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  2.8%us,  2.0%sy,  0.0%ni, 95.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  0.3%us,  0.3%sy,  0.0%ni, 99.0%id,  0.3%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu8  :  0.4%us,  0.0%sy,  0.0%ni, 99.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu9  :  3.3%us,  6.0%sy,  0.0%ni, 90.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu10 :  0.3%us,  0.7%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu11 :  0.5%us,  0.5%sy,  0.0%ni, 99.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu12 : 14.7%us,  1.5%sy,  0.0%ni, 83.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu13 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu14 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu15 :  3.9%us,  5.7%sy,  0.0%ni, 90.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu16 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu17 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu18 :  2.7%us,  1.6%sy,  0.0%ni, 95.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu19 :  1.7%us,  0.7%sy,  0.0%ni, 97.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu20 :  0.0%us,  0.2%sy,  0.0%ni, 99.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu21 :  6.3%us,  4.4%sy,  0.0%ni, 89.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu22 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu23 :  3.9%us,  0.4%sy,  0.0%ni, 95.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  32994384k total, 17078956k used, 15915428k free,   115172k buffers
Swap:   524284k total,        0k used,   524284k free, 14839688k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
14908 oracle    20   0 10.2g 113m 106m S 29.6  0.4   0:00.89 oracleorcl (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
...

And while I'm at it, here is the output from collectl during database creation for a SLOB run on my SSD (sda). At this point it was creating the 4 GB online redo logs.

### RECORD   35 >>> ol62 <<< (1339593441.001)  ###

# SINGLE CPU[HYPER] STATISTICS
#   Cpu  User Nice  Sys Wait IRQ  Soft Steal Idle
      0     0    0    0    0    0    0     0  100
      1     0    0    0    0    0    0     0  100
      2     0    0    0    0    0    0     0  100
      3     0    0    0    0    0    0     0    0
      4     0    0    0    0    0    0     0    0
      5     0    0    0    0    0    0     0    0
      6     0    0    0    0    0    0     0   98
      7     0    0    0    0    0    0     0    0
      8     0    0    0    0    0    0     0  100
      9     0    0    0    0    0    0     0    0
     10     0    0    0    0    0    0     0    0
     11     0    0    0    0    0    0     0    0
     12     0    0    1    0    0    0     0   99
     13     3    0    3    0    0    0     0   92
     14     0    0    0    0    0    0     0  100
     15     0    0    0    0    0    0     0  100
     16     0    0    0    0    0    0     0  100
     17     0    0    0    0    0    0     0    0
     18     0    0    0    0    0    0     0    0
     19     0    0    0    0    0    0     0    0
     20     0    0    0    0    0    0     0    0
     21     0    0    0    0    0    0     0  100
     22     0    0    0    0    0    0     0    0
     23     0    0    0    0    0    0     0    0

# DISK STATISTICS (/sec)
#            <---------reads---------><---------writes---------><--------averages--------> Pct
#Name       KBytes Merged  IOs Size  KBytes Merged  IOs Size  RWSize  QLen  Wait SvcTim Util
sda              0      0    0    0  189575      0  554  342     342     2     4      1   95
sdb              0      0    0    0       0      0    0    0       0     0     0      0    0
sdd              0      0    0    0       0      0    0    0       0     0     0      0    0
sde              0      0    0    0       0      0    0    0       0     0     0      0    0
sdc              0      0    0    0       0      0    0    0       0     0     0      0    0
sdf              0      0    0    0       0      0    0    0       0     0     0      0    0
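Those sda numbers are internally consistent, by the way: the write throughput divided by the write IOPS gives exactly the 342 KB average request size collectl reports in the RWSize column, and the absolute rate works out to roughly 185 MB/s of redo streaming to the SSD:

```shell
# Cross-check the sda line: 189,575 KB/s written in 554 IOs/s.
kb_per_sec=189575
writes_per_sec=554

echo "avg request size: $(( kb_per_sec / writes_per_sec )) KB"   # 342 KB
echo "write throughput: $(( kb_per_sec / 1024 )) MB/s"           # 185 MB/s
```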

References

Responses

  2. Martin, do you know if the transparent huge pages feature from 2.6.38 is going to be ported back to 2.6.32? Also, will Oracle be able to use it automatically? In 11g there is a parameter, use_large_pages, which doesn't make sense unless the RDBMS wants to take advantage of huge pages and smaller page tables.
    Also, when I ran strace on an Oracle server process, I noticed that io_submit (the Linux-specific AIO implementation) will only submit requests of 1 MB in size, regardless of what I specify in db_file_multiblock_read_count. Is there any way to change this given, for instance, that 11g ASM can use I/Os of up to 4 MB in size?

    1. Hi Mladen!

      Let me try and answer your questions.

      Transparent huge pages: I don't have any information on whether they are going to be back-ported to UEK 1. But since UEK 2 has them, and 11.2.0.3 is supported on UEK 2, I wouldn't bet on a back-port. See here for more information about UEK2: http://oss.oracle.com/ol6/docs/RELEASE-NOTES-UEK2-en.html.

      Actually I don’t even know if Oracle is currently making use of transparent huge pages in its code, although I very much like the idea.

      You are very lucky to see AIO requests of 1 MB; that is indeed the upper limit. dm-multipath can break these down into 256/512k chunks, by the way; I know that EMC PowerPath doesn't. Out of interest, where did you read that ASM can do 4 MB?

      Martin

  3. UEKR2 has been out of beta since March 13th: https://blogs.oracle.com/linux/entry/oracle_unbreakable_enterprise_kernel_release

    Unfortunately transparent huge pages cannot be used for the Oracle SGA, as it uses shared memory. See this thread on the Oracle Linux Forums for more information: https://forums.oracle.com/forums/thread.jspa?threadID=2393003
