
Tuning KVM with hugepages August 4, 2011

Posted by krustynet in IT.

KVM (Kernel-based Virtual Machine) is becoming one of the main virtualization solutions because of its simplicity, performance and reliability.

It is part of almost every Linux distro around today, and since its acquisition by Red Hat it has been growing fast as the main competitor to VMware.

Instead of explaining how to install and configure the basics, a topic that is already widely documented on the internet, I decided to write about how to squeeze the maximum performance out of it.

From the host's point of view a KVM virtual machine is an ordinary process that follows the normal scheduling and memory allocation mechanisms. In the enterprise, though, a guest may allocate plenty of memory and use a lot of CPU, which can lead to performance problems.

By default Linux uses 4 KB memory pages, which means that a guest with 16 GB of memory is backed by around 4 million pages.

And here we have the problem. As you probably know, each process believes it is alone on the system and uses its own address space. To let several processes run at the same time, the operating system creates a virtual memory space for each of them and maps the virtual addresses the process sees to physical memory.

Each time a process needs to read or write memory, a translation from virtual to physical address takes place. To speed this up, modern CPUs have a so-called TLB (translation lookaside buffer) that caches the most recently used page translations. If the needed address is in the TLB you have a TLB hit: the physical address is returned and the process goes on. If the page is not in the TLB (which, by the way, isn't that big) you have a miss, and a so-called page walk occurs.
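To make the hit/miss behaviour a bit more concrete, here is a toy simulation in Python. It is only a sketch under loud assumptions: the TLB is modelled as a fully associative LRU cache with 64 entries (real TLBs are multi-level and set-associative, and their sizes vary by CPU), and the function name and access pattern are made up for illustration.

```python
from collections import OrderedDict


def tlb_miss_rate(addresses, page_size, tlb_entries):
    """Simulate a fully associative LRU TLB and return the miss rate."""
    tlb = OrderedDict()  # virtual page number -> placeholder for the cached translation
    misses = 0
    total = 0
    for addr in addresses:
        total += 1
        vpn = addr // page_size  # virtual page number of this access
        if vpn in tlb:
            tlb.move_to_end(vpn)  # hit: mark as most recently used
        else:
            misses += 1  # miss: a page walk would happen here
            tlb[vpn] = True
            if len(tlb) > tlb_entries:
                tlb.popitem(last=False)  # evict the least recently used entry
    return misses / total if total else 0.0


# Touch one byte every 4 KiB across 64 MiB, twice in a row.
KIB, MIB = 1024, 1024 * 1024
accesses = [offset for _ in range(2) for offset in range(0, 64 * MIB, 4 * KIB)]

small = tlb_miss_rate(accesses, page_size=4 * KIB, tlb_entries=64)
huge = tlb_miss_rate(accesses, page_size=2 * MIB, tlb_entries=64)
print(f"4 KiB pages: {small:.2%} misses, 2 MiB pages: {huge:.2%} misses")
```

With this access pattern every access misses with 4 KiB pages, because the 16384 pages never fit in 64 entries, while with 2 MiB pages only the first touch of each of the 32 pages misses.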

A page walk is really expensive in terms of CPU time, so it can hurt performance.

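In the earlier example of a guest with 16 GB of RAM, the TLB can hold only a tiny fraction of the roughly 4 million page translations, so the first access to any of the remaining pages costs a page walk.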

How can we reduce TLB misses? Clearly, by using bigger memory pages.

Linux, as an enterprise operating system, offers so-called HugePages: when enabled, they provide 2 MB memory pages (on x86_64) instead of the usual 4 KB ones.

So with 2 MB pages the 16 GB guest needs about 8,000 pages (8,192 to be exact) instead of 4 million, which greatly reduces TLB misses and can improve performance for memory-intensive guests.
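The page counts quoted above are easy to check. A minimal sketch, assuming binary units (16 GB meaning 16 GiB) and a hypothetical helper name:

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def pages_needed(memory_bytes, page_bytes):
    """Number of pages needed to cover memory_bytes, rounded up."""
    return -(-memory_bytes // page_bytes)  # ceiling division


guest = 16 * GIB
print(pages_needed(guest, 4 * KIB))  # 4194304
print(pages_needed(guest, 2 * MIB))  # 8192
```

That is a factor of 512 fewer translations for the same amount of guest memory.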

So after this introduction we can actually implement it.

I personally used Ubuntu 11.04, but basically it should work on all recent distros.

Reserve HugePages

As root, add the following lines to the /etc/sysctl.conf file:

###### HugePages #########
vm.nr_hugepages = 1000

This will reserve 1000 huge pages:

1000 * 2 MB = 2000 MB (about 2 GB)

If you need more, as in our 16 GB example:

16384 MB / 2 MB = 8192

so:

###### HugePages #########
vm.nr_hugepages = 8192
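If you run several guests, the reservation arithmetic can be scripted. This is just a sketch: the function names are hypothetical, guest sizes are given in KiB (the unit libvirt uses in its memory element), the 2048 KiB page size is the x86_64 default, and the extra 1 GiB guest is only there for illustration. Counted in binary units, a 16 GiB guest on its own works out to 8192 pages.

```python
def hugepages_for_guests(guest_memory_kib, hugepage_kib=2048):
    """Total huge pages needed to back every guest, rounding each guest up."""
    return sum(-(-mem // hugepage_kib) for mem in guest_memory_kib)


def sysctl_line(pages):
    """Format the setting as it would appear in /etc/sysctl.conf."""
    return f"vm.nr_hugepages = {pages}"


# One 16 GiB guest plus one 1 GiB guest, sizes in KiB.
guests = [16 * 1024 * 1024, 1024 * 1024]
print(sysctl_line(hugepages_for_guests(guests)))  # vm.nr_hugepages = 8704
```

Keep in mind that reserved huge pages are no longer available to the rest of the host, so don't reserve more than the guests really need.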

At this point reboot the system, log in as root and run:

cat /proc/meminfo

With vm.nr_hugepages = 1000 you should see:

HugePages_Total:    1000
HugePages_Free:     1000

This means that you now have 1000 pages of 2 MB each reserved.

Mount the Memory !?!?

Create a folder and mount hugetlbfs on it:

mkdir /HugePageKvm

mount -t hugetlbfs hugetlbfs /HugePageKvm

Check that it's working:

mount

You should see:

hugetlbfs on /HugePageKvm type hugetlbfs (rw)
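The same check can be done from a script by reading /proc/mounts, which lists one mount per line as device, mount point, filesystem type and options. A small sketch with a hypothetical function name and made-up sample input:

```python
def hugetlbfs_mount_points(mounts_text):
    """Return mount points of type hugetlbfs from /proc/mounts-style text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, filesystem type, options, dump, pass.
        if len(fields) >= 3 and fields[2] == "hugetlbfs":
            points.append(fields[1])
    return points


sample = (
    "proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0\n"
    "hugetlbfs /HugePageKvm hugetlbfs rw,relatime 0 0\n"
)
print(hugetlbfs_mount_points(sample))  # ['/HugePageKvm']

# On a live host, read the real table instead:
# with open("/proc/mounts") as f:
#     print(hugetlbfs_mount_points(f.read()))
```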

Add the following line to /etc/fstab so that the mount is done automatically after a reboot:

hugetlbfs       /HugePageKvm  hugetlbfs       defaults        0 0

Uninstall AppArmor (I got an error otherwise; any suggestion is welcome):

apt-get purge apparmor

Modify the guest XML file

Add the following lines:

<memoryBacking>
    <hugepages/>
</memoryBacking>

The file should look similar to the following:

<domain type='kvm'>
  <name>centos-6</name>
  <uuid>62cf9ccd-1964-6b06-9546-b8ce178dccd5</uuid>
  <memory>1048576</memory>
  <currentMemory>1048576</currentMemory>
  <memoryBacking>
    <hugepages/>
  </memoryBacking>
  <vcpu cpuset='1'>1</vcpu>
  <os>
    <type arch='x86_64' machine='pc-0.14'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/bin/kvm</emulator>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source dev='/dev/VMFS/centos-6'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </disk>
    <disk type='block' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <target dev='hdc' bus='ide'/>
      <readonly/>
      <address type='drive' controller='0' bus='1' unit='0'/>
    </disk>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source dev='/dev/VMFS/Shared'/>
      <target dev='vdb' bus='virtio'/>
      <shareable/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </disk>
    <controller type='ide' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <interface type='network'>
      <mac address='52:54:00:83:96:3d'/>
      <source network='default'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <input type='tablet' bus='usb'/>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes'/>
    <sound model='ac97'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </sound>
    <video>
      <model type='cirrus' vram='9216' heads='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </memballoon>
  </devices>
</domain>
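If you have many guests to convert, the memoryBacking change can also be scripted instead of edited by hand. The following is only a sketch using Python's standard xml.etree.ElementTree: the function name is hypothetical, the input is a stripped-down domain definition rather than the full file, and placing the new element right after the memory elements simply mirrors the listing above. You could feed it the output of virsh dumpxml and load the result back with virsh define.

```python
import xml.etree.ElementTree as ET


def enable_hugepages(domain_xml):
    """Return domain XML with <memoryBacking><hugepages/></memoryBacking> present."""
    root = ET.fromstring(domain_xml)
    backing = root.find("memoryBacking")
    if backing is None:
        backing = ET.Element("memoryBacking")
        # Place it right after <currentMemory> (or <memory>) when present,
        # otherwise append it as the last child of <domain>.
        children = list(root)
        position = len(children)
        for index, child in enumerate(children):
            if child.tag in ("memory", "currentMemory"):
                position = index + 1
        root.insert(position, backing)
    if backing.find("hugepages") is None:
        ET.SubElement(backing, "hugepages")
    return ET.tostring(root, encoding="unicode")


minimal = "<domain type='kvm'><name>centos-6</name><memory>1048576</memory></domain>"
print(enable_hugepages(minimal))
```

Running it on XML that already has the element leaves the definition unchanged, so it is safe to apply twice.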

 

Start the Virtual Machine

In my case:

virsh start centos-6

Check whether the machine is using HugePages:

cat /proc/meminfo

HugePages_Total:    1000
HugePages_Free:      480

You should see that the number of free pages has dropped; in my case the guest is using 520 huge pages.
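To avoid doing the subtraction by eye, the counters can be pulled out of /proc/meminfo with a few lines of Python. A minimal sketch with a hypothetical function name; the sample text reuses the numbers above plus the Hugepagesize line that /proc/meminfo also reports, assumed here to be 2048 kB.

```python
import re


def hugepage_usage(meminfo_text):
    """Return (pages in use, MiB in use) from /proc/meminfo-style text."""
    fields = {}
    for name in ("HugePages_Total", "HugePages_Free", "Hugepagesize"):
        match = re.search(rf"^{name}:\s+(\d+)", meminfo_text, re.MULTILINE)
        if match is None:
            raise ValueError(f"{name} not found")
        fields[name] = int(match.group(1))
    used = fields["HugePages_Total"] - fields["HugePages_Free"]
    used_mib = used * fields["Hugepagesize"] // 1024  # Hugepagesize is reported in kB
    return used, used_mib


sample = (
    "HugePages_Total:    1000\n"
    "HugePages_Free:      480\n"
    "Hugepagesize:       2048 kB\n"
)
print(hugepage_usage(sample))  # (520, 1040)
```

For comparison, the 1048576 KiB configured for the guest above accounts for 512 of those pages.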

 

Any suggestions or questions are welcome.

Some Benchmark

 

 
