General principles of QEMU-KVM operation

My current understanding:

1) KVM

KVM (Kernel-based Virtual Machine) is a hypervisor (VMM, Virtual Machine Monitor) that runs as a Linux kernel module. A hypervisor lets software run in a virtual environment while hiding from that software the real physical hardware it runs on. The hypervisor acts as a "layer" between the physical hardware (host) and the virtual OS (guest).

Since KVM is a standard Linux kernel module, it gets the kernel's facilities (memory management, the scheduler, etc.) for free. The guests benefit in turn, because they run on a hypervisor that is part of the Linux kernel.

KVM is very fast, but by itself it is not enough to run a virtual OS, because that also requires device emulation. For the devices the guest sees (disks, network, video, PCI, USB, serial ports, etc.), KVM relies on QEMU.
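A quick way to see that KVM really is a kernel facility is to ask it for its API version through the /dev/kvm device node. The sketch below assumes a Linux host; the constant 0xAE00 is the value of KVM_GET_API_VERSION from <linux/kvm.h>, and the helper name kvm_api_version is made up for this example. On a machine without KVM (or without permission to open the device) it simply returns None.

```python
import fcntl
import os

# _IO(KVMIO, 0x00) from <linux/kvm.h>, where KVMIO is 0xAE.
KVM_GET_API_VERSION = 0xAE00


def kvm_api_version(path="/dev/kvm"):
    """Return the KVM API version as an int, or None if KVM is unavailable.

    `kvm_api_version` is a hypothetical helper name used for illustration.
    """
    try:
        fd = os.open(path, os.O_RDWR | os.O_CLOEXEC)
    except OSError:
        # No KVM module loaded, no device node, or no permission.
        return None
    try:
        # The ioctl takes no argument and returns the version number.
        return fcntl.ioctl(fd, KVM_GET_API_VERSION)
    except OSError:
        # The path exists but is not a KVM device.
        return None
    finally:
        os.close(fd)


if __name__ == "__main__":
    version = kvm_api_version()
    if version is None:
        print("KVM is not available on this host")
    else:
        print("KVM API version:", version)
```

The same file descriptor is what a userspace program such as QEMU would then use to create virtual machines and virtual CPUs.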

2) QEMU

QEMU (Quick Emulator) is an emulator that lets you run operating systems built for one architecture on another (for example, ARM -> x86). Besides the processor, QEMU emulates various peripheral devices: network cards, hard disks, video cards, PCI, USB, etc.

It works like this:

Guest instructions (for example, ARM) are converted by TCG (Tiny Code Generator) into intermediate, platform-independent code, and this intermediate code is then converted into instructions for the host (for example, x86).

ARM -> intermediate code -> x86
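The two-stage idea can be shown with a toy translator. This is only a sketch: the instruction tuples, the add_i32 op name, and the register mapping are invented for illustration and are far simpler than what TCG really does.

```python
# Toy model only: instruction formats, IR op names, and the register
# mapping are made up for illustration; this is not real TCG.

# Hypothetical fixed mapping from guest (ARM-like) to host (x86-like) registers.
REG_MAP = {"r0": "eax", "r1": "ebx", "r2": "ecx"}


def guest_to_ir(insn):
    """Stage 1: ARM-like three-operand instruction -> neutral IR ops."""
    op, dst, a, b = insn
    if op != "ADD":
        raise ValueError(f"unsupported guest instruction: {op}")
    return [("add_i32", dst, a, b)]


def ir_to_host(ir_op):
    """Stage 2: neutral IR op -> x86-like two-operand instructions."""
    name, dst, a, b = ir_op
    if name != "add_i32":
        raise ValueError(f"unsupported IR op: {name}")
    d, x, y = REG_MAP[dst], REG_MAP[a], REG_MAP[b]
    if d == y:
        # Addition commutes, so swap the inputs rather than overwrite one.
        x, y = y, x
    out = []
    if d != x:
        out.append(f"mov {d}, {x}")
    out.append(f"add {d}, {y}")
    return out


def translate(block):
    """Translate a list of guest instructions into host instructions."""
    host = []
    for insn in block:
        for ir_op in guest_to_ir(insn):
            host.extend(ir_to_host(ir_op))
    return host


if __name__ == "__main__":
    # ADD r0, r1, r2  (r0 = r1 + r2)
    for line in translate([("ADD", "r0", "r1", "r2")]):
        print(line)
```

Keeping the middle representation neutral is what lets one front end per guest architecture be combined with one back end per host architecture.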

In fact, QEMU can run virtual machines on any host, even on older processors that do not support Intel VT-x (Intel Virtualization Technology) / AMD SVM (AMD Secure Virtual Machine). In that case, however, it runs very slowly, because the guest code has to be translated on the fly in two steps by TCG (TCG is a just-in-time compiler).

I.e., QEMU by itself is mega cool, but it is very slow.
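Whether a given x86 Linux host has these extensions can be read from the CPU flags in /proc/cpuinfo: vmx indicates Intel VT-x and svm indicates AMD SVM. A minimal sketch, with hypothetical helper names, that works on the file's text so it can also be tried on a sample string:

```python
def hw_virt_support(cpuinfo_text):
    """Return 'Intel VT-x', 'AMD SVM', or None from /proc/cpuinfo text.

    Assumes the x86 Linux layout, where each CPU has a 'flags' line.
    """
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags = set(value.split())
            if "vmx" in flags:
                return "Intel VT-x"
            if "svm" in flags:
                return "AMD SVM"
    return None


def read_cpuinfo(path="/proc/cpuinfo"):
    """Return the contents of /proc/cpuinfo, or '' if it cannot be read."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return ""


if __name__ == "__main__":
    support = hw_virt_support(read_cpuinfo())
    print(support or "no hardware virtualization flags found")
```

Note that the flag can be absent even on a capable CPU if virtualization is disabled in the firmware settings.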

3) Protection rings

Program code does not simply run on the processor as-is; it runs at one of several privilege levels (protection rings) with different degrees of access to data, from the most privileged (Ring 0) to the most restricted and regulated, "with tightened screws" (Ring 3).

The operating system (OS kernel) runs in Ring 0 (kernel mode) and can do whatever it wants with any data and devices. User applications run in Ring 3 (user mode) and cannot do whatever they want; instead, they must ask the kernel each time they need to perform a privileged operation (so user applications have access only to their own data and cannot "get" into "someone else's sandbox"). Rings 1 and 2 were intended for drivers, although mainstream operating systems rarely use them.

Before Intel VT-x / AMD SVM, hypervisors ran in Ring 0 and guests ran in Ring 1. Since Ring 1 does not have enough privileges for an OS to function properly, the hypervisor had to intercept every privileged call from the guest, modify it on the fly, and execute it in Ring 0 (much like QEMU does). I.e., the guest's privileged code was NOT executed directly on the processor; each time it went through several intermediate modifications on the fly.
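The interception described above can be pictured with a toy model. Everything here is an illustrative assumption: the instruction strings, the tiny set of privileged instructions, and the ToyHypervisor class are invented, and a real pre-VT-x hypervisor was vastly more involved.

```python
# Toy model only: a real hypervisor works on machine code, not strings.

# Tiny illustrative subset of x86 instructions a deprivileged guest
# kernel cannot execute itself.
PRIVILEGED = {"cli", "sti", "hlt"}


class ToyHypervisor:
    """Holds the guest's virtual CPU state and emulates privileged ops."""

    def __init__(self):
        self.virtual_if = True   # the guest's *virtual* interrupt flag
        self.halted = False
        self.traps = 0           # how many times the guest was intercepted

    def emulate(self, insn):
        # Apply the instruction's effect to virtual state, not to the host.
        self.traps += 1
        if insn == "cli":
            self.virtual_if = False
        elif insn == "sti":
            self.virtual_if = True
        elif insn == "hlt":
            self.halted = True


def run_guest(code, hv):
    """Run guest code; return the instructions that ran without a trap."""
    executed_directly = []
    for insn in code:
        if hv.halted:
            break
        if insn in PRIVILEGED:
            hv.emulate(insn)                 # expensive detour
        else:
            executed_directly.append(insn)   # cheap path
    return executed_directly


if __name__ == "__main__":
    hv = ToyHypervisor()
    direct = run_guest(["mov", "cli", "add", "sti", "hlt", "sub"], hv)
    print("ran directly:", direct, "| traps:", hv.traps)
```

Every entry in the traps counter stands for a detour through the hypervisor, which is where the overhead came from.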

The overhead was significant and it was a big problem, so processor manufacturers, independently of each other, released instruction set extensions (Intel VT-x / AMD SVM) that allow guest OS code to be executed DIRECTLY on the host processor, bypassing the expensive intermediate steps that were needed before.

With the advent of Intel VT-x / AMD SVM, a special new privilege level appeared, informally called Ring -1 (minus one). The hypervisor now runs there, while the guests run in Ring 0 and get privileged access to the CPU.

I.e., in the end:

  • the host runs in Ring 0
  • the guests run in Ring 0
  • the hypervisor runs in Ring -1

4) QEMU-KVM

KVM gives guests access to Ring 0 and uses QEMU to emulate the devices (disks, network, video, PCI, USB, serial ports, etc.) that guests "see" and work with.

Hence QEMU-KVM (or KVM-QEMU) :)
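In practice the pairing shows up on the QEMU command line: the same emulator is started with either the KVM accelerator or the pure-software TCG one. The sketch below only builds the argument list and does not launch anything; the helper name qemu_argv and the disk image name are assumptions for illustration, while -accel, -m and -drive are regular qemu-system-x86_64 options.

```python
import os


def qemu_argv(disk_image, memory_mb=1024, kvm_path="/dev/kvm"):
    """Build a qemu-system-x86_64 command line (hypothetical helper).

    Picks the KVM accelerator when the KVM device node is usable and
    falls back to TCG (software translation) otherwise.
    """
    use_kvm = os.access(kvm_path, os.R_OK | os.W_OK)
    accel = "kvm" if use_kvm else "tcg"
    return [
        "qemu-system-x86_64",
        "-accel", accel,                # CPU: hardware-assisted or emulated
        "-m", str(memory_mb),           # guest RAM in megabytes
        "-drive", f"file={disk_image},format=qcow2",  # emulated disk
    ]


if __name__ == "__main__":
    # "guest.qcow2" is a placeholder image name.
    print(" ".join(qemu_argv("guest.qcow2")))
```

Either way the guest sees the same emulated devices; only the way its CPU instructions are executed changes.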

P.S. The text of this article was originally published in the Telegram channel @RU_Voip as an answer to a question from one of the channel's members.

Let me know in the comments where I have misunderstood the topic or if there is anything to add.

Thank you!

Source: habr.com
