Brief overview and setup of Kata Containers

This article discusses how Kata Containers works, with a practical part on hooking it up to Docker.

Common problems with Docker and their solutions have already been written about; today I will briefly describe the Kata Containers implementation. Kata Containers is a secure container runtime based on lightweight virtual machines. Working with them is the same as with other containers, but they add more reliable isolation using hardware virtualization technology. The project began in 2017, when the community of the same name merged the best ideas from Intel Clear Containers and Hyper.sh runV; since then, work has continued on support for various architectures, including AMD64, ARM, and IBM p- and z-series. Kata also supports running on top of the QEMU and Firecracker hypervisors and integrates with containerd. The code is available on GitHub under the Apache 2.0 license.

Main Features

  • Each container runs with its own kernel, providing network, memory, and I/O isolation; hardware-enforced isolation based on virtualization extensions can be required
  • Support for industry standards, including the OCI container format and the Kubernetes CRI
  • Performance consistent with regular Linux containers; increased isolation without the performance overhead of full VMs
  • No need to run containers inside full-fledged virtual machines; generic interfaces simplify integration and launch

Installation

There are many installation options; I will cover installing from the repositories on CentOS 7.
Important: Kata Containers is supported only on bare-metal hardware (nested virtualization does not always work), and the processor must support SSE4.1.
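Before installing, it is worth checking these prerequisites on the host. A minimal sketch that inspects the CPU flags in /proc/cpuinfo (vmx for Intel VT-x, svm for AMD-V, plus sse4_1):

```shell
# Quick host prerequisite check for Kata Containers: looks for the
# hardware-virtualization and SSE4.1 flags in /proc/cpuinfo.
check_kata_prereqs() {
  if grep -q -E 'vmx|svm' /proc/cpuinfo; then
    echo "hardware virtualization: supported"
  else
    echo "hardware virtualization: NOT supported"
  fi
  if grep -q 'sse4_1' /proc/cpuinfo; then
    echo "sse4.1: supported"
  else
    echo "sse4.1: NOT supported"
  fi
}

check_kata_prereqs
```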

Installing Kata Containers is quite simple:

Install utilities for working with repositories:

# yum -y install yum-utils

Disable SELinux (properly configuring it would be more correct, but I disable it for simplicity):

# setenforce 0
# sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config

Add the repository and install the packages:

# source /etc/os-release
# ARCH=$(arch)
# BRANCH="${BRANCH:-stable-1.10}"
# yum-config-manager --add-repo "http://download.opensuse.org/repositories/home:/katacontainers:/releases:/${ARCH}:/${BRANCH}/CentOS_${VERSION_ID}/home:katacontainers:releases:${ARCH}:${BRANCH}.repo"
# yum -y install kata-runtime kata-proxy kata-shim
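To confirm the installation, kata-runtime ships a self-check subcommand (`kata-runtime kata-check`, which may need root and reports whether the host can run Kata). A sketch, guarded so it is harmless on hosts where the packages are absent:

```shell
# Verify the Kata installation: print the version and run the
# built-in host capability check if the binary is present.
verify_kata_install() {
  if command -v kata-runtime >/dev/null 2>&1; then
    kata-runtime --version
    kata-runtime kata-check
  else
    echo "kata-runtime not found in PATH"
  fi
}

verify_kata_install
```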

Configuration

I will configure Kata to work with Docker; Docker's installation is standard, so I will not describe it in detail:

# rpm -qa | grep docker
docker-ce-cli-19.03.6-3.el7.x86_64
docker-ce-19.03.6-3.el7.x86_64
# docker -v
Docker version 19.03.6, build 369ce74a3c

Set kata-runtime as the default runtime in daemon.json:

# cat <<EOF > /etc/docker/daemon.json
{
  "default-runtime": "kata-runtime",
  "runtimes": {
    "kata-runtime": {
      "path": "/usr/bin/kata-runtime"
    }
  }
}
EOF

Restart docker:

# service docker restart
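After the restart you can confirm that Docker picked up the new default runtime. A small sketch (the exact `docker info` output format varies across versions; guarded so it is harmless where Docker is absent):

```shell
# Show the runtimes Docker knows about and the current default.
show_docker_runtimes() {
  if command -v docker >/dev/null 2>&1; then
    docker info 2>/dev/null | grep -i 'runtime' \
      || echo "no runtime info available"
  else
    echo "docker not installed"
  fi
}

show_docker_runtimes
```

Note that a single container can also opt in without changing the default, via `docker run --runtime=kata-runtime …`.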

Functional Testing

If you start a container before restarting Docker, you can see that uname reports the kernel version of the host system:

# docker run busybox uname -a
Linux 19efd7188d06 3.10.0-1062.12.1.el7.x86_64 #1 SMP Tue Feb 4 23:02:59 UTC 2020 x86_64 GNU/Linux

After a restart, the kernel version looks like this:

# docker run busybox uname -a
Linux 9dd1f30fe9d4 4.19.86-5.container #1 SMP Sat Feb 22 01:53:14 UTC 2020 x86_64 GNU/Linux
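Another way to see that the container really lives in a VM is to look at the host while it runs: each Kata container gets its own QEMU sandbox, and kata-runtime can list the running sandboxes with the standard OCI-style `list` subcommand. A sketch, guarded for hosts without Kata:

```shell
# Inspect Kata sandboxes from the host: list containers known to
# kata-runtime and look for the backing qemu process.
show_kata_sandboxes() {
  if command -v kata-runtime >/dev/null 2>&1; then
    kata-runtime list
    pgrep -a qemu || echo "no qemu process found"
  else
    echo "kata-runtime not found in PATH"
  fi
}

show_kata_sandboxes
```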

More commands:

# time docker run busybox mount
kataShared on / type 9p (rw,dirsync,nodev,relatime,mmap,access=client,trans=virtio)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev type tmpfs (rw,nosuid,size=65536k,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=666)
sysfs on /sys type sysfs (ro,nosuid,nodev,noexec,relatime)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,relatime,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (ro,nosuid,nodev,noexec,relatime,xattr,name=systemd)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (ro,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/blkio type cgroup (ro,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/memory type cgroup (ro,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/devices type cgroup (ro,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/perf_event type cgroup (ro,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (ro,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/freezer type cgroup (ro,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/pids type cgroup (ro,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/cpuset type cgroup (ro,nosuid,nodev,noexec,relatime,cpuset)
mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime)
shm on /dev/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=65536k)
kataShared on /etc/resolv.conf type 9p (rw,dirsync,nodev,relatime,mmap,access=client,trans=virtio)
kataShared on /etc/hostname type 9p (rw,dirsync,nodev,relatime,mmap,access=client,trans=virtio)
kataShared on /etc/hosts type 9p (rw,dirsync,nodev,relatime,mmap,access=client,trans=virtio)
proc on /proc/bus type proc (ro,relatime)
proc on /proc/fs type proc (ro,relatime)
proc on /proc/irq type proc (ro,relatime)
proc on /proc/sys type proc (ro,relatime)
tmpfs on /proc/acpi type tmpfs (ro,relatime)
tmpfs on /proc/timer_list type tmpfs (rw,nosuid,size=65536k,mode=755)
tmpfs on /sys/firmware type tmpfs (ro,relatime)

real    0m2.381s
user    0m0.066s
sys 0m0.039s

# time docker run busybox free -m
              total        used        free      shared  buff/cache   available
Mem:           1993          30        1962           0           1        1946
Swap:             0           0           0

real    0m3.297s
user    0m0.086s
sys 0m0.050s

Quick load testing

To assess the overhead of virtualization, I run sysbench, taking this option as the basis for the examples.
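The exact invocations are not shown below, so here is a reconstruction based on the output (prime limit 20000 for the CPU test, 100 GiB transferred for the RAM test), using sysbench 1.0's legacy `--test=` syntax. The image name `sysbench-img` is a placeholder assumption; the snippet is guarded so it does nothing without Docker:

```shell
# Hypothetical sysbench runs matching the results shown below;
# "sysbench-img" stands in for any image with sysbench installed.
run_sysbench() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker not installed"
    return 0
  fi
  # CPU test: compute primes up to 20000
  docker run --rm sysbench-img \
    sysbench --test=cpu --cpu-max-prime=20000 run
  # RAM test: transfer 100 GiB (the sysbench default total size)
  docker run --rm sysbench-img \
    sysbench --test=memory --memory-total-size=100G run
}

run_sysbench
```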

Running sysbench using Docker+containerd

Processor test

sysbench 1.0:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time

Prime numbers limit: 20000

Initializing worker threads...

Threads started!

General statistics:
    total time:                          36.7335s
    total number of events:              10000
    total time taken by event execution: 36.7173s
    response time:
         min:                                  3.43ms
         avg:                                  3.67ms
         max:                                  8.34ms
         approx.  95 percentile:               3.79ms

Threads fairness:
    events (avg/stddev):           10000.0000/0.00
    execution time (avg/stddev):   36.7173/0.00

RAM test

sysbench 1.0:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time

Initializing worker threads...

Threads started!

Operations performed: 104857600 (2172673.64 ops/sec)

102400.00 MiB transferred (2121.75 MiB/sec)

General statistics:
    total time:                          48.2620s
    total number of events:              104857600
    total time taken by event execution: 17.4161s
    response time:
         min:                                  0.00ms
         avg:                                  0.00ms
         max:                                  0.17ms
         approx.  95 percentile:               0.00ms

Threads fairness:
    events (avg/stddev):           104857600.0000/0.00
    execution time (avg/stddev):   17.4161/0.00

Running sysbench using Docker+Kata Containers

Processor test

sysbench 1.0:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time

Prime numbers limit: 20000

Initializing worker threads...

Threads started!

General statistics:
    total time:                          36.5747s
    total number of events:              10000
    total time taken by event execution: 36.5594s
    response time:
         min:                                  3.43ms
         avg:                                  3.66ms
         max:                                  4.93ms
         approx.  95 percentile:               3.77ms

Threads fairness:
    events (avg/stddev):           10000.0000/0.00
    execution time (avg/stddev):   36.5594/0.00

RAM test

sysbench 1.0:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time

Initializing worker threads...

Threads started!

Operations performed: 104857600 (2450366.94 ops/sec)

102400.00 MiB transferred (2392.94 MiB/sec)

General statistics:
    total time:                          42.7926s
    total number of events:              104857600
    total time taken by event execution: 16.1512s
    response time:
         min:                                  0.00ms
         avg:                                  0.00ms
         max:                                  0.43ms
         approx.  95 percentile:               0.00ms

Threads fairness:
    events (avg/stddev):           104857600.0000/0.00
    execution time (avg/stddev):   16.1512/0.00

The picture is already clear in principle; ideally the tests should be run several times, with outliers removed and the results averaged, but I will not run further tests for now.

Conclusions

Although such containers take roughly five to ten times longer to start (typical run time for similar commands under containerd is less than a third of a second), they are still quite fast in absolute terms: in the examples above, commands completed in about three seconds on average. And the quick CPU and RAM tests show almost identical results, which is encouraging, especially given that isolation is provided by such a proven mechanism as KVM.

What's next

This article is only an overview, but it gives you a chance to try out an alternative runtime. Many areas of application are not covered: for example, the project site describes running Kubernetes on top of Kata Containers. One could also run a series of tests focused on security issues, resource limits, and other interesting things.

I invite everyone who has read (or scrolled) this far to take part in the survey, which will determine future publications on this topic.


Should I continue to publish articles about Kata Containers?

  • Yes, write more! — 80.0% (28 votes)

  • No, don't… — 20.0% (7 votes)

35 users voted. 7 users abstained.

Source: habr.com
