Containers on the conveyor: CRI-O is now the default in OpenShift Container Platform 4

Red Hat OpenShift Container Platform 4 lets you streamline the creation of hosts for deploying containers, including on cloud provider infrastructure, on virtualization platforms, and on bare-metal systems. To create a cloud platform in the full sense, we had to take tight control of all the elements involved and thereby increase the reliability of a complex automation process.

The obvious solution was to use Red Hat Enterprise Linux CoreOS (a variant of Red Hat Enterprise Linux) and CRI-O as the standard, and here's why...

Since nautical themes lend themselves well to analogies when explaining how Kubernetes and containers work, let's try to describe the business problems that CoreOS and CRI-O solve using the example of Brunel's inventions for producing rigging blocks. In 1803, Marc Brunel was tasked with producing 100,000 rigging blocks for the needs of the growing British navy. A rigging block is a kind of pulley used to attach ropes to sails. Until the very beginning of the 19th century these blocks were made by hand, but Brunel managed to automate production and began making standardized blocks with machine tools. As a result, all the blocks were practically identical, could easily be replaced if one broke, and could be manufactured in large quantities.

Now imagine that Brunel had to do this work for 20 different ship models (Kubernetes versions) and for five different planets with completely different sea currents and winds (cloud providers). On top of that, all the ships (OpenShift clusters), regardless of which planet they sailed, had to behave identically from the captains' point of view (the operators who run the clusters). To continue the marine analogy, the ship captains don't care at all which rigging blocks (CRI-O) are used on their ships - what matters to them is that the blocks are strong and reliable.

OpenShift 4, as a cloud platform, faces a very similar business challenge. New nodes must be created when a cluster is created, when one of the nodes fails, or when the cluster is scaled. When a new node is created and initialized, critical host components, including CRI-O, must be configured accordingly. As in any other production process, "raw materials" have to be supplied at the start. In the case of ships, the raw materials are metal and wood. In the case of creating a host for running containers in an OpenShift 4 cluster, the inputs are configuration files and the servers provided by the API. From there, OpenShift provides the right level of automation over the entire life cycle, offering the necessary product support to end users and thereby recouping the investment in the platform.

OpenShift 4 was designed to provide easy system upgrades throughout the platform's life cycle (for versions 4.X) on all major cloud providers, virtualization platforms, and even bare-metal systems. To do this, nodes must be built from interchangeable parts. When a cluster needs a new version of Kubernetes, it also gets the corresponding version of CRI-O on CoreOS. Since the CRI-O version is tied directly to Kubernetes, this greatly simplifies any swaps for testing, troubleshooting, or support. This approach also reduces costs for end users and for Red Hat.

This is a fundamentally new way of looking at Kubernetes clusters, and it lays the groundwork for planning new, highly useful and attractive features. CRI-O (short for Container Runtime Interface - Open Container Initiative) proved to be the best choice for the mass creation of nodes required to work with OpenShift. CRI-O replaces the previously used Docker engine, offering OpenShift users an economical, stable, simple and boring - yes, you read that right - boring container engine designed specifically to work with Kubernetes.

The world of open containers

The world has long been moving toward open containers. Whether in Kubernetes or at lower levels, the development of container standards produces an ecosystem of innovation at every level.

It all started with the creation of the Open Container Initiative in June 2015. At this early stage of the work, specifications for the container image and for the runtime were defined. This ensured that tools could rely on a single standard for container images and a single format for working with them. A distribution specification was added later, allowing users to easily share container images.

The Kubernetes community then developed a single pluggable interface standard called the Container Runtime Interface (CRI). Thanks to it, Kubernetes users can plug in various container engines in addition to Docker.
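In practice, plugging in a different engine boils down to pointing the Kubelet at a CRI socket. A minimal sketch of what this looks like on a node running CRI-O (the flag names are the standard Kubelet options of that era; the socket path is CRI-O's default, but treat the exact values as illustrative):

kubelet --container-runtime=remote \
        --container-runtime-endpoint=unix:///var/run/crio/crio.sock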

Engineers at Red Hat and Google saw a market need for a container engine that could accept requests from the Kubelet over the CRI protocol and run containers compatible with the OCI specifications mentioned above. This is how OCID appeared. But wait, didn't we say this article is about CRI-O? It is - with the release of version 1.0, the project was renamed CRI-O.

Fig. 1.

Innovation with CRI-O and CoreOS

With the launch of the OpenShift 4 platform, the default container engine changed: Docker was replaced by CRI-O, which offers a cost-effective, stable, simple and boring container runtime that evolves in parallel with Kubernetes. This greatly simplifies cluster maintenance and configuration, since container engine and host configuration and management become automated within OpenShift 4.

Wait, how is that?

That's right: with the advent of OpenShift 4, there is no longer any need to log in to individual hosts to install a container engine, configure storage, configure registry search, or configure the network. The OpenShift 4 platform has been completely redesigned to use the Operator Framework not only for end-user applications, but also for basic platform-level operations such as deploying images, configuring the system, or installing updates.
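You can see this operator-driven approach directly in a running cluster: platform components, including the machine-config operator discussed below, show up as cluster operators (the command is standard in OpenShift 4; the exact set of operators listed depends on the cluster and version):

oc get clusteroperators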

Kubernetes has always allowed users to manage applications by defining the desired state and using controllers to ensure that the actual state matches the desired state as closely as possible. This desired state / actual state approach opens up great opportunities both for development and for operations. Developers can define the desired state, pass it to an operator as a YAML or JSON file, and the operator can then create the required application instance in the production environment, with the running state of that instance fully matching what was specified.
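As a minimal illustration of "desired state", here is a sketch of a Deployment that asks for three replicas of a hypothetical application (the name and image are invented for illustration); the controller's job is to keep the actual number of running pods at three, no matter what happens to individual instances:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
      - name: example-app
        image: quay.io/example/app:1.0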

By using operators throughout the platform, OpenShift 4 brings this new paradigm (of desired and actual state) to the management of RHEL CoreOS and CRI-O. The tasks of configuring and managing versions of the operating system and the container engine are automated by the so-called Machine Config Operator (MCO). The MCO greatly simplifies the cluster administrator's work by essentially automating the final stages of installation as well as post-installation (day-two) operations. All this makes OpenShift 4 a true cloud platform. We will come back to this a little later.
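To give a feel for what the MCO consumes, here is a sketch of a plain MachineConfig that drops a file onto every worker node (the file name and contents are invented for illustration; MachineConfigs embed file contents as URL-encoded data: sources inside an Ignition section):

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-worker-example-file
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  config:
    ignition:
      version: 2.2.0
    storage:
      files:
      - filesystem: root
        path: /etc/example.conf
        mode: 420            # 0644 in decimal
        contents:
          source: data:,example%20setting%0A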

Running containers

Users have been able to use the CRI-O engine in the OpenShift platform since version 3.7 in Tech Preview status and since version 3.9 in Generally Available status (currently supported). In addition, Red Hat has been using CRI-O extensively for production workloads in OpenShift Online since version 3.10. All this allowed the team working on CRI-O to gain vast experience launching containers at scale on large Kubernetes clusters. To get a basic understanding of how Kubernetes uses CRI-O, let's look at the following illustration, which shows how the architecture works.

Fig. 2. How containers work in a Kubernetes cluster

CRI-O simplifies the creation of new container hosts by keeping the entire top level in sync when new nodes are initialized and when new versions of the OpenShift platform are released. Having the whole platform under revision control enables transactional updates/rollbacks and also prevents deadlocks in the dependencies between the container host kernel, the container engine, the nodes (Kubelets) and the Kubernetes master. By centrally managing all platform components, with versioning and control, there is always a clear path from state A to state B. This simplifies the upgrade process, improves security, improves performance reporting, and helps reduce the cost of upgrades and new versions.
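On a RHEL CoreOS node this transactional model is visible directly: the operating system is delivered as versioned, atomic images. A quick sketch of how you could inspect it from a node shell (in OpenShift 4 the MCO normally drives updates and rollbacks, so these commands are for inspection rather than day-to-day administration):

rpm-ostree status     # shows the booted deployment and any pending one
rpm-ostree rollback   # switches back to the previous deployment on next boot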

Demonstration of the power of interchangeable elements

As mentioned earlier, using the Machine Config Operator to manage the container host and the container engine in OpenShift 4 provides a level of automation that was not previously possible on the Kubernetes platform. To demonstrate the new capabilities, we'll show how you might make changes to the crio.conf file. To avoid getting lost in terminology, concentrate on the results.

First, let's create what is called a Container Runtime Config. Think of it as a Kubernetes resource that represents the configuration for CRI-O. In reality, it is a specialized version of what is called a MachineConfig, which is any piece of configuration deployed onto a RHEL CoreOS machine within an OpenShift cluster.

This custom resource, called ContainerRuntimeConfig, was created to make it easier for cluster administrators to configure CRI-O. It is powerful enough that it can be applied only to certain nodes, depending on the MachineConfigPool settings. Think of a MachineConfigPool as a group of machines that serve the same purpose.

Pay attention to the last two lines of the file below: they are the settings we are going to change in the /etc/crio/crio.conf file, and they look very similar to the corresponding lines in crio.conf itself:

vi ContainerRuntimeConfig.yaml

Output:

apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
 name: set-log-and-pid
spec:
 machineConfigPoolSelector:
   matchLabels:
     debug-crio: config-log-and-pid
 containerRuntimeConfig:
   pidsLimit: 2048
   logLevel: debug

Now let's send this file to the Kubernetes cluster and check that it was actually created. Note that it is handled just like any other Kubernetes resource:

oc create -f ContainerRuntimeConfig.yaml
oc get ContainerRuntimeConfig

Output:

NAME              AGE
set-log-and-pid   22h

Once we've created the ContainerRuntimeConfig, we need to modify one of the MachineConfigPools to let Kubernetes know that we want to apply this configuration to a specific group of machines in the cluster. In this case, we will change the MachineConfigPool for the master nodes:

oc edit MachineConfigPool/master

Output (trimmed to the essentials for clarity):

...
metadata:
 creationTimestamp: 2019-04-10T23:42:28Z
 generation: 1
 labels:
   debug-crio: config-log-and-pid
   operator.machineconfiguration.openshift.io/required-for-upgrade: ""
...
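As an aside, instead of editing the pool interactively you could apply the same label with a single command (a sketch using the label key and value from the example above):

oc label machineconfigpool/master debug-crio=config-log-and-pid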

At this point, the MCO starts creating a new crio.conf file for the cluster. The fully rendered configuration file can be viewed via the Kubernetes API. Remember, a ContainerRuntimeConfig is just a specialized version of a MachineConfig, so we can see the result by looking at the relevant lines in the MachineConfigs:

oc get MachineConfigs | grep rendered

Output:

rendered-master-c923f24f01a0e38c77a05acfd631910b                  4.0.22-201904011459-dirty 2.2.0 16h
rendered-master-f722b027a98ac5b8e0b41d71e992f626                  4.0.22-201904011459-dirty 2.2.0 4m
rendered-worker-9777325797fe7e74c3f2dd11d359bc62                  4.0.22-201904011459-dirty 2.2.0 16h

Note that the resulting configuration file for the master nodes is a newer version than the original configurations. To view it, run the following command. In passing, we note that this is probably one of the best one-liners in the history of Kubernetes:

python3 -c "import sys, urllib.parse; print(urllib.parse.unquote(sys.argv[1]))" $(oc get MachineConfig/rendered-master-f722b027a98ac5b8e0b41d71e992f626 -o YAML | grep -B4 crio.conf | grep source | tail -n 1 | cut -d, -f2) | grep pid

Output:

pids_limit = 2048
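If jq is available, roughly the same check can be done by pulling the crio.conf entry out of the rendered MachineConfig directly (a sketch; it assumes the file is embedded as a URL-encoded data: source under spec.config.storage.files, which is how MachineConfigs store file contents):

oc get machineconfig/rendered-master-f722b027a98ac5b8e0b41d71e992f626 -o json \
  | jq -r '.spec.config.storage.files[] | select(.path == "/etc/crio/crio.conf") | .contents.source' \
  | sed 's/^data:,//' \
  | python3 -c "import sys, urllib.parse; print(urllib.parse.unquote(sys.stdin.read()))" \
  | grep pid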

Now let's make sure that the configuration has been applied to all master nodes. First, get a list of nodes in the cluster:

oc get node | grep master

Output:

ip-10-0-135-153.us-east-2.compute.internal   Ready master 23h v1.12.4+509916ce1
ip-10-0-154-0.us-east-2.compute.internal     Ready master 23h v1.12.4+509916ce1
ip-10-0-166-79.us-east-2.compute.internal    Ready master 23h v1.12.4+509916ce1

Now let's look at the file deployed on the node. You will see that it has been updated with the new values of the pid and debug directives that we specified in the ContainerRuntimeConfig resource. Elegance itself:

oc debug node/ip-10-0-135-153.us-east-2.compute.internal -- cat /host/etc/crio/crio.conf | egrep 'debug|pid'

Output:

...
pids_limit = 2048
...
log_level = "debug"
...

All of these changes to the cluster were made without even running SSH. All the work was done by talking to the Kubernetes master node. And these new parameters were configured only on the master nodes; the worker nodes were not changed, which demonstrates the benefits of Kubernetes' desired state / actual state methodology applied to container hosts and container engines built from interchangeable parts.
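This is easy to verify: run the same check against one of the worker nodes (the node name below is a placeholder; take a real one from the first command), and pids_limit will still show CRI-O's default value rather than 2048:

oc get node | grep worker
oc debug node/<worker-node-name> -- cat /host/etc/crio/crio.conf | egrep 'debug|pid'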

The example above shows how to make changes to a small OpenShift Container Platform 4 cluster with three nodes or to a huge production cluster with 3,000 nodes. Either way, the amount of work is the same - and very small: configure the ContainerRuntimeConfig file and change one label on the MachineConfigPool. And you can do this with any version of Kubernetes used by OpenShift Container Platform 4.X throughout its lifecycle.

Technology companies often move so fast that we can't explain why we choose certain technologies for the underlying components. Container engines have historically been the component users interact with directly. Since the popularity of containers naturally began with the advent of container engines, users often take an interest in them. This is another reason why Red Hat opted for CRI-O. Containers are evolving, with the focus today on orchestration, and we have found that CRI-O provides the best experience when working with OpenShift 4.

Source: habr.com
