At the beginning of this month, on May 3, a major release of the "management system for distributed data storage in Kubernetes" was announced: Rook 1.0.0. Over a year ago we published a general overview of Rook. Back then we were asked to share our experience of using it in practice, and now, just in time for such a significant milestone in the project's history, we are happy to share our impressions.
In short, Rook is a set of operators for Kubernetes that take full control of the deployment, management, and automatic recovery of storage solutions such as Ceph, EdgeFS, Minio, Cassandra, and CockroachDB.
Note: among the significant Ceph-related changes in the Rook 1.0.0 release are support for Ceph Nautilus and the ability to use NFS for CephFS or RGW buckets. Among the rest, the "maturation" of EdgeFS support to the beta level stands out.
So, in this article we:
answer the question of what advantages we see in using Rook to deploy Ceph in a Kubernetes cluster;
share our experience and impressions from using Rook in production;
tell you why we say "Yes!" to Rook, and share our plans for it.
Let's start with general concepts and theory.
"I have one Rook advantage!" (unknown chess player)
One of the main advantages of Rook is that interaction with data stores is carried out through Kubernetes mechanisms. This means that you no longer need to copy commands to configure Ceph from a piece of paper to the console.
- Do you want to deploy CephFS in a cluster? Just write a YAML file!
- What? Do you want to deploy an object store with S3 API as well? Just write a second YAML file!
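For instance, a minimal CephFilesystem manifest might look roughly like this (the names `myfs` and pool sizes are illustrative, not from the original article):

```yaml
apiVersion: ceph.rook.io/v1
kind: CephFilesystem
metadata:
  name: myfs
  namespace: rook-ceph
spec:
  # pool for CephFS metadata
  metadataPool:
    replicated:
      size: 3
  # pool(s) for the actual file data
  dataPools:
    - replicated:
        size: 3
  # MDS daemons managed by the operator
  metadataServer:
    activeCount: 1
    activeStandby: true
```

Applying this with `kubectl apply -f` is all it takes for the operator to bring up the MDS daemons and pools.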
Rook was created according to all the rules of a typical operator. It works with CRDs (Custom Resource Definitions), in which we describe the characteristics of the Ceph entities we need (since this is the only stable implementation, this article will talk about Ceph by default, unless otherwise specified). Based on the specified parameters, the operator automatically executes the commands required for setup.
Let's look at the specifics using the example of creating an Object Store, or rather, a CephObjectStoreUser.
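The original listing is not reproduced here; a sketch of such a CephObjectStoreUser manifest, templated the way the Helm values below suggest, could look like this (the exact field mapping to `.Values.s3.*` is our assumption):

```yaml
apiVersion: ceph.rook.io/v1
kind: CephObjectStoreUser
metadata:
  # user name, taken from a Helm template variable
  name: {{ .Values.s3.username }}
  namespace: rook-ceph
spec:
  # the CephObjectStore this user belongs to
  store: {{ $.Values.s3.crdName }}
  displayName: {{ .Values.s3.username }}
```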
The parameters in the listing are fairly standard and hardly need comments, but you should pay special attention to those set via template variables.
The general scheme of work boils down to this: through a YAML file we "order" resources, the operator executes the necessary commands for them and returns us a secret with which we can continue working (see below). Both the command and the name of the secret are composed from the variables listed above.
What is this command? When creating a user for object storage, the Rook operator inside the pod will do the following:
radosgw-admin user create --uid="rook-user" --display-name="{{ .Values.s3.username }}"
The result of this command will be a JSON structure:
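The original article included the full output; an abbreviated sketch of what `radosgw-admin user create` returns looks like this (key values are placeholders, not real credentials):

```json
{
  "user_id": "rook-user",
  "display_name": "rook-user",
  "keys": [
    {
      "user": "rook-user",
      "access_key": "<generated access key>",
      "secret_key": "<generated secret key>"
    }
  ]
}
```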
Keys are what applications will need later to access the object storage through the S3 API. The Rook operator kindly collects them and puts them in its namespace as a secret named rook-ceph-object-user-{{ $.Values.s3.crdName }}-{{ $.Values.s3.username }}.
To use the data from this secret, it is enough to add it to the container as environment variables. As an example, here is a template for a Job in which we automatically create buckets for each user environment:
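The original Job listing is omitted; a sketch of what such a Job could look like follows (the image, endpoint URL, and `bucketName` value are illustrative assumptions; the `AccessKey`/`SecretKey` secret keys are the ones Rook generates):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: create-bucket
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: create-bucket
          image: amazon/aws-cli  # any image with an S3 client will do
          env:
            # credentials are taken straight from the Rook-generated secret
            - name: AWS_ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  name: rook-ceph-object-user-{{ $.Values.s3.crdName }}-{{ $.Values.s3.username }}
                  key: AccessKey
            - name: AWS_SECRET_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  name: rook-ceph-object-user-{{ $.Values.s3.crdName }}-{{ $.Values.s3.username }}
                  key: SecretKey
          # create the bucket against the in-cluster RGW service
          command:
            - aws
            - --endpoint-url
            - http://rook-ceph-rgw-{{ $.Values.s3.crdName }}.rook-ceph.svc
            - s3
            - mb
            - s3://{{ .Values.s3.bucketName }}
```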
All actions listed in this Job are performed without stepping outside of Kubernetes. The structures described in the YAML files are stored in a Git repository and reused many times. We see this as a huge plus for DevOps engineers and the CI/CD process as a whole.
With Rook and Rados in joy
Using the Ceph + RBD bundle imposes certain restrictions on mounting volumes to pods.
In particular, the namespace must contain the secret for accessing Ceph so that stateful applications can function. That is fine if you have 2-3 environments in their own namespaces: you can just go and copy the secret by hand. But what if a separate environment with its own namespace is created for every developer feature?
We have solved this problem by using shell-operator, which automatically copied secrets to new namespaces (an example of such a hook is described in this article).
However, when using Rook, this problem simply does not exist. The mounting process uses Rook's own drivers based on Flexvolume or CSI (still in beta) and therefore does not require secrets.
Rook automatically solves many problems, which is what encourages us to use it in new projects.
Siege of Rook
Let's finish the practical part by deploying Rook and Ceph so that we can conduct our own experiments. To make it easier to storm this impregnable tower, the developers have prepared a Helm chart. Let's download it:
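A sketch of the download, assuming the official Rook chart repository and Helm's classic fetch workflow:

```shell
# add the official Rook chart repository
helm repo add rook-release https://charts.rook.io/release
helm repo update

# download and unpack the chart locally so we can edit values.yaml
helm fetch rook-release/rook-ceph --untar --version v1.0.0
```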
In the file rook-ceph/values.yaml you can find many different settings. The most important thing is to specify the tolerations for the agent and discover pods. What the taints/tolerations mechanism can be used for, we described in detail in this article.
In short, we don't want the client application pods to end up on the same nodes as the storage disks. The reason is simple: this way the work of the Rook agents will not affect the application itself.
So let's open rook-ceph/values.yaml with your favorite editor and add the following block to the end:
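A sketch of such a block, assuming the storage nodes are tainted with a hypothetical key `node-role/storage` (the `agent.toleration*` and `discover.toleration*` settings are the ones exposed by the rook-ceph chart of that era):

```yaml
agent:
  toleration: NoExecute
  tolerationKey: node-role/storage
discover:
  toleration: NoExecute
  tolerationKey: node-role/storage
```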
Let's check the status of Ceph; we expect to see HEALTH_OK:
$ kubectl -n ${ROOK_NAMESPACE} exec $(kubectl -n ${ROOK_NAMESPACE} get pod -l app=rook-ceph-operator -o name -o jsonpath='{.items[0].metadata.name}') -- ceph -s
At the same time, let's make sure that the pods with the client application do not end up on the nodes reserved for Ceph:
$ kubectl -n ${APPLICATION_NAMESPACE} get pods -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName
Further optional components can be configured as desired. More details about them can be found in the documentation. For administration, we strongly recommend installing the dashboard and the toolbox.
Rook and hooks: is Rook enough for everything?
As you can see, the development of Rook is in full swing. But there are still problems that do not allow us to completely abandon the manual configuration of Ceph:
None of Rook's drivers can export metrics on the usage of mounted volumes, which deprives us of monitoring.
Flexvolume and CSI do not know how to resize volumes (unlike plain RBD), so Rook loses a useful (and sometimes critical!) tool.
Rook is still not as flexible as regular Ceph. If we want the pool for CephFS metadata to be stored on SSDs and the data itself on HDDs, we will have to register separate device groups in the CRUSH map manually.
Although rook-ceph-operator is considered stable, there are currently some issues when upgrading Ceph from version 13 to 14.
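For reference, the manual SSD/HDD split mentioned above involves commands along these lines (the rule and pool names are illustrative):

```shell
# create a CRUSH rule that places replicas only on devices of class "ssd"
ceph osd crush rule create-replicated fast-rule default host ssd

# bind the CephFS metadata pool to that rule, leaving data pools on HDDs
ceph osd pool set myfs-metadata crush_rule fast-rule
```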
Conclusions
"Now the Rook is closed off from the outside world by pawns, but we believe that one day it will play a decisive role in the game!" (quote made specifically for this article)
The Rook project has undoubtedly won our hearts - we believe that [with all its pluses and minuses] it definitely deserves your attention.
Our future plans are to make rook-ceph a module for addon-operator, which will make using it in our numerous Kubernetes clusters even easier and more convenient.