Container Storage Interface (CSI) is a unified interface between Kubernetes and storage systems. We have already written about it briefly, and today we will take a closer look at the combination of CSI and Ceph: we will show how to connect Ceph storage to a Kubernetes cluster.
The article contains real examples, slightly simplified for easier reading. We do not cover the installation and configuration of the Ceph and Kubernetes clusters themselves.
Are you wondering how it works?

So, you have a Kubernetes cluster at hand, and a Ceph cluster is running nearby. I hope I don't need to mention that for production there must be a network between them with a bandwidth of at least 10 Gbit/s.
If you have all this, let's go!
First, let's go to one of the nodes of the Ceph cluster and check that everything is in order:
ceph health
ceph -s
Next, we will immediately create a pool for RBD disks:
ceph osd pool create kube 32
ceph osd pool application enable kube rbd
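The Ceph documentation also suggests initializing a freshly created RBD pool with the rbd tool. This is a sketch of that optional step; it should be harmless to run alongside the application-enable command above:

rbd pool init kube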
Now let's go to the Kubernetes cluster. There, first of all, we will install the Ceph CSI driver for RBD. We will install it, as expected, through Helm. Add the repository with the chart and get the set of ceph-csi-rbd chart values:
helm repo add ceph-csi https://ceph.github.io/csi-charts
helm inspect values ceph-csi/ceph-csi-rbd > cephrbd.yml
Now you need to fill in the cephrbd.yml file. To do this, find out the cluster ID and the monitor IP addresses in Ceph:
ceph fsid # this is how we find out the clusterID
ceph mon dump # and this is how we see the monitor IP addresses
Enter the resulting values into the cephrbd.yml file. Along the way, we enable the creation of PSP (Pod Security Policy) policies. The nodeplugin and provisioner sections are already in the file; they can be adjusted as shown below:
csiConfig:
  - clusterID: "bcd0d202-fba8-4352-b25d-75c89258d5ab"
    monitors:
      - "v2:172.18.8.5:3300/0,v1:172.18.8.5:6789/0"
      - "v2:172.18.8.6:3300/0,v1:172.18.8.6:6789/0"
      - "v2:172.18.8.7:3300/0,v1:172.18.8.7:6789/0"
nodeplugin:
  podSecurityPolicy:
    enabled: true
provisioner:
  podSecurityPolicy:
    enabled: true
Then all that remains is to install the chart into the Kubernetes cluster:
helm upgrade -i ceph-csi-rbd ceph-csi/ceph-csi-rbd -f cephrbd.yml -n ceph-csi-rbd --create-namespace
Great, the RBD driver works!
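To be sure, you can check that the provisioner and nodeplugin Pods are Running; the namespace here matches the one used in the helm command above:

kubectl -n ceph-csi-rbd get pods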
Let's create a new StorageClass in Kubernetes. This again requires some work with Ceph.
Create a new user in Ceph and give it permission to write to the kube pool:
ceph auth get-or-create client.rbdkube mon 'profile rbd' osd 'profile rbd pool=kube'
And now let's display its access key:
ceph auth get-key client.rbdkube
The command will output something like this:
AQCO9NJbhYipKRAAMqZsnqqS/T8OYQX20xIa9A==
Let's put this value into a Secret in the Kubernetes cluster, in the userKey field:
---
apiVersion: v1
kind: Secret
metadata:
  name: csi-rbd-secret
  namespace: ceph-csi-rbd
stringData:
  # The key values correspond to the user name and key as defined
  # in the Ceph cluster. The user ID must have access to the pool
  # specified in the storage class.
  userID: rbdkube
  userKey: <user-key>
And we create our secret:
kubectl apply -f secret.yaml
Next, we need a StorageClass manifest along these lines:
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-rbd-sc
provisioner: rbd.csi.ceph.com
parameters:
  clusterID: <cluster-id>
  pool: kube
  imageFeatures: layering
  # These secrets must contain the credentials
  # for access to your pool.
  csi.storage.k8s.io/provisioner-secret-name: csi-rbd-secret
  csi.storage.k8s.io/provisioner-secret-namespace: ceph-csi-rbd
  csi.storage.k8s.io/controller-expand-secret-name: csi-rbd-secret
  csi.storage.k8s.io/controller-expand-secret-namespace: ceph-csi-rbd
  csi.storage.k8s.io/node-stage-secret-name: csi-rbd-secret
  csi.storage.k8s.io/node-stage-secret-namespace: ceph-csi-rbd
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete
allowVolumeExpansion: true
mountOptions:
  - discard
Fill in clusterID, the value we already obtained with the ceph fsid command.
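If you prefer to script this substitution, something like the following would work, using the cluster ID from the csiConfig above and assuming the manifest is saved as storageclass.yaml:

sed -i "s/<cluster-id>/bcd0d202-fba8-4352-b25d-75c89258d5ab/" storageclass.yaml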
Now apply the manifest on the Kubernetes cluster:
kubectl apply -f storageclass.yaml
To check that the two clusters work in tandem, let's create the following PVC (Persistent Volume Claim):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-rbd-sc
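One step is implied here: the manifest must be applied first. Assuming it is saved as pvc.yaml (the same file the resize step below edits):

kubectl apply -f pvc.yaml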
Let's immediately see how Kubernetes created the requested volume in Ceph:
kubectl get pvc
kubectl get pv
Everything seems to be great! And what does it look like on the Ceph side?
We get a list of volumes in the pool and view information about our volume:
rbd ls -p kube
rbd -p kube info csi-vol-eb3d257d-8c6c-11ea-bff5-6235e7640653 # your volume ID, printed by the previous command, will of course be different
Now let's see how RBD volume resizing works.
Change the volume size in the pvc.yaml manifest to 2Gi and apply it:
kubectl apply -f pvc.yaml
Let's wait for the changes to take effect and take another look at the volume size:
rbd -p kube info csi-vol-eb3d257d-8c6c-11ea-bff5-6235e7640653
kubectl get pv
kubectl get pvc
We see that the size of the PVC has not changed. We can ask Kubernetes for a description of the PVC in YAML format to find out why:
kubectl get pvc rbd-pvc -o yaml
And here is the problem:
message: Waiting for user to (re-)start a pod to finish file system resize of volume on node.
type: FileSystemResizePending
That is, the disk has grown, but the file system on it has not.
To expand the file system, the volume must be mounted. In our case, the created PVC/PV is not currently used by any Pod.
We can create a test Pod like this:
---
apiVersion: v1
kind: Pod
metadata:
  name: csi-rbd-demo-pod
spec:
  containers:
    - name: web-server
      image: nginx:1.17.6
      volumeMounts:
        - name: mypvc
          mountPath: /data
  volumes:
    - name: mypvc
      persistentVolumeClaim:
        claimName: rbd-pvc
        readOnly: false
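Apply the Pod manifest and wait until the container is Running, so the kubelet mounts the volume and finishes the file system resize (the pod.yaml file name is an assumption):

kubectl apply -f pod.yaml
kubectl wait --for=condition=Ready pod/csi-rbd-demo-pod --timeout=120s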
And now let's look at the PVC:
kubectl get pvc
The size has changed; everything is in order.
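You can also confirm the new size from inside the Pod itself, since the volume is mounted at /data:

kubectl exec csi-rbd-demo-pod -- df -h /data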
In the first part, we worked with RBD (RADOS Block Device). A block device, however, cannot be mounted read-write by several Pods at the same time, so it will not do if different microservices need to work with the same disk simultaneously. For working with files, rather than with a disk image, CephFS is much better suited.
Using the same Ceph and Kubernetes clusters, we will configure CSI and the other necessary entities to work with CephFS.
Let's get the values from the new Helm chart we need:
helm inspect values ceph-csi/ceph-csi-cephfs > cephfs.yml
Again, you need to fill in the file, this time cephfs.yml. As before, the Ceph commands will help:
ceph fsid
ceph mon dump
We fill the file with values like this:
csiConfig:
  - clusterID: "bcd0d202-fba8-4352-b25d-75c89258d5ab"
    monitors:
      - "172.18.8.5:6789"
      - "172.18.8.6:6789"
      - "172.18.8.7:6789"
nodeplugin:
  httpMetrics:
    enabled: true
    containerPort: 8091
  podSecurityPolicy:
    enabled: true
provisioner:
  replicaCount: 1
  podSecurityPolicy:
    enabled: true
Note that the monitor addresses are specified in the simple address:port form. To mount CephFS on the host, these addresses are passed to a kernel module that does not yet know how to work with the monitor protocol v2.
We change the port for httpMetrics (Prometheus will scrape metrics there) so that it does not conflict with nginx-proxy, which is installed by Kubespray. You may not need this.
Install the Helm chart in the Kubernetes cluster:
helm upgrade -i ceph-csi-cephfs ceph-csi/ceph-csi-cephfs -f cephfs.yml -n ceph-csi-cephfs --create-namespace
Let's move on to the Ceph storage to create a separate user there. The documentation states that the CephFS provisioner needs cluster administrator access rights, but we will create a separate user fs with limited rights:
ceph auth get-or-create client.fs mon 'allow r' mgr 'allow rw' mds 'allow rws' osd 'allow rw pool=cephfs_data, allow rw pool=cephfs_metadata'
And let's immediately display its access key; it will be useful to us later:
ceph auth get-key client.fs
Now let's create a separate Secret and StorageClass.
Nothing new, we have already seen this with the example of RBD:
---
apiVersion: v1
kind: Secret
metadata:
  name: csi-cephfs-secret
  namespace: ceph-csi-cephfs
stringData:
  # Required for dynamically provisioned volumes
  adminID: fs
  adminKey: <output of the previous command>
Apply the manifest:
kubectl apply -f secret.yaml
And now, a separate StorageClass:
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-cephfs-sc
provisioner: cephfs.csi.ceph.com
parameters:
  clusterID: <cluster-id>
  # Name of the CephFS filesystem in which the volume will be created
  fsName: cephfs
  # (optional) Ceph pool in which the volume data will be stored
  # pool: cephfs_data
  # (optional) Comma-separated mount options for ceph-fuse,
  # for example:
  # fuseMountOptions: debug
  # (optional) Comma-separated kernel mount options for CephFS
  # See man mount.ceph for the list of options. For example:
  # kernelMountOptions: readdir_max_bytes=1048576,norbytes
  # The secrets must contain credentials for the Ceph admin and/or user.
  csi.storage.k8s.io/provisioner-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/provisioner-secret-namespace: ceph-csi-cephfs
  csi.storage.k8s.io/controller-expand-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/controller-expand-secret-namespace: ceph-csi-cephfs
  csi.storage.k8s.io/node-stage-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/node-stage-secret-namespace: ceph-csi-cephfs
  # (optional) The driver can use either ceph-fuse (fuse)
  # or the ceph kernel client (kernel).
  # If not set, the default mounter is chosen
  # by probing for ceph-fuse and mount.ceph.
  # mounter: kernel
reclaimPolicy: Delete
allowVolumeExpansion: true
mountOptions:
  - debug
Fill in clusterID here and apply the manifest in Kubernetes:
kubectl apply -f storageclass.yaml
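At this point both storage classes should be registered; a quick check:

kubectl get storageclass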
To check, as in the previous example, let's create a PVC:
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: csi-cephfs-pvc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  storageClassName: csi-cephfs-sc
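As with RBD, apply the claim first, assuming it is saved as pvc.yaml, the same file edited in the resize step below:

kubectl apply -f pvc.yaml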
And check the PVC/PV:
kubectl get pvc
kubectl get pv
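This is where ReadWriteMany pays off: unlike an RBD volume, the same claim can be mounted by several Pods on different nodes at once. A minimal sketch (the Deployment name and labels are illustrative); both replicas share the same /data directory:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cephfs-demo
spec:
  replicas: 2
  selector:
    matchLabels:
      app: cephfs-demo
  template:
    metadata:
      labels:
        app: cephfs-demo
    spec:
      containers:
        - name: web-server
          image: nginx:1.17.6
          volumeMounts:
            - name: shared-data
              mountPath: /data
      volumes:
        - name: shared-data
          persistentVolumeClaim:
            claimName: csi-cephfs-pvc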
If you want to look at the files and directories inside CephFS, you can mount the file system somewhere. For example, as shown below. Go to one of the nodes of the Ceph cluster and perform the following actions:
# Mount point
mkdir -p /mnt/cephfs
# Create a file with the admin key
ceph auth get-key client.admin >/etc/ceph/secret.key
# Add an entry to /etc/fstab
# !! Replace the IP address with the address of your node
echo "172.18.8.6:6789:/ /mnt/cephfs ceph name=admin,secretfile=/etc/ceph/secret.key,noatime,_netdev 0 2" >> /etc/fstab
mount /mnt/cephfs
Of course, mounting the FS on a Ceph node like this is only suitable for learning purposes, which is what we do on our courses. I don't think anyone would do this in production; there is a big risk of accidentally overwriting important files.
And finally, let's check how volume resizing works in the case of CephFS. We return to Kubernetes and edit our PVC manifest, increasing the size there to, say, 7Gi.
Apply the edited file:
kubectl apply -f pvc.yaml
Let's see how the quota has changed on the mounted directory:
getfattr -n ceph.quota.max_bytes <data-directory>
For this command to work, you may need to install the attr package.
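On common distributions the package is simply called attr, for example:

apt install attr   # Debian/Ubuntu
yum install attr   # CentOS/RHEL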
The eyes are afraid, but the hands do the work
On the surface, all these spells and long YAML manifests seem complicated, but in practice, Slurm students deal with them pretty quickly.
In this article, we did not dig into every detail; there is official documentation for that. If you are interested in the finer points of setting up Ceph storage with a Kubernetes cluster, the official Ceph and ceph-csi documentation is a good place to continue.
In the Slurm course, you can go a step further and deploy a real application in Kubernetes that uses CephFS as file storage. Through GET/POST requests, you will be able to upload files to Ceph and retrieve them.
And if you are more interested in data storage, sign up for the new course on the subject. While the beta test is running, the course is available at a discount, and you can influence its content.
Article author: Alexander Shvalov, practicing engineer, Certified Kubernetes Administrator, author and developer of Slurm courses.
Source: habr.com
