Container Storage Interface (CSI) is a unified interface between Kubernetes and storage systems. We have already talked about it briefly.
The article contains real examples, slightly simplified for ease of reading. Installing and configuring the Ceph and Kubernetes clusters themselves is outside the scope of this article.
Are you wondering how it works?
So, you have a Kubernetes cluster at hand, deployed, for example, with Kubespray, and a Ceph cluster running next to it.
If you have all this, let's go!
First, let's go to one of the nodes of the Ceph cluster and check that everything is in order:
ceph health
ceph -s
Next, we will immediately create a pool for RBD disks:
ceph osd pool create kube 32
ceph osd pool application enable kube rbd
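An optional sanity check: the pool should now exist and carry the rbd application tag.
ceph osd pool ls detail | grep kube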
Let's go to the Kubernetes cluster. There, first of all, we will install the Ceph CSI driver for RBD. We will install it, as usual, via Helm.
Add the repository with the chart and fetch the set of values for the ceph-csi-rbd chart:
helm repo add ceph-csi https://ceph.github.io/csi-charts
helm inspect values ceph-csi/ceph-csi-rbd > cephrbd.yml
Now you need to fill in the cephrbd.yml file. To do this, we find out the cluster ID and IP addresses of monitors in Ceph:
ceph fsid # this is how we learn the clusterID
ceph mon dump # and this shows the IP addresses of the monitors
Enter the resulting values into the cephrbd.yml file. Along the way, we enable the creation of PSP (Pod Security Policy) policies. The nodeplugin and provisioner sections are already in the file; they can be edited as shown below:
csiConfig:
  - clusterID: "bcd0d202-fba8-4352-b25d-75c89258d5ab"
    monitors:
      - "v2:172.18.8.5:3300/0,v1:172.18.8.5:6789/0"
      - "v2:172.18.8.6:3300/0,v1:172.18.8.6:6789/0"
      - "v2:172.18.8.7:3300/0,v1:172.18.8.7:6789/0"

nodeplugin:
  podSecurityPolicy:
    enabled: true
provisioner:
  podSecurityPolicy:
    enabled: true
Then all that remains for us is to install the chart in the Kubernetes cluster.
helm upgrade -i ceph-csi-rbd ceph-csi/ceph-csi-rbd -f cephrbd.yml -n ceph-csi-rbd --create-namespace
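Before moving on, it is worth making sure the driver pods have started; the provisioner and nodeplugin pods should be in the Running state:
kubectl get pods -n ceph-csi-rbd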
Great, the RBD driver works!
Let's create a new StorageClass in Kubernetes. This again requires some work with Ceph.
Create a new user in Ceph and grant it permission to write to the kube pool:
ceph auth get-or-create client.rbdkube mon 'profile rbd' osd 'profile rbd pool=kube'
And now let's view its access key:
ceph auth get-key client.rbdkube
The command will output something like this:
AQCO9NJbhYipKRAAMqZsnqqS/T8OYQX20xIa9A==
Let's put this value into a Secret in the Kubernetes cluster, in the userKey field:
---
apiVersion: v1
kind: Secret
metadata:
  name: csi-rbd-secret
  namespace: ceph-csi-rbd
stringData:
  # The key values must match the user name and key defined in the
  # Ceph cluster. The user ID must have access to the pool
  # specified in the storage class.
  userID: rbdkube
  userKey: <user-key>
And we create our secret:
kubectl apply -f secret.yaml
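If you want to double-check that the secret landed in the namespace where the driver expects it:
kubectl get secret csi-rbd-secret -n ceph-csi-rbd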
Next, we need a StorageClass manifest along these lines:
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-rbd-sc
provisioner: rbd.csi.ceph.com
parameters:
  clusterID: <cluster-id>
  pool: kube
  imageFeatures: layering
  # These secrets must contain the credentials
  # for your pool.
  csi.storage.k8s.io/provisioner-secret-name: csi-rbd-secret
  csi.storage.k8s.io/provisioner-secret-namespace: ceph-csi-rbd
  csi.storage.k8s.io/controller-expand-secret-name: csi-rbd-secret
  csi.storage.k8s.io/controller-expand-secret-namespace: ceph-csi-rbd
  csi.storage.k8s.io/node-stage-secret-name: csi-rbd-secret
  csi.storage.k8s.io/node-stage-secret-namespace: ceph-csi-rbd
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete
allowVolumeExpansion: true
mountOptions:
  - discard
Fill in clusterID, which we already obtained with the ceph fsid command, and apply this manifest to the Kubernetes cluster:
kubectl apply -f storageclass.yaml
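A quick look at the result; the new class should appear with the rbd.csi.ceph.com provisioner:
kubectl get storageclass csi-rbd-sc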
To check that the two clusters work together, let's create the following PVC (Persistent Volume Claim):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-rbd-sc
Let's immediately see how Kubernetes created the requested volume in Ceph:
kubectl get pvc
kubectl get pv
Everything seems to be great! And what does it look like on the Ceph side?
We get a list of volumes in the pool and view information about our volume:
rbd ls -p kube
rbd -p kube info csi-vol-eb3d257d-8c6c-11ea-bff5-6235e7640653 # your volume ID, printed by the previous command, will of course differ
Now let's see how RBD volume resizing works.
Change the volume size in the pvc.yaml manifest to 2Gi and apply it:
kubectl apply -f pvc.yaml
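For reference, the only change in pvc.yaml is the storage request:
  resources:
    requests:
      storage: 2Gi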
Let's wait for the changes to take effect and take another look at the volume size.
rbd -p kube info csi-vol-eb3d257d-8c6c-11ea-bff5-6235e7640653
kubectl get pv
kubectl get pvc
We see that the size of the PVC has not changed. You can ask Kubernetes for a description of the PVC in YAML format to find out why:
kubectl get pvc rbd-pvc -o yaml
And here is the problem:
message: Waiting for user to (re-)start a pod to finish file system resize of volume on node.
type: FileSystemResizePending
That is, the disk has grown, but the file system on it has not.
To expand the file system, you need to mount the volume. In our case, the PVC/PV we created is not currently used by any pod.
We can create a test Pod like this:
---
apiVersion: v1
kind: Pod
metadata:
  name: csi-rbd-demo-pod
spec:
  containers:
    - name: web-server
      image: nginx:1.17.6
      volumeMounts:
        - name: mypvc
          mountPath: /data
  volumes:
    - name: mypvc
      persistentVolumeClaim:
        claimName: rbd-pvc
        readOnly: false
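Once the pod is Running, you can also check the size from inside it (the mount path is the one from the manifest above):
kubectl exec csi-rbd-demo-pod -- df -h /data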
And now let's look at PVC:
kubectl get pvc
The size has changed, everything is in order.
In the first part, we worked with RBD (it stands for RADOS Block Device), but that will not work if different microservices need to use the same disk at the same time. For working with files rather than a disk image, CephFS is a much better fit.
Using the example of Ceph and Kubernetes clusters, we will configure CSI and other necessary entities to work with CephFS.
Let's get the values we need from the new Helm chart:
helm inspect values ceph-csi/ceph-csi-cephfs > cephfs.yml
Again, you need to fill in the cephfs.yml file. As before, the Ceph commands will help:
ceph fsid
ceph mon dump
We fill the file with values like this:
csiConfig:
  - clusterID: "bcd0d202-fba8-4352-b25d-75c89258d5ab"
    monitors:
      - "172.18.8.5:6789"
      - "172.18.8.6:6789"
      - "172.18.8.7:6789"

nodeplugin:
  httpMetrics:
    enabled: true
    containerPort: 8091
  podSecurityPolicy:
    enabled: true
provisioner:
  replicaCount: 1
  podSecurityPolicy:
    enabled: true
Note that the monitor addresses are specified in the simple address:port form. To mount CephFS on a node, these addresses are passed to the kernel module, which does not yet know how to work with the v2 monitor protocol.
We change the port for httpMetrics (Prometheus will scrape metrics from it) so that it does not conflict with nginx-proxy, which is installed by Kubespray. You may not need this.
Install the Helm chart in the Kubernetes cluster:
helm upgrade -i ceph-csi-cephfs ceph-csi/ceph-csi-cephfs -f cephfs.yml -n ceph-csi-cephfs --create-namespace
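As with RBD, it makes sense to verify that the CephFS driver pods came up:
kubectl get pods -n ceph-csi-cephfs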
Let's move on to the Ceph cluster to create a separate user there. The documentation states that the CephFS provisioner needs cluster administrator access rights, but we will create a separate user, fs, with limited rights:
ceph auth get-or-create client.fs mon 'allow r' mgr 'allow rw' mds 'allow rws' osd 'allow rw pool=cephfs_data, allow rw pool=cephfs_metadata'
And let's immediately view its access key; we will need it shortly:
ceph auth get-key client.fs
Let's create a separate Secret and StorageClass. Nothing new here; we have already seen this with the RBD example:
---
apiVersion: v1
kind: Secret
metadata:
  name: csi-cephfs-secret
  namespace: ceph-csi-cephfs
stringData:
  # Required for dynamically provisioned volumes
  adminID: fs
  adminKey: <output of the previous command>
Applying the manifest:
kubectl apply -f secret.yaml
And now - a separate StorageClass:
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-cephfs-sc
provisioner: cephfs.csi.ceph.com
parameters:
  clusterID: <cluster-id>
  # Name of the CephFS filesystem in which the volume will be created
  fsName: cephfs
  # (optional) Ceph pool in which the volume data will be stored
  # pool: cephfs_data
  # (optional) Comma-separated mount options for ceph-fuse,
  # for example:
  # fuseMountOptions: debug
  # (optional) Comma-separated kernel mount options for CephFS.
  # See man mount.ceph for the list of these options. For example:
  # kernelMountOptions: readdir_max_bytes=1048576,norbytes
  # The secrets must contain the credentials of a Ceph admin and/or user.
  csi.storage.k8s.io/provisioner-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/provisioner-secret-namespace: ceph-csi-cephfs
  csi.storage.k8s.io/controller-expand-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/controller-expand-secret-namespace: ceph-csi-cephfs
  csi.storage.k8s.io/node-stage-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/node-stage-secret-namespace: ceph-csi-cephfs
  # (optional) The driver can use either ceph-fuse (fuse)
  # or the ceph kernel client (kernel).
  # If not set, the default mounter is used, which is determined
  # by probing for ceph-fuse and mount.ceph.
  # mounter: kernel
reclaimPolicy: Delete
allowVolumeExpansion: true
mountOptions:
  - debug
Fill in clusterID here and apply the manifest in Kubernetes:
kubectl apply -f storageclass.yaml
Verification
To check, as in the previous example, let's create a PVC:
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: csi-cephfs-pvc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  storageClassName: csi-cephfs-sc
And check for PVC/PV:
kubectl get pvc
kubectl get pv
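Since the access mode is ReadWriteMany, the same claim can be mounted into several pods at once. A minimal sketch (the pod name and mount path here are arbitrary):

---
apiVersion: v1
kind: Pod
metadata:
  name: csi-cephfs-demo-pod
spec:
  containers:
    - name: web-server
      image: nginx:1.17.6
      volumeMounts:
        - name: mypvc
          mountPath: /data
  volumes:
    - name: mypvc
      persistentVolumeClaim:
        claimName: csi-cephfs-pvc
        readOnly: false

A second pod with the same persistentVolumeClaim block will see the same files, which is exactly what RBD could not give us.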
If you want to look at the files and directories in CephFS, you can mount this file system somewhere. For example, as shown below.
We go to one of the nodes of the Ceph cluster and perform the following actions:
# Mount point
mkdir -p /mnt/cephfs
# Create a file with the administrator key
ceph auth get-key client.admin >/etc/ceph/secret.key
# Add an entry to /etc/fstab
# !! Change the IP address to the address of your node
echo "172.18.8.6:6789:/ /mnt/cephfs ceph name=admin,secretfile=/etc/ceph/secret.key,noatime,_netdev 0 2" >> /etc/fstab
mount /mnt/cephfs
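A quick check that the mount worked:
df -h /mnt/cephfs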
Of course, mounting the FS on a Ceph node like this is only suitable for learning purposes, which is what we do in our Slurm courses.
And finally, let's check how volume resizing works with CephFS. We return to Kubernetes and edit our PVC manifest, increasing the size there to, say, 7Gi.
Apply the edited file:
kubectl apply -f pvc.yaml
Let's see how the quota has changed on the mounted directory:
getfattr -n ceph.quota.max_bytes <data-directory>
For this command to work, you may need to install the attr package.
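With the 7Gi claim above, the attribute should come out as 7 * 1024^3 bytes, i.e. something like:
ceph.quota.max_bytes="7516192768"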
The eyes are afraid, but the hands do the work
On the surface, all these incantations and long YAML manifests seem complicated, but in practice Slurm students pick them up quite quickly.
In this article we did not go deep into the weeds; there is official documentation for that. If you are interested in the details of setting up Ceph storage with a Kubernetes cluster, these links will help:
On the Slurm course
And if you are more interested in data storage, then sign up for
Article author: Alexander Shvalov, practicing engineer
Source: habr.com