Ceph via iSCSI - or skiing while standing in a hammock
Are there any Ceph operators among us who don't enjoy a bit of "professional extreme sport"?
Unlikely; otherwise we wouldn't keep tinkering with this extremely interesting and entertaining product.
Many of those who operate Ceph have run into one not-very-common (or rather, quite rare) but occasionally requested scenario: connecting Ceph via iSCSI or FC. What for? For example, to present an image from Ceph to a Windows or Solaris server that for some reason has not yet been virtualized. Or to a virtualized one running a hypervisor that cannot speak Ceph natively, and as we know there are plenty of those, for example Hyper-V or ESXi, both in active use. And when the task of presenting a Ceph image to such a guest machine comes up, it turns into a very exciting exercise.
So, given:
an already running Ceph cluster
an already existing image that must be served via iSCSI
Pool name mypool, image name myimage
Shall we begin?
First of all, when we talk about FC or iSCSI, we deal with two entities: the initiator and the target. The target is effectively the server; the initiator is the client. Our task is to present the Ceph image to the initiator with minimal effort. That means we need to deploy a target. But where, on which machine?
Fortunately, in a Ceph cluster we have at least one component whose IP address is fixed and on which one of the most important parts of Ceph runs: the monitor. So we install an iSCSI target on the monitor (and an initiator alongside it, at least for testing). I did this on CentOS, but the solution works on any other distribution as well; you just need to install the packages in whatever way is customary for your distribution.
# yum -y install iscsi-initiator-utils targetcli
What is the purpose of the installed packages?
targetcli — a utility for managing the SCSI target built into the Linux kernel
iscsi-initiator-utils — a package with utilities used to manage the iSCSI initiator built into the Linux kernel
To present an image via iSCSI to an initiator, events can develop in one of two ways: use the userspace backend of the target, or map the image as a block device visible to the operating system and export that device via iSCSI. We will take the second route: the userspace backend is still in an "experimental" state and not quite ready for production use. Besides, it has pitfalls that one can discuss at length and (oh, the horror!) argue about.
If we use even a somewhat stable distribution with a long support cycle, the kernel we get is some truly ancient version. For example, in CentOS 7 it is 3.10.*, in CentOS 8 it is 4.19. And we need a kernel of at least 5.3 (or rather 5.4) or newer. Why? Because by default Ceph images have a set of features enabled that are not compatible with older kernels. So we add a repository carrying a newer kernel for our distribution (for CentOS, for example, that is elrepo), install the new kernel, and reboot the system into it:
Connect to the monitor selected for the experiment
Install the kernel: yum -y --enablerepo=elrepo-kernel install kernel-ml
Reboot the server with the monitor (we have three monitors, right?)
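If upgrading the kernel is not an option in your environment, an alternative worth knowing is disabling the incompatible image features instead; a sketch using this article's pool and image names (note that once disabled, some features cannot be trivially re-enabled):

```shell
# Check which kernel we are running and which features the image has enabled
uname -r
rbd info mypool/myimage

# Disable the features that old kernels cannot handle
# (fast-diff depends on object-map, so they go together)
rbd feature disable mypool/myimage object-map fast-diff deep-flatten
```

This trades kernel-side functionality (faster diffs, flattening of clones) for compatibility with the stock distribution kernel.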
Connecting the image as a block device
# rbd map mypool/myimage
/dev/rbd0
All that remains is to configure the target. In this example I will configure the target in so-called demo mode: without authentication, visible and accessible to everyone. In a production environment you will likely want to configure authentication, but that is a bit out of scope for today's just-for-fun exercise.
Create a backend named disk1 associated with the file /dev/rbd/mypool/myimage. The specified file is a symbolic link to /dev/rbd0, created automatically by the udev daemon. We use the symbolic link because the name of the rbd device can change depending on the order in which Ceph images are mapped on the host.
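The commands for this step would look roughly as follows (a sketch: the backstore name disk1 and the IQN iqn.2020-01.demo.ceph:mypool are taken from the rest of this walkthrough; adjust to your setup):

```shell
# Create a block backstore named disk1 on top of the mapped RBD device,
# addressed via the stable udev symlink rather than /dev/rbd0
targetcli /backstores/block create disk1 /dev/rbd/mypool/myimage

# Create the iSCSI target with the IQN used throughout this article
targetcli /iscsi create iqn.2020-01.demo.ceph:mypool

# Export the backstore as a LUN in the target's first portal group
targetcli /iscsi/iqn.2020-01.demo.ceph:mypool/tpg1/luns create /backstores/block/disk1
```

After that, the demo-mode attributes below open the target up to any initiator without ACLs.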
# targetcli /iscsi/iqn.2020-01.demo.ceph:mypool/tpg1/ set
> attribute demo_mode_write_protect=0
# targetcli /iscsi/iqn.2020-01.demo.ceph:mypool/tpg1/ set
> attribute generate_node_acls=1
# targetcli /iscsi/iqn.2020-01.demo.ceph:mypool/tpg1/ set
> attribute cache_dynamic_acls=1
Save the configuration:
# targetcli saveconfig
Checking the presence of the target:
# iscsiadm -m discovery -t st -p 127.0.0.1:3260
127.0.0.1:3260,1 iqn.2020-01.demo.ceph:mypool
We connect the target:
# iscsiadm -m node --login
Logging in to [iface: default, target: iqn.2020-01.demo.ceph:mypool, portal: 127.0.0.1,3260] (multiple)
Login to [iface: default, target: iqn.2020-01.demo.ceph:mypool, portal: 127.0.0.1,3260] successful.
If you did everything correctly, a new disk appears on the server; it looks like a SCSI device but is actually an image from Ceph, accessed through an iSCSI target. To avoid problems at boot, it is better to remove the connected disk and the discovered target record from the local initiator:
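The cleanup on the local initiator can be done like this (the portal address matches the loopback discovery performed above):

```shell
# Log out of the target on the local initiator
iscsiadm -m node --logout

# Delete the discovered target record from the initiator database,
# so the node does not try to reconnect to it on every boot
iscsiadm -m discoverydb -t st -p 127.0.0.1:3260 -o delete
```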
All that remains is to persist the configuration so that the image is mapped automatically and, after the mapping, the target is started. Bringing up the target consists of two steps: mapping the RBD and actually starting the target.
First, let's configure the automatic connection of RBD images to the host. This is done by adding the following lines to the /etc/ceph/rbdmap file:
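A minimal entry might look like this (assuming the default client.admin user and the standard keyring location; substitute your own CephX user and keyring path):

```
mypool/myimage id=admin,keyring=/etc/ceph/ceph.client.admin.keyring
```

With the entry in place, enable the rbdmap service shipped with ceph-common and the target service from targetcli so both come up at boot: systemctl enable rbdmap target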
The final test is to reboot our monitor (which is now also an iSCSI target). Note that if we had not cleared the initiator's database with the command iscsiadm -m discoverydb -o delete ..., we could have ended up with a server that fails to boot or takes a very long time to boot.
What is left?
Configure the initiator on the server to which we want to present the target.
How to ensure fault tolerance of our target?
You can configure targets on the other monitors in the same way and set up multipath (VMware will understand this and even work; Hyper-V will not, since it requires SCSI reservations). Since the kernel Ceph client does not use caching, this is quite workable. Another option is to create a cluster resource of three components, a dedicated target IP address plus the rbdmap and scsi-target services, and manage that resource through clustering tools (who said Pacemaker?).
Instead of an epilogue
As you can see, this article is a bit of a joke, but in it I tried to cover, "quickly and with examples", several fairly popular topics at once: the iSCSI target, which can export not only Ceph images but also, say, LVM volumes; the basics of working with an iSCSI initiator (how to scan for a target, how to connect to a target, disconnect, and delete a target record from the database); writing your own systemd unit; and a few others.
I hope that even if you do not repeat this entire experiment in full, at least something from this article will be useful to you.