LINSTOR storage and its integration with OpenNebula

Not so long ago, the folks at LINBIT presented their new SDS solution, Linstor. It is completely free storage built on proven technologies: DRBD, LVM and ZFS. Linstor combines simplicity with a well-designed architecture, which makes it possible to achieve stability and quite impressive results.

Today I would like to talk a little more about it and show how easily it can be integrated with OpenNebula using linstor_un, a new driver that I developed specifically for this purpose.

Linstor in combination with OpenNebula allows you to build a fast and reliable cloud that can be easily deployed on your own infrastructure.

Linstor architecture

Linstor is neither a file system nor block storage in itself; it is an orchestrator that provides an abstraction layer to automate the creation of volumes in LVM or ZFS and replicate them using DRBD9.

We break stereotypes

But wait, DRBD? Why automate it, and how can that work at all?

Let's remember the past, when DRBD8 was very popular. Its standard use meant creating one large block device and cutting it into many small pieces, using, for example, LVM. A kind of mdadm RAID-1, but with network replication.

This approach is not without drawbacks, so with the advent of DRBD9 the principles of storage construction have changed: now a separate DRBD device is created for each virtual machine.

The independent block device approach allows better utilization of space in the cluster and also adds a number of extra features. For example, for each such device you can set the number of replicas, their placement and individual settings. The devices are easy to create and delete, snapshot, resize, encrypt, and more. It is worth noting that DRBD9 also supports quorum, which helps avoid split-brain situations.
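
For a taste of what this looks like in practice, here is a minimal sketch of such per-resource operations with the linstor CLI (installed further below); the resource name vm-disk and the sizes are hypothetical, and the exact syntax may vary between linstor-client versions:

linstor resource-definition create vm-disk
linstor volume-definition create vm-disk 10G
linstor resource create vm-disk --auto-place 3      # three replicas, placed automatically
linstor snapshot create vm-disk snap1               # snapshots work on thin LVM and ZFS backends
linstor volume-definition set-size vm-disk 0 20G    # grow volume 0 of the resource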

Resources and backends

When creating a new block device, Linstor places the required number of replicas on different nodes in the cluster. We will call each such replica a DRBD resource.

Resources are of two types:

  • Data resource - a DRBD device backed by data on a node, in an LVM or ZFS pool.
    Several backends are supported at the moment and their number is constantly growing: LVM, ThinLVM and ZFS. The last two allow you to create and use snapshots.
  • Diskless resource - a DRBD device hosted on a node without a storage backend; it can still be used as a regular block device, and all read/write operations are redirected to the data resources. The closest analogue of a diskless resource is an iSCSI LUN.

Each DRBD resource can have up to 8 replicas, and by default only one of them can be active, the Primary. All the others are Secondary and cannot be used while at least one Primary exists; they simply keep replicating data between themselves.

When a DRBD device is mounted on a node, it automatically becomes Primary, so even a diskless resource, in DRBD terminology, can be Primary.
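
For illustration, manual promotion and demotion with drbdadm might look like the sketch below, assuming a resource named myres already exists on the node (mounting the device does the promotion automatically):

drbdadm status myres      # shows the current role (Primary/Secondary) and the peers' state
drbdadm primary myres     # promote the local replica manually
drbdadm secondary myres   # demote it again once the device is no longer in use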

So why do we need Linstor?

Since all resource-intensive work is delegated to the kernel, Linstor itself is essentially a regular Java application that lets you easily automate the creation of DRBD resources.
At the same time, each resource it creates is an independent DRBD cluster that works on its own, regardless of the state of the control plane and of other DRBD resources.

Linstor consists of only two components:

  • linstor-controller - the main controller, which provides an API for creating and managing resources; the client connects to this API, as shown in the sketch after this list. It also communicates with the satellites, checks the free space on them, and sends them jobs to create and delete resources. It runs in a single instance and uses a database that can be either internal (H2) or external (PostgreSQL, MySQL, MariaDB).
  • linstor-satellite - installed on all storage nodes; it provides the controller with information about free space and performs the tasks received from the controller to create and delete volumes and the DRBD devices on top of them.
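
By default the linstor client assumes the controller is running locally, but it can be pointed at a remote controller as well; a small sketch, with a hypothetical controller address:

linstor --controllers 192.168.1.10 node list    # one-off: name the controller explicitly

export LS_CONTROLLERS=192.168.1.10              # or set it for the whole shell session
linstor node list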

Linstor operates with the following key concepts:

  • Node - a physical server on which DRBD resources will be created and used.
  • Storage pool - an LVM or ZFS pool created on a node, in which DRBD resources will be placed. A diskless pool is also possible; it can hold only diskless resources.
  • Resource definition - the definition of a resource; in essence it is a prototype that describes the resource's name and all of its properties.
  • Volume definition - the definition of a volume. Each resource can consist of several volumes, and each volume must have a size.
  • Resource - the created instance of a block device; each resource must be placed on a specific node and in some storage pool.

Installation of Linstor

I recommend using Ubuntu, since a ready-made PPA exists for it:

add-apt-repository ppa:linbit/linbit-drbd9-stack
apt-get update

Or Debian, where Linstor can be installed from the official Proxmox repository:

wget -O- https://packages.linbit.com/package-signing-pubkey.asc | apt-key add -
PVERS=5 && echo "deb http://packages.linbit.com/proxmox/ proxmox-$PVERS drbd-9.0" > \
    /etc/apt/sources.list.d/linbit.list
apt-get update

Controller

Here everything is simple:

apt-get install linstor-controller linstor-client
systemctl enable linstor-controller
systemctl start linstor-controller

Storage nodes

The Linux kernel currently ships with an in-tree DRBD8 kernel module; unfortunately, it does not suit us, so we need to install DRBD9:

apt-get install drbd-dkms

As practice shows, most difficulties arise precisely because the DRBD8 module ends up loaded into the system instead of DRBD9. Luckily, this is easy to check by running:

modprobe drbd
cat /proc/drbd

If you see version: 9, everything is fine. If you see version: 8, something went wrong and you need to take additional steps to find out why.
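
If you want to automate this check on every node, a minimal sketch might look like this:

modprobe drbd
if grep -q '^version: 9' /proc/drbd; then
    echo "DRBD9 module loaded"
else
    echo "DRBD8 (or no DRBD9) module loaded - check that drbd-dkms built against the running kernel" >&2
    exit 1
fi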

Now let's install linstor-satellite and drbd-utils:

apt-get install linstor-satellite drbd-utils
systemctl enable linstor-satellite
systemctl start linstor-satellite

Create a cluster

Storage pools and nodes

As a backend we will take ThinLVM, because it is the simplest and supports snapshots.
Install lvm2, if you haven't already done so, and let's create a ThinLVM pool on all our storage nodes:

sudo vgcreate drbdpool /dev/sdb
sudo lvcreate -L 800G -T drbdpool/thinpool
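
To double-check the resulting volume group and thin pool before handing them to Linstor, you can, for example, run:

sudo vgs drbdpool
sudo lvs drbdpool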

All further actions can be performed directly on the controller.

Let's add our nodes:

linstor node create node1 127.0.0.11
linstor node create node2 127.0.0.12
linstor node create node3 127.0.0.13

Let's create storage pools:

linstor storage-pool create lvmthin node1 data drbdpool/thinpool
linstor storage-pool create lvmthin node2 data drbdpool/thinpool
linstor storage-pool create lvmthin node3 data drbdpool/thinpool

Now let's check the created pools:

linstor storage-pool list

If everything is done correctly, then we should see something like:

+-------------+-------+----------+-------------------+--------------+---------------+-------------------+
| StoragePool | Node  | Driver   | PoolName          | FreeCapacity | TotalCapacity | SupportsSnapshots |
|-------------+-------+----------+-------------------+--------------+---------------+-------------------|
| data        | node1 | LVM_THIN | drbdpool/thinpool |       64 GiB |        64 GiB | true              |
| data        | node2 | LVM_THIN | drbdpool/thinpool |       64 GiB |        64 GiB | true              |
| data        | node3 | LVM_THIN | drbdpool/thinpool |       64 GiB |        64 GiB | true              |
+-------------+-------+----------+-------------------+--------------+---------------+-------------------+

DRBD resources

Now let's try to create our new DRBD resource:

linstor resource-definition create myres
linstor volume-definition create myres 1G
linstor resource create myres --auto-place 2

Let's check the created resources:

linstor resource list 

+-------+----------+-------------+----------+---------+---------------+-----------+--------+----------+
| Node  | Resource | StoragePool | VolumeNr | MinorNr | DeviceName    | Allocated | InUse  | State    |
|-------+----------+-------------+----------+---------+---------------+-----------+--------+----------|
| node1 | myres    | data        | 0        | 1084    | /dev/drbd1084 | 52 KiB    | Unused | UpToDate |
| node2 | myres    | data        | 0        | 1084    | /dev/drbd1084 | 52 KiB    | Unused | UpToDate |
+-------+----------+-------------+----------+---------+---------------+-----------+--------+----------+

Great! We can see that the resource was created on the first two nodes; we can also try to create a diskless resource on the third:

linstor resource create --diskless node3 myres

On the nodes this device will always be available as /dev/drbd1084 or /dev/drbd/by-res/myres/0.
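
To convince yourself that it behaves like an ordinary local disk, a quick hedged check might look like this (the by-res path comes from the example above; the node where it is mounted will show up as Primary):

mkfs.ext4 /dev/drbd/by-res/myres/0
mount /dev/drbd/by-res/myres/0 /mnt
drbdadm status myres
umount /mnt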

This is how Linstor works; you can find more details in the official documentation.

Now let's talk about how to integrate it with OpenNebula.

Setting up OpenNebula

I won't go too deep into the OpenNebula setup process, since all the steps are described in detail in the official documentation, which I recommend you refer to; here I will only cover the integration of OpenNebula with Linstor.

linstor_un

To solve this problem, I wrote my own driver, linstor_un. It is currently available as a plugin and must be installed separately.

The entire installation is performed on the frontend OpenNebula nodes and does not require additional actions on the compute nodes.

First of all, we need to make sure that we have jq and linstor-client installed:

apt-get install jq linstor-client

The linstor node list command should display a list of nodes. All OpenNebula compute nodes must be added to the Linstor cluster.
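
Compute-only hypervisors do not need a storage pool of their own: it is enough to register them as satellites so they can attach diskless resources. A sketch, with hypothetical node names and addresses:

linstor node create node4 192.168.1.14
linstor node create node5 192.168.1.15
linstor node list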

Download and install the plugin:

curl -L https://github.com/OpenNebula/addon-linstor_un/archive/master.tar.gz | tar -xzvf - -C /tmp

mv /tmp/addon-linstor_un-master/vmm/kvm/* /var/lib/one/remotes/vmm/kvm/

mkdir -p /var/lib/one/remotes/etc/datastore/linstor_un
mv /tmp/addon-linstor_un-master/datastore/linstor_un/linstor_un.conf /var/lib/one/remotes/etc/datastore/linstor_un/linstor_un.conf

mv /tmp/addon-linstor_un-master/datastore/linstor_un /var/lib/one/remotes/datastore/linstor_un
mv /tmp/addon-linstor_un-master/tm/linstor_un /var/lib/one/remotes/tm/linstor_un

rm -rf /tmp/addon-linstor_un-master

Now we need to register the new driver in the OpenNebula configuration; to do this, follow the simple steps described in the project README.
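
For orientation only: the registration boils down to appending linstor_un to the datastore and transfer-manager driver lists in /etc/one/oned.conf and adding TM_MAD_CONF and DS_MAD_CONF entries for it. The fragment below is an illustrative sketch and the attribute values are assumptions; take the authoritative snippet from the project README.

# /etc/one/oned.conf (fragment, illustrative)
DATASTORE_MAD = [
    EXECUTABLE = "one_datastore",
    ARGUMENTS  = "-t 15 -d dummy,fs,lvm,ceph,dev,iscsi_libvirt,vcenter,linstor_un -s shared,ssh,ceph,fs_lvm,qcow2,linstor_un"
]
TM_MAD = [
    EXECUTABLE = "one_tm",
    ARGUMENTS  = "-t 15 -d dummy,lvm,shared,fs_lvm,qcow2,ssh,ceph,dev,vcenter,iscsi_libvirt,linstor_un"
]
TM_MAD_CONF = [
    NAME = "linstor_un", LN_TARGET = "NONE", CLONE_TARGET = "SELF", SHARED = "YES", ALLOW_ORPHANS = "yes"
]
DS_MAD_CONF = [
    NAME = "linstor_un", REQUIRED_ATTRS = "BRIDGE_LIST", PERSISTENT_ONLY = "NO", MARKETPLACE_ACTIONS = "export"
]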

Then restart OpenNebula:

systemctl restart opennebula

Now let's add our datastores. First, the system datastore:

cat > system-ds.conf <<EOT
NAME="linstor-system"
TYPE="SYSTEM_DS"
STORAGE_POOL="data"
AUTO_PLACE="2"
CLONE_MODE="snapshot"
CHECKPOINT_AUTO_PLACE="1"
BRIDGE_LIST="node1 node2 node3"
TM_MAD="linstor_un"
EOT

onedatastore create system-ds.conf

And the image store:

cat > images-ds.conf <<EOT
NAME="linstor-images"
TYPE="IMAGE_DS"
STORAGE_POOL="data"
AUTO_PLACE="2"
BRIDGE_LIST="node1 node2 node3"
DISK_TYPE="BLOCK"
DS_MAD="linstor_un"
TM_MAD="linstor_un"
EOT

onedatastore create images-ds.conf
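
Once both datastores are created, it is worth checking that OpenNebula sees them and reports their capacity (the actual output will of course differ):

onedatastore list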

  • The AUTO_PLACE parameter specifies the number of data replicas that will be created for each new image in OpenNebula.
  • The CLONE_MODE parameter specifies exactly how images will be cloned when creating new virtual machines: snapshot - creates a snapshot of the image and deploys the virtual machine from that snapshot; copy - makes a full copy of the image for each virtual machine.
  • In BRIDGE_LIST it is recommended to list all the nodes that will be used to perform image cloning operations.

For a complete list of supported options, see the project's README.

This completes the setup; now you can download an appliance from the official OpenNebula Marketplace and create virtual machines from it.
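
As a rough sketch (the appliance and template names here are hypothetical), importing an appliance into the new image datastore and launching a VM from it might look like this:

onemarketapp export 'Ubuntu 18.04' ubuntu1804 --datastore linstor-images
onetemplate instantiate ubuntu1804 --name vm1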

Link to the project:
https://github.com/OpenNebula/addon-linstor_un

Source: habr.com
