Clustering in Proxmox VE

In previous articles we introduced Proxmox VE and explained how it works. Today we look at how to use its clustering capability and what benefits it brings.

What is a cluster, and why do you need one? A cluster is a group of servers connected by high-speed communication links that work together and appear to the user as a single system. There are several main scenarios for using a cluster:

  • Fault tolerance (high availability).
  • Load balancing.
  • Higher performance (high-performance computing).
  • Distributed computing.

Each scenario places its own requirements on the cluster members. For example, a cluster built for distributed computing needs above all high floating-point performance and low network latency. Such clusters are often used for research purposes.

Since we have touched on distributed computing, it is worth noting that there is also such a thing as a grid system. Despite the general similarity, a grid should not be confused with a cluster; it is not a cluster in the usual sense. Unlike cluster nodes, the nodes of a grid are most often heterogeneous and are characterized by low availability. This approach simplifies solving distributed computing problems, but it does not allow the nodes to be combined into a single whole.

A striking example of a grid system is the popular computing platform BOINC (Berkeley Open Infrastructure for Network Computing). The platform was originally created for the SETI@home project (Search for Extra-Terrestrial Intelligence at Home), which searches for extraterrestrial intelligence by analyzing radio signals.

How it works: a huge array of data received from radio telescopes is split into many small pieces, which are sent to the nodes of the grid system (in the SETI@home project, volunteers' computers play the role of these nodes). Each node processes its piece and sends the result to the central server of the SETI project. In this way the project tackles a very complex global problem without owning the computing power it requires.
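The split, process, and collect cycle can be illustrated with a toy sketch. This is not BOINC code: the function names are ours, worker threads stand in for volunteer computers, and "analysis" is reduced to finding the peak value in each piece.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_work_units(samples, unit_size):
    """Cut the full data set into fixed-size pieces (the last one may be shorter)."""
    return [samples[i:i + unit_size] for i in range(0, len(samples), unit_size)]

def process_unit(unit):
    """Stand-in for the analysis a volunteer node runs: here, just the peak value."""
    return max(unit)

def run_grid(samples, unit_size, workers=4):
    units = split_into_work_units(samples, unit_size)
    # Each worker thread plays the role of one independent volunteer node.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(process_unit, units))
    # The "central server" combines the partial results into the final answer.
    return max(partial_results)

signal = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
print(run_grid(signal, unit_size=4))  # 9
```

The nodes never talk to each other, only to the central server, which is why heterogeneous and unreliable machines are acceptable in a grid.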

Now that we have a clear understanding of what a cluster is, let's look at how one can be created and used. We will use the open source virtualization system Proxmox VE.

Before creating a cluster, it is especially important to understand the limitations and system requirements of Proxmox, namely:

  • maximum number of nodes in a cluster - 32;
  • all nodes must have the same version of Proxmox (there are exceptions, but they are not recommended for production);
  • if you plan to use the High Availability functionality later, the cluster should have at least 3 nodes;
  • the following ports must be open between nodes: UDP/5404 and UDP/5405 for corosync, and TCP/22 for SSH;
  • network latency between nodes should not exceed 2 ms.
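The numeric limits from this list can be turned into a quick sanity check of a planned cluster. The sketch below is our own illustration, not a Proxmox tool: the Node fields, the version strings, and the latency figures are made-up inputs, and the port requirements are not checked.

```python
from dataclasses import dataclass

MAX_NODES = 32          # maximum cluster size quoted above
MIN_HA_NODES = 3        # minimum node count for High Availability
MAX_LATENCY_MS = 2.0    # maximum network latency between nodes

@dataclass
class Node:
    name: str
    pve_version: str    # hypothetical input, e.g. noted down from each node
    latency_ms: float   # hypothetical input, e.g. a measured round-trip time

def check_cluster_plan(nodes, want_ha=False):
    """Return a list of human-readable problems; an empty list means the plan passes."""
    problems = []
    if len(nodes) > MAX_NODES:
        problems.append(f"too many nodes: {len(nodes)} > {MAX_NODES}")
    if want_ha and len(nodes) < MIN_HA_NODES:
        problems.append(f"HA needs at least {MIN_HA_NODES} nodes, got {len(nodes)}")
    versions = sorted({n.pve_version for n in nodes})
    if len(versions) > 1:
        problems.append(f"mixed Proxmox versions: {', '.join(versions)}")
    for n in nodes:
        if n.latency_ms > MAX_LATENCY_MS:
            problems.append(f"{n.name}: latency {n.latency_ms} ms exceeds {MAX_LATENCY_MS} ms")
    return problems

plan = [Node("node1", "6.1", 0.4), Node("node2", "6.1", 0.5)]
for problem in check_cluster_plan(plan, want_ha=True):
    print(problem)
```

With two nodes and HA requested, the check reports that a third node is missing; without HA the same plan passes.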

Creating a Cluster

Important! The following configuration is a test setup. Be sure to check it against the official Proxmox VE documentation.

To run a test cluster, we took three servers with identical configurations (2 cores, 2 GB of RAM) and the Proxmox hypervisor installed.

If you want to know how to install Proxmox, we recommend reading our previous article, The magic of virtualization: an introductory course in Proxmox VE.

Initially, after the OS is installed, a single server runs in standalone mode.

Create a cluster by clicking the Create Cluster button in the Cluster section.

We give the future cluster a name and select an active network connection.

Click the Create button. The server generates a 2048-bit key and writes it, along with the parameters of the new cluster, to the configuration files.

The message TASK OK indicates that the operation completed successfully. Looking at the general system information, we can now see that the server has switched to cluster mode. So far the cluster consists of only one node, so it does not yet offer any of the capabilities a cluster is needed for.

Joining a Cluster

Before joining a node to the newly created cluster, we need to obtain the connection information. To do this, go to the Cluster section and click the Join Information button.

In the window that opens, we are interested in the contents of the Join Information field. It needs to be copied.

All the necessary connection parameters are encoded here: the server address to connect to and the digital fingerprint. We go to the server that is to be added to the cluster, click the Join Cluster button, and paste the copied content into the window that opens.

The Peer Address and Fingerprint fields will be filled in automatically. Enter the root password for node number 1, select the network connection, and click the Join button.

While a node is joining the cluster, the web GUI page may stop updating. That's fine; just reload the page. In exactly the same way we add another node, and as a result we get a full-fledged cluster of 3 working nodes.
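The same create and join steps can also be performed from the shell with the pvecm utility. The sketch below only builds and prints the command lines, it does not run them. The cluster name demo-cluster and the address 192.168.88.11 are placeholders we made up, not values from our test setup.

```python
import shlex

def create_cluster_cmd(cluster_name):
    """Command to run on the first node to create the cluster."""
    return ["pvecm", "create", cluster_name]

def join_cluster_cmd(existing_member):
    """Command to run on the joining node; it points at a node already in the cluster."""
    return ["pvecm", "add", existing_member]

def status_cmd():
    """Command that prints membership and quorum information."""
    return ["pvecm", "status"]

# Placeholder values, not taken from the test setup.
for cmd in (create_cluster_cmd("demo-cluster"),
            join_cluster_cmd("192.168.88.11"),
            status_cmd()):
    print(shlex.join(cmd))
```

As in the GUI, joining from the command line asks for the root password of the existing node.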

Now we can manage all cluster nodes from a single GUI.

High Availability

Out of the box, Proxmox supports HA for both virtual machines and LXC containers. The ha-manager utility detects and handles errors and failures, performing a failover from a failed node to a working one. For the mechanism to work correctly, the virtual machines and containers must reside on shared storage.

After High Availability is activated, the ha-manager software stack continuously monitors the state of the virtual machine or container and interacts asynchronously with the other cluster nodes.

Attaching shared storage

As an example, we deployed a small NFS file share at 192.168.88.18. For all nodes of the cluster to be able to use it, take the following steps.

In the web interface menu, select Datacenter - Storage - Add - NFS.

Fill in the ID and Server fields. In the Export drop-down list, select the desired directory from the available ones, and in the Content list, select the required data types. After you click the Add button, the storage will be connected to all cluster nodes.

When creating virtual machines and containers on any of the nodes, we select our NFS share as the storage.
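For reference, an NFS share can also be registered from the shell with pvesm. The sketch below only assembles the command line. The server address is the one from our example; the storage ID nfs-shared and the export path /srv/nfs/pve are hypothetical, since the article does not name them.

```python
import shlex

def add_nfs_storage_cmd(storage_id, server, export, content):
    """Build the pvesm command line that registers an NFS share as storage."""
    return ["pvesm", "add", "nfs", storage_id,
            "--server", server,
            "--export", export,
            "--content", ",".join(content)]

cmd = add_nfs_storage_cmd(
    storage_id="nfs-shared",        # hypothetical storage ID
    server="192.168.88.18",         # the NFS server from this article
    export="/srv/nfs/pve",          # hypothetical export path
    content=["images", "rootdir"],  # VM disk images and container root filesystems
)
print(shlex.join(cmd))
```

The content types play the same role as the Content list in the GUI: images is needed for VM disks and rootdir for containers.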

Setting up HA

As an example, let's create a container with Ubuntu 18.04 and configure High Availability for it. After creating and starting the container, go to Datacenter - HA - Add. In the dialog that opens, specify the virtual machine/container ID and the maximum number of attempts to restart it and to relocate it between nodes.

If this number is exceeded, the hypervisor will mark the VM as failed and put it in the Error state, after which it will stop performing any actions with it.
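How the two limits interact can be shown with a deliberately simplified model. This is our own sketch, not the ha-manager algorithm: we assume one initial start plus up to max_restart retries on each node, and up to max_relocate moves to the next node in a fixed list. The real counting and node selection may differ.

```python
def recover(start_ok, nodes, max_restart=1, max_relocate=1):
    """Return ("started", node) on success or ("error", None) once the limits are used up.

    start_ok(node) -> bool is a caller-supplied stand-in for
    "did the service start on this node?".
    """
    for move in range(max_relocate + 1):
        node = nodes[move % len(nodes)]
        # One initial start plus up to max_restart retries on the same node.
        for _ in range(max_restart + 1):
            if start_ok(node):
                return ("started", node)
    # Limits exceeded: the service is left in the error state.
    return ("error", None)

nodes = ["node1", "node2", "node3"]
only_node3_works = lambda node: node == "node3"
print(recover(only_node3_works, nodes, max_relocate=1))  # ('error', None)
print(recover(only_node3_works, nodes, max_relocate=2))  # ('started', 'node3')
```

In the first call the service is tried on node1 and node2 only, so it ends in the error state; allowing one more relocation lets it reach node3.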

After you click the Add button, the ha-manager utility notifies all nodes of the cluster that the VM with the specified ID is now managed and must be restarted on another node if a failure occurs.
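The same resource can be registered from the shell with ha-manager. The sketch below builds the command line and checks the service ID format first. The ID ct:100 is a placeholder, since we do not state the container's actual ID, and the limits shown are simply example values.

```python
import re
import shlex

def ha_add_cmd(service_id, max_restart=1, max_relocate=1):
    """Build the ha-manager command line that puts a VM or container under HA control."""
    # Service IDs take the form vm:<id> for virtual machines and ct:<id> for containers.
    if not re.fullmatch(r"(vm|ct):\d+", service_id):
        raise ValueError(f"expected vm:<id> or ct:<id>, got {service_id!r}")
    return ["ha-manager", "add", service_id,
            "--max_restart", str(max_restart),
            "--max_relocate", str(max_relocate)]

print(shlex.join(ha_add_cmd("ct:100")))       # placeholder container ID
print(shlex.join(["ha-manager", "status"]))   # shows the current HA state
```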

Simulating a Failure

To see exactly how the failover mechanism works, let's abruptly cut the power to node1 and watch from another node what happens to the cluster. We see that the system has registered the failure.

HA does not guarantee uninterrupted operation of the VM. As soon as a node goes down, the VM stops running until it is automatically restarted on another node.

And here the "magic" begins: the cluster automatically reassigned a node to run our VM, and within 120 seconds operation was restored automatically.

Now we cut the power to node2 as well. Let's see whether the cluster survives and whether the VM returns to a working state automatically.

Alas, as we can see, the only surviving node no longer has quorum, which automatically disables HA. In the console, we force quorum with the following command.

pvecm expected 1
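Why this helps can be shown with simple vote arithmetic. The sketch assumes each node carries one vote (the default) and that quorum requires more than half of the expected votes; the function names are ours.

```python
def votes_needed(expected_votes):
    """Smallest number of votes that is a strict majority of the expected votes."""
    return expected_votes // 2 + 1

def has_quorum(online_votes, expected_votes):
    return online_votes >= votes_needed(expected_votes)

# Three-node cluster, one vote per node.
for online in (3, 2, 1):
    print(online, "node(s) online:", has_quorum(online, expected_votes=3))

# Lowering the expected votes to 1 lets the last surviving node reach quorum.
print("after 'pvecm expected 1':", has_quorum(1, expected_votes=1))
```

With three expected votes, two nodes still form a majority, which is why the cluster survived the loss of node1 but not the loss of node2 as well. The same arithmetic explains the three-node minimum for HA: in a two-node cluster, losing either node already breaks quorum.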

After 2 minutes, the HA mechanism worked correctly and, not finding node2, started our VM on node3.

As soon as we powered node1 and node2 back on, the cluster was fully restored. Note that the VM does not migrate back to node1 on its own, but this can be done manually.

Summing up

We have described how the Proxmox clustering mechanism works and shown how HA is configured for virtual machines and containers. Proper use of clustering and HA greatly increases the reliability of the infrastructure and helps with disaster recovery.

Before creating a cluster, plan right away what it will be used for and how far it will need to scale in the future. Also check that the network infrastructure is ready to operate with minimal latency, so that the future cluster runs without failures.

Tell us: do you use Proxmox's clustering capabilities? We look forward to your comments.

Source: habr.com
