Seamless RabbitMQ to Kubernetes migration

RabbitMQ is a message broker written in Erlang that allows you to organize a failover cluster with full data replication to multiple nodes, where each node can serve read and write requests. With many Kubernetes clusters in production, we maintain a large number of RabbitMQ installations and are faced with the need to migrate data from one cluster to another without downtime.

This operation was necessary for us in at least two cases:

  1. Transferring data from a RabbitMQ cluster running outside of Kubernetes to a new, already "Kubernetized" cluster (i.e. one running in K8s pods).
  2. Migrating RabbitMQ within Kubernetes from one namespace to another (for example, when environments are separated by namespaces and the infrastructure needs to move from one environment to another).

The recipe proposed in this article focuses on (but is by no means limited to) situations where there is an old RabbitMQ cluster (of, say, 3 nodes) that either already lives in K8s or sits on some old servers, and an application hosted in Kubernetes works with it (already, or will in the future):

(Diagram: the old RabbitMQ cluster and the application working with it)

... and we are faced with the task of migrating it to a new production in Kubernetes.

First we will describe the general approach to the migration itself, and only then the technical details of its implementation.

Migration algorithm

The first, preliminary, stage before any actions is to check that high availability (HA) mode is enabled in the old RabbitMQ installation. The reason is obvious: we don't want to lose any data. To perform this check, open the RabbitMQ admin panel and, in the Admin → Policies tab, make sure that the value is set to ha-mode: all:

(Screenshot: the Admin → Policies tab showing a policy with ha-mode: all)
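The same check can be done from the command line. Here is a minimal sketch assuming the pod and namespace names used in the example later in this article; the DRY_RUN echo guard is ours, so the snippet only prints the command until you unset it:

```shell
# List active policies on the default vhost; the output should contain
# a policy whose definition includes "ha-mode":"all".
# Pod/namespace names are assumptions from this article's example setup.
POD=rmq-old-rabbitmq-ha-0
DRY_RUN=${DRY_RUN-echo}   # remove the echo guard to actually execute
$DRY_RUN kubectl -n rmq-old exec "$POD" -- rabbitmqctl list_policies -p /
```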

The next step is to bring up a new RabbitMQ cluster in Kubernetes pods (in our case, for example, consisting of 3 nodes, although their number may differ).

After that, we merge the old and new RabbitMQ clusters, getting a single cluster (of 6 nodes):

(Diagram: the merged RabbitMQ cluster of 6 nodes)

The process of data synchronization between the old and new RabbitMQ clusters is initiated. Once all data is synced across all nodes in the cluster, we can switch the application to use the new cluster:

(Diagram: the application switched over to the new cluster)

After these operations, it only remains to remove the old nodes from the RabbitMQ cluster, and the move can be considered complete:

(Diagram: the old nodes removed; the migration is complete)

We have used this scheme in our production many times. However, for our own convenience, we implemented it within a specialized system that rolls out typical RMQ configurations across sets of Kubernetes clusters (for the curious: we are talking about the addon-operator, which we wrote about just recently). Below are standalone instructions that anyone can apply to their own installations to try the proposed solution in action.

Let's try it in practice

Requirements

The requirements are very simple:

  1. A Kubernetes cluster (minikube will also do);
  2. A RabbitMQ cluster (it can be deployed on bare metal or set up as a regular cluster in Kubernetes from the official Helm chart).

For the example below, I deployed RMQ to Kubernetes and named it rmq-old.

Preparing the test bench

1. Download the Helm chart and edit it a bit:

helm fetch --untar stable/rabbitmq-ha

For convenience, let's set a password and ErlangCookie, and create an ha-all policy so that queues are synchronized between all nodes of the RMQ cluster by default:

rabbitmqPassword: guest
rabbitmqErlangCookie: mae9joopaol7aiVu3eechei2waiGa2we
definitions:
  policies: |-
    {
      "name": "ha-all",
      "pattern": ".*",
      "vhost": "/",
      "definition": {
        "ha-mode": "all",
        "ha-sync-mode": "automatic",
        "ha-sync-batch-size": 81920
      }
    }

2. Install the chart:

helm install . --name rmq-old --namespace rmq-old

3. Go to the RabbitMQ admin panel, create a new queue, and add a few messages. They will be needed so that after the migration we can make sure all the data has been preserved and nothing was lost:

(Screenshot: the test queue with messages in the admin panel)
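The queue and test messages can also be created without the UI, through the management HTTP API. A sketch assuming `kubectl port-forward svc/rmq-old-rabbitmq-ha 15672` is running in another terminal and the guest/guest credentials from values.yaml above; the queue name is made up for this example, and the DRY_RUN echo guard only prints the commands until you unset it:

```shell
# Create a durable queue and publish a few test messages via the
# RabbitMQ management HTTP API (%2F is the URL-encoded "/" vhost).
API=http://localhost:15672/api
QUEUE=test-queue
DRY_RUN=${DRY_RUN-echo}   # remove the echo guard to actually execute
$DRY_RUN curl -u guest:guest -X PUT "$API/queues/%2F/$QUEUE" \
  -H 'content-type: application/json' -d '{"durable": true}'
for i in 1 2 3; do
  $DRY_RUN curl -u guest:guest -X POST "$API/exchanges/%2F/amq.default/publish" \
    -H 'content-type: application/json' \
    -d "{\"routing_key\": \"$QUEUE\", \"payload\": \"msg-$i\", \"payload_encoding\": \"string\", \"properties\": {}}"
done
```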

The test bench is ready: we have the "old" RabbitMQ with the data that needs to be transferred.

Migrating a RabbitMQ Cluster

1. First, let's deploy the new RabbitMQ in a different namespace with the same ErlangCookie and user password. To do this, perform the operations described above, changing the final RMQ installation command to the following:

helm install . --name rmq-new --namespace rmq-new

2. Now we need to merge the new cluster with the old one. To do this, go to each pod of the new RabbitMQ and run the commands:

export OLD_RMQ=rabbit@rmq-old-rabbitmq-ha-0.rmq-old-rabbitmq-ha-discovery.rmq-old.svc.cluster.local && \
  rabbitmqctl stop_app && \
  rabbitmqctl join_cluster $OLD_RMQ && \
  rabbitmqctl start_app

The OLD_RMQ variable contains the address of one of the nodes of the old RMQ cluster.

These commands will stop the current node of the new RMQ cluster, join it to the old cluster, and start it again.
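Instead of execing into each pod by hand, this step can be scripted. A sketch assuming the pod and Service names produced by the two Helm releases in this example; the DRY_RUN echo guard only prints the commands until you unset it:

```shell
# Join every pod of the new cluster to the old one.
OLD_RMQ=rabbit@rmq-old-rabbitmq-ha-0.rmq-old-rabbitmq-ha-discovery.rmq-old.svc.cluster.local
DRY_RUN=${DRY_RUN-echo}   # remove the echo guard to actually execute
for i in 0 1 2; do
  $DRY_RUN kubectl -n rmq-new exec "rmq-new-rabbitmq-ha-$i" -- sh -c \
    "rabbitmqctl stop_app && rabbitmqctl join_cluster $OLD_RMQ && rabbitmqctl start_app"
done
```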

3. The RMQ cluster of 6 nodes is ready:

(Screenshot: the 6-node cluster in the admin panel)

Now we must wait until the messages are synchronized between all nodes. It is easy to guess that the synchronization time depends on the capacity of the hardware the cluster is deployed on and on the number of messages. In the scenario described there are only 10 of them, so the data synchronized instantly, but with a sufficiently large number of messages synchronization can take hours.

So the sync status is:

(Screenshot: queue synchronization status, showing +5 next to the Node field)

Here +5 means the messages are already present on 5 more nodes (besides the one specified in the Node field). Thus, synchronization was successful.
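The same synchronization information is available from the CLI: with ha-mode: all in a 6-node cluster, each queue should eventually report 5 synchronized mirrors. A sketch, again with this example's pod names and our DRY_RUN echo guard:

```shell
# Show per-queue message counts and synchronized mirrors.
POD=rmq-new-rabbitmq-ha-0
DRY_RUN=${DRY_RUN-echo}   # remove the echo guard to actually execute
$DRY_RUN kubectl -n rmq-new exec "$POD" -- \
  rabbitmqctl list_queues name messages synchronised_slave_pids
```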

4. It only remains to switch the RMQ address in the application to the new cluster (the specific steps here depend on your technology stack and other specifics of the application), after which you can say goodbye to the old one.

For the last operation (i.e. after the application has already been switched to the new cluster), go to each node of the old cluster and run the following commands:

rabbitmqctl stop_app
rabbitmqctl reset

Now the cluster has "forgotten" the old nodes: you can delete the old RMQ, and the move is complete.
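These two commands can also be run across all old nodes in one loop. A sketch with this example's rmq-old pod names; the DRY_RUN echo guard only prints the commands until you unset it:

```shell
# Stop and reset every node of the old cluster so it leaves the cluster.
DRY_RUN=${DRY_RUN-echo}   # remove the echo guard to actually execute
for i in 0 1 2; do
  $DRY_RUN kubectl -n rmq-old exec "rmq-old-rabbitmq-ha-$i" -- \
    sh -c 'rabbitmqctl stop_app && rabbitmqctl reset'
done
```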

Note: if you use RMQ with certificates, nothing fundamentally changes - the migration proceeds in exactly the same way.

Conclusions

The described scheme is suitable for almost all cases when you need to migrate RabbitMQ or simply move to a new cluster.

In our case, difficulties arose only once, when RMQ was accessed from many places and we could not change the RMQ address to the new one everywhere. We then launched the new RMQ in the same namespace with the same labels so that it would be picked up by the existing Services and Ingresses; when launching the pod, we manipulated the labels by hand, removing them at the start so that requests would not hit the empty RMQ, and adding them back after message synchronization.

We used the same strategy when upgrading RabbitMQ to a new version with a changed configuration - everything worked like clockwork.

PS

As a logical continuation of this material, we are preparing articles about MongoDB (migration from an iron server to Kubernetes) and MySQL (how we prepare this DBMS inside Kubernetes). They will be published in the coming months.

Source: habr.com