How to migrate to the cloud in two hours thanks to Kubernetes and automation

The URUS company has tried Kubernetes in several forms: self-managed on bare metal, on Google Cloud, and finally on the Mail.ru Cloud Solutions (MCS) cloud, where it moved its platform. Igor Shishkin (t3ran), senior system administrator at URUS, tells how the team chose a new cloud provider and how they managed to migrate to it in a record two hours.

What does URUS do?

There are many ways to improve the quality of the urban environment, and one of them is to make it more environmentally friendly. This is exactly what URUS - Smart Digital Services is working on: it implements solutions that help enterprises monitor key environmental indicators and reduce their negative impact on the environment. Sensors collect data on air composition, noise levels, and other parameters, then send them to the unified URUS-Ekomon platform for analysis and recommendations.

How URUS works from the inside

A typical URUS client is a company located in or near a residential area: a factory, a port, a railway depot, or any other industrial site. If a client has already received a warning or been fined for environmental pollution, or simply wants to make less noise or reduce harmful emissions, they come to us, and we offer them a ready-made environmental monitoring solution.

Figure: H2S concentration graph showing regular nighttime emissions from a nearby facility

The devices that we use at URUS contain several sensors that collect information about the content of certain gases, noise levels and other data to assess the environmental situation. The exact number of sensors is always determined by the specific task.

Depending on the specifics of the measurements, devices with sensors can be mounted on building walls, on poles, and in other arbitrary places. Each such device collects information, aggregates it, and sends it to the data-receiving gateway, where we save it to long-term storage and pre-process it for further analysis. The simplest example of what we get as output after analysis is the air quality index (AQI).
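The article doesn't give URUS's exact AQI formula, but the widely used approach is the US EPA piecewise-linear interpolation between concentration breakpoints. A minimal sketch for 24-hour PM2.5, assuming the standard EPA breakpoint table:

```python
# Sketch of AQI computation via the standard US EPA piecewise-linear
# method; URUS's actual formula and pollutant set may differ.
# Breakpoints: (C_lo, C_hi, I_lo, I_hi) for 24-hour PM2.5 in ug/m3.
PM25_BREAKPOINTS = [
    (0.0, 12.0, 0, 50),
    (12.1, 35.4, 51, 100),
    (35.5, 55.4, 101, 150),
    (55.5, 150.4, 151, 200),
    (150.5, 250.4, 201, 300),
    (250.5, 500.4, 301, 500),
]

def pm25_aqi(concentration: float) -> int:
    """Linearly interpolate the AQI for a PM2.5 concentration."""
    for c_lo, c_hi, i_lo, i_hi in PM25_BREAKPOINTS:
        if c_lo <= concentration <= c_hi:
            return round((i_hi - i_lo) / (c_hi - c_lo) * (concentration - c_lo) + i_lo)
    raise ValueError(f"concentration out of range: {concentration}")
```

The overall index is typically the maximum of the per-pollutant sub-indices computed this way.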

In parallel, many other services run on our platform, but most of them are auxiliary. For example, the notification service sends alerts to clients if any monitored parameter (say, the CO2 level) exceeds its allowable value.
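The core of such a notification service is a threshold check over the latest readings. A minimal sketch; the parameter names and limits below are illustrative assumptions, not URUS's actual configuration:

```python
# Hypothetical threshold check for a notification service.
# Parameter names and limits are made up for illustration.
THRESHOLDS = {"co2_ppm": 1000, "noise_db": 55}

def exceeded(readings: dict) -> list[str]:
    """Return an alert message for every monitored parameter over its limit."""
    alerts = []
    for param, limit in THRESHOLDS.items():
        value = readings.get(param)
        if value is not None and value > limit:
            alerts.append(f"{param} = {value} exceeds the allowed {limit}")
    return alerts
```

In a real service these messages would be handed to a delivery channel (email, messenger, webhook) rather than returned to the caller.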

How we store data. History from Kubernetes to bare metal

The URUS environmental monitoring project has several data stores. In the first we keep "raw" data - exactly what we received from the devices themselves. This store works like a "magnetic tape", as on old cassettes: an append-only history of all readings. The second type of store holds pre-processed data: device readings enriched with metadata about sensor connections and about the devices themselves - which organizations they belong to, their locations, and so on. This information lets you see in dynamics how a given indicator changed over a certain period. We also use the "raw" store as a backup and for rebuilding the pre-processed data, should the need arise.
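The enrichment step described above can be pictured as a join between a raw reading and a device metadata catalogue. A sketch under assumed field names (the actual URUS schema is not described in the article):

```python
# Illustrative sketch of rebuilding enriched records from the raw,
# append-only store; all field names here are assumptions.
from dataclasses import dataclass

@dataclass
class RawReading:
    device_id: str
    timestamp: int   # unix seconds
    values: dict     # e.g. {"h2s_ppm": 0.02}

# Hypothetical metadata catalogue keyed by device id.
DEVICE_META = {
    "dev-42": {"organization": "Port of X", "location": "pier 3"},
}

def enrich(raw: RawReading) -> dict:
    """Join a raw reading with device metadata for the analytics store."""
    meta = DEVICE_META.get(raw.device_id, {})
    return {"device_id": raw.device_id, "timestamp": raw.timestamp,
            **raw.values, **meta}
```

Because the raw store is append-only, replaying it through `enrich` is enough to rebuild the pre-processed store from scratch, which is exactly the backup/restore property mentioned above.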

When we were looking for a solution to the storage problem a few years ago, we had two platform options: Kubernetes and OpenStack. Since the latter looks rather monstrous (just look at its architecture to see this), we settled on Kubernetes. Another argument in its favor was the relatively simple software management and the ability to slice even bare-metal nodes into resources more flexibly.

In parallel with the development of Kubernetes itself, we also studied data storage methods. While we kept all our storage in Kubernetes on our own hardware, we gained excellent expertise. Everything we had back then lived in Kubernetes: stateful storage, the monitoring system, CI/CD. Kubernetes became an all-in-one platform for us.

But we wanted to consume Kubernetes as a service rather than support and develop it ourselves. Plus, we didn't like how much it cost us to maintain on bare metal, and we needed to develop it all the time! For example, one of the first tasks was to fit Kubernetes Ingress controllers into our organization's network infrastructure. This is a cumbersome task, especially considering that at the time nothing was ready for programmatic resource management, such as DNS records or IP address allocation. Later we began experimenting with external data storage. We never got as far as implementing a PVC controller, but even then it was clear this was a large body of work that would need dedicated specialists.

Switching to Google Cloud Platform: a temporary solution

We realized we couldn't continue like this and moved our data from bare metal to Google Cloud Platform. Frankly, at the time there weren't many interesting options for a Russian company: besides Google Cloud Platform, only Amazon offered a similar service, but we still settled on Google's solution. Back then it seemed more cost-effective to us and closer to upstream, not to mention that Google itself is something of a production proof of concept for Kubernetes.

The first major problem appeared on the horizon as our client base grew. When we needed to store personal data, we faced a choice: either keep working with Google and violate Russian law, or find an alternative in the Russian Federation. The choice, on the whole, was predictable. πŸ™‚

How we saw the ideal cloud service

By the time the search began, we already knew what we wanted from a future cloud provider. We were looking for a service that is:

  • Fast and flexible. One where we can quickly add a new node or deploy something at any time.
  • Inexpensive. The financial question worried us a lot, since we were limited in resources. We already knew we wanted to work with Kubernetes, and the task now was to minimize its cost in order to increase, or at least maintain, the efficiency of this solution.
  • Automated. We planned to work with the service through an API, without managers, phone calls, or situations where you have to bring up several dozen nodes by hand in emergency mode. Since most of our processes are automated, we expected the same from the cloud service.
  • With servers in Russia. Naturally, we planned to comply with Russian legislation, including 152-FZ.

At the time there were few Kubernetes-as-a-service providers in Russia, and while choosing one it was important for us not to compromise on our priorities. The Mail.ru Cloud Solutions team, with which we started working and are still cooperating, provided a fully automated service with API support and a convenient control panel based on Horizon - with it we could quickly bring up an arbitrary number of nodes.

How we managed to migrate to MCS in two hours

Many companies face difficulties and setbacks in such moves, but in our case there were none. We were lucky: since we had already been running on Kubernetes before the migration, we simply corrected three files and launched our services on the new cloud platform, on MCS. Let me remind you that by then we had finally left bare metal and were living on Google Cloud Platform. So the move itself took no more than two hours, plus a little more time (about an hour) to copy data from our devices. Back then we were already using Spinnaker, a multi-cloud Continuous Delivery service. We quickly added it to the new cluster as well and continued working as usual.

Thanks to the automation of development processes and CI/CD, Kubernetes at URUS is handled by a single specialist (and that's me). At one point another system administrator worked with me, but then it turned out we had already automated the entire main routine, while our core product kept generating more and more tasks, so it made sense to direct resources there.

We got what we expected from the cloud provider, since we started the cooperation without illusions. The incidents we did have were mostly technical and easily explained by the relative freshness of the service. The main thing is that the MCS team promptly fixes shortcomings and responds quickly to questions in messengers.

Comparing this with the Google Cloud Platform experience: in their case I didn't even know where the feedback button was, because there was simply no need for it, and when problems did happen, Google itself sent out notifications unilaterally. But in the case of MCS, I consider it a big plus that they are as close as possible to Russian clients - both geographically and mentally.

How do we see cloud computing in the future?

Our work is now closely tied to Kubernetes, and it fully suits us for infrastructure tasks, so we don't plan to migrate away from it, although we are constantly introducing new practices and services to simplify routine tasks, automate new ones, and increase the stability and reliability of our services. For example, we are now launching a Chaos Monkey service (specifically, we use chaoskube, but that doesn't change the concept), an idea originally created by Netflix. Chaos Monkey does one simple thing: it deletes a random Kubernetes pod at a random time. This forces our services to live normally with n-1 instances, so we train ourselves to be ready for any problems.
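The core of the chaos-monkey idea is tiny: pick a random pod and delete it. A minimal sketch, with the actual deletion injected as a callback so the logic can be shown without a cluster (the real tool we use is chaoskube, which additionally filters pods by namespace and labels and runs on an interval):

```python
# Minimal sketch of the chaos-monkey idea: pick one pod at random
# and delete it. `delete` stands in for the Kubernetes API call
# (e.g. `kubectl delete pod <name>`); this is an illustration,
# not how chaoskube is actually implemented.
import random

def kill_random_pod(pods: list[str], delete, rng=random):
    """Pick one pod uniformly at random and hand it to `delete`."""
    if not pods:
        return None
    victim = rng.choice(pods)
    delete(victim)
    return victim
```

Running something like this on a schedule is what drills the team and the services into surviving with n-1 instances at any moment.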

I now see the use of third-party solutions - cloud platforms among them - as the only right choice for young companies. At the start of their journey they are usually limited in resources, both human and financial, and building and maintaining their own cloud or data center is too expensive and labor-intensive. Cloud providers minimize these costs: you can quickly get the resources needed to run services here and now, and pay for them after the fact. As for URUS, for now we will stay faithful to Kubernetes in the cloud. But who knows - we may have to expand geographically, or implement solutions based on some specific hardware. Or maybe the amount of resources consumed will justify our own bare-metal Kubernetes again, like in the good old days. πŸ™‚

What we learned from the experience of working with cloud services

We started using Kubernetes on bare metal, and even there it was good in its own way. But its strengths were truly revealed as an aaS component in the cloud. If you set a goal and automate everything as much as possible, you can avoid vendor lock-in, moving between cloud providers will take a couple of hours, and your nerve cells will stay with you. We can advise other companies: if you want to launch your own (cloud) service with limited resources and maximum development velocity, start right now by renting cloud resources, and build your own data center after Forbes writes about you.

Source: habr.com
