Modern infrastructure: problems and prospects

At the end of May we held an online meetup on the topic "Modern infrastructure and containers: problems and prospects". We talked about containers, Kubernetes and orchestration in general, criteria for choosing an infrastructure, and much more. Participants shared cases from their own practice.

Participants:

  • Evgeny Potapov, CEO of ITSumma. More than half of its customers are either already moving or looking to move to Kubernetes.
  • Dmitry Stolyarov, CTO of Flant. Has 10+ years of experience with container systems.
  • Denis Remchukov (aka Eric Oldmann), COO of argotech.io, formerly of RAO UES. He promised to talk about cases from the "bloody" enterprise.
  • Andrey Fedorovsky, CTO of News360.com. After the company was acquired by another player, he is responsible for a number of ML and AI projects and for infrastructure.
  • Ivan Kruglov, systems engineer, formerly of Booking.com. The very person who has done a lot with Kubernetes with his own hands.

Themes:

  • Participants' insights about containers and orchestration (Docker, Kubernetes, etc.): what they have tried in practice or analyzed.
  • Case: a company is drawing up an infrastructure development plan for several years ahead. How does it decide whether or not to build its infrastructure on containers and Kubernetes (or migrate the existing one)?
  • Problems of the cloud-native world: what is missing, and a bit of dreaming about what tomorrow will bring.

An interesting discussion ensued. The participants' opinions turned out to be so different and drew so many comments that I would like to share them with you. There is a three-hour video, and below is a summary of the discussion.

Is Kubernetes already a standard, or just great marketing?

“We came to it (Kubernetes. - Ed.) when no one knew about it yet. We came to it even before it existed. We wanted it before it was there.” Dmitry Stolyarov

Photo from reddit.com

About 5-10 years ago there were a huge number of tools and no single standard. Every six months a new product appeared, sometimes more than one. First Vagrant, then Salt, Chef, Puppet... “and you rebuild your infrastructure every six months. You have five admins who are constantly busy rewriting configs,” recalls Andrey Fedorovsky. He believes that Docker and Kubernetes have “crushed” the rest. Docker has become the standard over the last five years, Kubernetes over the last two. And that is good for the industry.

Dmitry Stolyarov and his team love Kubernetes. They wanted such a tool before it existed, and they came to it when no one knew about it yet. Today, for reasons of convenience, they do not take on a client if they see that they will not be able to roll out Kubernetes there. At the same time, according to Dmitry, the company has “a lot of giant success stories of reworking a terrible legacy.”

Kubernetes is not only container orchestration, it is a configuration management system with a developed API, a networking component, L3 balancing and Ingress controllers, which makes it relatively easy to manage resources, scale and abstract from the lower layers of infrastructure.
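The configuration-management side can be pictured as a loop that compares a declared state with the observed one and derives corrective actions. The sketch below is a toy model of that idea only: the `Deployment` class, the `reconcile` function and the action strings are made up for illustration and are not the Kubernetes API.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    # Hypothetical, simplified record; real Kubernetes objects are far richer.
    name: str
    desired_replicas: int
    running_replicas: int


def reconcile(dep: Deployment) -> list[str]:
    """Return the actions that move the observed state toward the declared one."""
    diff = dep.desired_replicas - dep.running_replicas
    if diff > 0:
        # Too few replicas: start the missing ones.
        return [f"start pod {dep.name}-{dep.running_replicas + i}" for i in range(diff)]
    if diff < 0:
        # Too many replicas: stop the extra ones, newest first.
        return [f"stop pod {dep.name}-{dep.running_replicas - 1 - i}" for i in range(-diff)]
    return []  # Observed state already matches the declared state.


print(reconcile(Deployment("web", desired_replicas=3, running_replicas=1)))
```

The point of the declarative style is that scaling becomes a change to one number, and the system works out the steps.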

Unfortunately, everything in life has a price. And this tax is high, especially when a company with a mature infrastructure moves to Kubernetes, says Ivan Kruglov. He could comfortably work both in a company with traditional infrastructure and in one that runs Kubernetes. The main thing is to understand the specifics of the company and the market. For Evgeny Potapov, by contrast, who would generalize Kubernetes to any container orchestration tool, the question does not even arise.

Evgeny drew an analogy with the 1990s, when object-oriented programming appeared as a way to build complex applications. Back then the debate went on and on, while new tools supporting OOP kept appearing. Then came microservices as a way to move away from the monolith. That, in turn, led to the emergence of containers and tools for managing them. “I think we will soon reach a point where nobody asks whether a small application should be written as a microservice; it will be written as one by default,” he believes. Likewise, Docker and Kubernetes will eventually become the standard solution that requires no choice.

The database problem in a stateless world

Photo by Twitter: @jankolario on Unsplash

Nowadays there are many recipes for running databases in Kubernetes, including ways to separate the part that handles disk I/O from what is, roughly speaking, the application part of the database. Is it possible that databases will change so much that they are shipped as a box in which one part is orchestrated through Docker and Kubernetes, while the storage part is provided elsewhere in the infrastructure by separate software? Will databases change as a product?

This description resembles queue management, but the requirements for reliability and synchronization of data in traditional databases are much higher, Andrey believes. The cache hit ratio in a normally running database is kept at 99%. If a worker goes down, a new one is started and its cache has to be “warmed up” from scratch. Until the cache is warm, the worker is slow, so it cannot be given user load. And without user load, the cache does not warm up. It is a vicious circle.
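A rough calculation shows why a cold cache hurts so much. The sketch below uses the standard weighted-average formula for read latency; the 0.1 ms cache and 10 ms disk figures are assumptions chosen for illustration, not numbers from the discussion.

```python
def average_latency_ms(hit_ratio: float, cache_ms: float = 0.1, disk_ms: float = 10.0) -> float:
    """Expected latency of one read: hits are served from cache, misses from disk.

    cache_ms and disk_ms are illustrative assumptions.
    """
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * disk_ms


warm = average_latency_ms(0.99)  # the 99% hit ratio mentioned above
cold = average_latency_ms(0.0)   # freshly started worker, empty cache
print(f"warm: {warm:.3f} ms, cold: {cold:.3f} ms, slowdown: {cold / warm:.0f}x")
```

Under these assumptions a warm worker answers in about 0.2 ms on average and a cold one in 10 ms, roughly a fifty-fold difference, which is why a fresh worker cannot simply take its share of traffic.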

Dmitry fundamentally disagrees: quorums and sharding solve the problem. But Andrey insists that this solution does not suit everyone. In some situations a quorum is fine, but it puts extra load on the network. And a NoSQL database is not suitable in every case.
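Both sides of this argument fit in a few lines. In Dynamo-style replicated stores, a read is guaranteed to see the latest acknowledged write when the write quorum W and the read quorum R together exceed the number of replicas N. The function names below are hypothetical, and the "replicas contacted" count is a deliberately crude proxy for network load.

```python
def quorum_overlaps(n: int, w: int, r: int) -> bool:
    """True if every read quorum shares at least one replica with every write quorum."""
    return w + r > n


def replicas_contacted(w: int, r: int) -> int:
    """Replicas that must answer for one write followed by one read."""
    return w + r


for n, w, r in [(3, 1, 1), (3, 2, 2), (5, 3, 3)]:
    print(f"N={n} W={w} R={r} overlap={quorum_overlaps(n, w, r)} "
          f"contacted={replicas_contacted(w, r)}")
```

With N=3, moving from W=R=1 to W=R=2 buys the overlap guarantee at the cost of waiting on twice as many replicas per operation, which is the extra network load Andrey refers to.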

The meeting participants were divided into two camps.

Denis and Andrey argue that anything that writes to disk, databases and the like, cannot be run in the current Kubernetes ecosystem. It is impossible to maintain the integrity and consistency of production data in Kubernetes. This is a fundamental property. The solution is a hybrid infrastructure.

Even modern cloud native databases like MongoDB and Cassandra, or message queues like Kafka or RabbitMQ require persistent data stores outside of Kubernetes.

Evgeny objects: “Databases in Kubernetes are a peculiarly Russian, or peculiarly enterprise, trauma, which stems from the fact that there is no cloud adoption in Russia.” Small and medium-sized companies in the West are in the cloud, and using Amazon RDS databases is easier than fiddling with Kubernetes yourself. In Russia, Kubernetes is used on-premise, and databases are moved into it when people try to get rid of the technology “zoo”.

Dmitry also disagreed with the claim that no database can be kept in Kubernetes: “Databases differ. If you shove in a giant relational database, then no way. If you shove in something small and cloud native, which is mentally prepared for a semi-ephemeral life, everything will be fine.” Dmitry also noted that database management tools are not ready for either Docker or Kubernetes, hence the major difficulties.

Ivan, in turn, is sure that even setting aside the notions of stateful and stateless, the ecosystem of enterprise solutions in Kubernetes is not ready yet. With Kubernetes it is hard to comply with the requirements of legislators and regulators. For example, it is not possible to build an identity provisioning solution that requires strong guarantees of server identity, all the way down to the hardware installed in the servers. This area is developing, but there is no solution so far.

The participants failed to agree, so this part comes with no conclusions. Let's look at a couple of practical examples instead.

Case 1. Cybersecurity at a “mega-regulator” with databases outside Kubernetes

With a mature cybersecurity system, containers and orchestration make it possible to fend off attacks and intrusions. For example, at one mega-regulator Denis and his team combined an orchestrator with a trained SIEM service that analyzes logs in real time and detects an attack, break-in or failure in progress. In the event of an attack, an attempt to plant something, or a ransomware intrusion, the system uses the orchestrator to bring up containers with applications faster than they are infected or attacked by the intruder.

Case 2. Partial migration of Booking.com databases to Kubernetes

At Booking.com the main database is MySQL with asynchronous replication: there is a master and a whole hierarchy of slaves. By the time Ivan left the company, a project had been launched to move the slaves, which can be “shot off” with only limited damage.

In addition to the main database, there is a Cassandra installation with home-grown orchestration, written before Kubernetes entered the mainstream. There are no problems with it, but its persistent storage sits on local SSDs. Remote storage, even within the same data center, is not used because of high latency.

The third class of databases is the Booking.com search service, where every node of the service is a database in itself. Attempts to move the search service to Kubernetes failed, because each node holds 60-80 GB of local storage, which is hard to “bring up” and “warm up”.

As a result, the search engine was not moved to Kubernetes, and Ivan does not expect new attempts in the near future. The MySQL database was moved halfway: only the slaves, which are not scary to “shoot off”. Cassandra has “settled in” perfectly.

Choosing an infrastructure: a task with no general solution

Photo by Manuel Geissinger from Pexels

Suppose we have a new company, or a company where part of the infrastructure is built the old way. It is drawing up an infrastructure development plan for years ahead. How does it decide whether or not to build the infrastructure on containers and Kubernetes?

Companies that fight for nanoseconds are excluded from the discussion. Healthy conservatism pays off in terms of reliability, but nonetheless, there are companies that should consider new approaches.

Ivan: “I would certainly start a company in the cloud now, simply because it is faster,” although not necessarily cheaper. With the growth of venture capital, startups have no big problems with money, and their main task is to win the market.

Ivan holds that the maturity of the existing infrastructure is the selection criterion. If serious investments were made in the past and everything works, there is no point in redoing it. If the infrastructure is immature and there are problems with tooling, security and monitoring, then it makes sense to look at a distributed infrastructure.

The tax has to be paid either way, and Ivan would pay the one that lets him pay less in the future. “Simply because I am on a train that others are driving, I will travel much further than if I board another train that I have to fuel myself,” says Ivan. When the company is new and latency requirements are in the tens of milliseconds, Ivan would look towards the “operators” in which classic databases are “wrapped” today. They bring up a replication chain that switches over by itself in the event of a failure, and so on.
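The automatic switchover such operators perform can be illustrated with a small sketch. It shows one common promotion policy, picking the healthy replica that has replayed the most of the primary's log; the `Replica` fields and the `choose_new_primary` function are hypothetical and do not reflect the API of any particular operator.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Replica:
    # Hypothetical fields, for illustration only.
    name: str
    healthy: bool
    applied_position: int  # how far this replica has replayed the primary's log


def choose_new_primary(replicas: list[Replica]) -> Optional[Replica]:
    """Pick the healthy replica with the most replayed data, or None if none is healthy."""
    candidates = [r for r in replicas if r.healthy]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.applied_position)


replicas = [
    Replica("db-1", healthy=True, applied_position=980),
    Replica("db-2", healthy=False, applied_position=1000),
    Replica("db-3", healthy=True, applied_position=995),
]
new_primary = choose_new_primary(replicas)
print(new_primary.name if new_primary else "no candidate")
```

Promoting the most up-to-date healthy replica keeps the amount of lost writes as small as asynchronous replication allows.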

For a small company with a couple of servers, Kubernetes makes no sense, says Andrey. But if the company plans to grow to a hundred servers or more, it needs automation and a resource management system. In 90% of cases the costs are justified, regardless of the level of load and resources. It makes sense for everyone, from startups to large companies with millions of users, to gradually look towards container orchestration products. “Yes, this really is the future,” Andrey is sure.

Denis named two main criteria: scalability and resilience. He would choose the tools best suited to the task. “It could be a no-name box assembled on one's knee, with Nutanix Community Edition on it. It could be a second tier in the form of an application on Kubernetes with a replicated database on the backend that has defined RTO and RPO parameters” (recovery time objective and recovery point objective. - Ed.).
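To make the two parameters concrete: RPO bounds how much recent data may be lost, and RTO bounds how long the service may be down. The sketch below checks a replicated setup against both under a simplified model in which worst-case data loss equals replication lag and downtime equals detection time plus switchover time. All names and numbers are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class RecoveryPlan:
    # Illustrative model; real recovery plans have more moving parts.
    replication_lag_s: float  # how far the replica may trail the primary
    detection_s: float        # time to notice that the primary is gone
    failover_s: float         # time to promote the replica and redirect traffic


def meets_objectives(plan: RecoveryPlan, rpo_s: float, rto_s: float) -> bool:
    """Check worst-case data loss against RPO and worst-case downtime against RTO."""
    worst_data_loss_s = plan.replication_lag_s
    worst_downtime_s = plan.detection_s + plan.failover_s
    return worst_data_loss_s <= rpo_s and worst_downtime_s <= rto_s


plan = RecoveryPlan(replication_lag_s=5, detection_s=30, failover_s=60)
print(meets_objectives(plan, rpo_s=10, rto_s=120))  # both objectives satisfied
print(meets_objectives(plan, rpo_s=10, rto_s=60))   # downtime of 90 s exceeds the RTO
```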

Evgeny pointed to a possible staffing problem. At the moment there are not many top-class specialists on the market who understand the “guts”. And if the chosen technology is old, it is hard to hire anyone other than middle-aged people who are bored and tired of life. Other participants, however, believe this is a matter of training.

When the choice was put as a question, whether to run a small company in a public cloud with databases in Amazon RDS or on-premise with databases in Kubernetes, the participants chose Amazon RDS despite some of its shortcomings.

Since most of the meetup's listeners are not from the “bloody” enterprise, distributed solutions are what they should strive for. Data storage systems must be distributed and reliable, and must add latency measured in single-digit milliseconds, tens at most, Andrey concluded.

Assessing the feasibility of Kubernetes

A listener, Anton Zhbankov, put a trick question to the Kubernetes advocates: how did you make the choice, and how was the feasibility study carried out? Why Kubernetes and not, say, virtual machines?

Photo by Tatiana Eremina on Unsplash

Dmitry and Ivan answered. In both cases a sequence of decisions was made by trial and error, and it eventually led both participants to Kubernetes. Businesses are now starting to develop their own software, and that is the software it makes sense to move to Kubernetes. This is not about classic third-party systems such as 1C. Kubernetes helps when developers need to ship releases quickly, in a mode of non-stop continuous improvement.

Andrey's team tried to build a scalable cluster on virtual machines. Nodes fell like dominoes, which sometimes brought the whole cluster down. “In theory you can polish it and maintain it by hand, but it is tedious. And if there is a solution on the market that works out of the box, we are happy to go for it. So in the end we switched,” says Andrey.

There are standards for this kind of analysis and calculation, but no one can say how accurate they are for real hardware in operation. The calculations also require understanding every tool and the whole ecosystem, which is impossible in practice.

What awaits us

Photo by Drew Beamer on Unsplash

As a technology evolves, more and more disparate pieces appear, and then comes a phase transition: a vendor emerges that has burned through enough money to bring everything together in a single tool.

Do you think a time will come when there is a tool that does what Ubuntu did for the Linux world? Perhaps a single containerization and orchestration tool that includes Kubernetes. With it, building on-premise clouds would become easy.

Ivan answered: “Google is now building Anthos. It is their packaged offering that deploys a cloud and includes Kubernetes, a service mesh and monitoring, all the plumbing needed for on-premise microservices. We are almost in the future.”

Denis also mentioned Nutanix and VMware, whose vRealize Suite product can handle this task without containerization.

Dmitry shared his opinion that reducing the “pain” and reducing the tax are two areas where improvements should be expected.

Summing up the discussion, we would highlight the following problems of modern infrastructure:

  • Three participants at once named the problem with stateful workloads.
  • Various security issues, including the fact that a Docker setup ends up with multiple versions of Python, application servers and components.
  • Overspending, which deserves a separate meetup.
  • The learning problem, since orchestration is a complex ecosystem.
  • A problem common to the whole industry: misuse of tools.

The rest of the conclusions are up to you. So far, the feeling is that it is not easy for the Docker + Kubernetes bundle to become the “central” part of a system. Operating systems, for example, are installed on the hardware first, which cannot be said of containers and orchestration. Perhaps in the future operating systems and containers will merge with cloud management software.

Photo by Gabriel Santos Photografia from Pexels

I would like to take this opportunity to say hello to my mother, and to remind you that we have a Facebook group, “Management and development of large IT projects”, and a channel, @feedmeto, with interesting publications from various tech blogs. There is also my channel, @rybakalexey, where I talk about development management in product companies.

Source: habr.com
