Microservices - Combinatorial Explosion of Versions

Hey Habr! I present my own translation of the article Microservices – Combinatorial Explosion of Versions.
As the IT world gradually moves towards microservices and tools like Kubernetes, one problem is becoming more and more noticeable: the combinatorial explosion of microservice versions. The IT community generally believes that the current situation is much better than the "dependency hell" of the previous generation of technologies. Nevertheless, versioning microservices is a very complex problem, and one piece of evidence for this is articles like "Give me back my monolith".

If you are reading this and do not yet understand the problem, let me explain. Say your product consists of 10 microservices, and each of them receives exactly one new version. Just one version each - I hope we can all agree that this is a trivial and insignificant event. Now, however, look again at our product: with just one new version of each component, we already have 2^10, or 1024, permutations of how the product could be put together.

If there is still confusion, let me break down the math. We have 10 microservices, each receiving one update, which gives us 2 possible versions of each microservice (either the old one or the new one). For each of the product's components we can use either of these two versions, which is mathematically the same as having a binary number with 10 digits. For example, if 1 denotes the new version and 0 the old one, then one possible permutation could be written as 1001000000, meaning the 1st and 4th components are updated and all the others are not. From mathematics we know that a 10-digit binary number can take 2^10, or 1024, values, which confirms the scale of the number we are dealing with.
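To make the counting concrete, here is a minimal Python sketch (my own illustration, not part of the original article) that enumerates all 1024 ways to assemble a 10-microservice product when each component exists in exactly two versions, old (0) and new (1):

```python
# Enumerate every way to assemble a product of 10 microservices when each
# component exists in exactly two versions: old (0) and new (1).
from itertools import product

NUM_SERVICES = 10

assemblies = list(product((0, 1), repeat=NUM_SERVICES))

# One assembly corresponds to one binary string,
# e.g. (1, 0, 0, 1, 0, 0, 0, 0, 0, 0) -> "1001000000".
example = assemblies[-1]
print("".join(map(str, example)))   # "1111111111": every component updated
print(len(assemblies))              # 1024 == 2 ** 10
```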

Let's continue the reasoning: what happens if we have 100 microservices and each has 10 possible versions? The situation becomes quite unpleasant - we now have 10^100 permutations, which is an enormous number. I prefer to state the situation this way, because now we are not hiding behind words like "Kubernetes" but are facing the problem as it is.
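Just to show how hopeless brute force becomes at this scale, here is a quick back-of-the-envelope calculation (my own, with a deliberately optimistic assumption of one second per integration test):

```python
# 100 microservices, 10 candidate versions each.
SERVICES = 100
VERSIONS = 10

permutations = VERSIONS ** SERVICES              # 10 ** 100 possible assemblies
seconds_per_test = 1                             # wildly optimistic assumption
years = permutations * seconds_per_test / (60 * 60 * 24 * 365)
print(f"{permutations:.2e} assemblies, ~{years:.2e} years to test them all")
```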

Why does this problem fascinate me so much? Partly because, having worked earlier in the world of NLP and AI, I spent a lot of time discussing the problem of combinatorial explosion about 5-6 years ago. Only instead of versions we had individual words, and instead of products we had sentences and paragraphs. Although the problems of NLP and AI remain largely unresolved, it must be admitted that significant progress has been made over the past few years. (In my opinion, progress could be even greater if people in the industry paid a little less attention to machine learning and a little more to other techniques - but that is off-topic here.)

Back to the world of DevOps and microservices. We have a huge problem in front of us, unnoticed like the elephant in the Kunstkamera, because what I often hear is "just take Kubernetes and Helm, and everything will be fine!" But no, everything will not be fine if things are left as they are. Moreover, an analytical solution to this problem is not feasible given its complexity. As in NLP, we should first approach this problem by narrowing the scope of the search - in this case, by eliminating obsolete permutations.
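As a rough illustration of what "narrowing the scope of the search" could look like in practice (the component names, version histories, and the known-bad pair below are all hypothetical), one could keep only the most recent versions of each component and drop combinations that are already known to be incompatible:

```python
from itertools import product

# Hypothetical version history per component, oldest to newest.
versions = {
    "auth":    ["1.0", "1.1", "1.2"],
    "billing": ["2.3", "2.4"],
    "gateway": ["0.9", "1.0", "1.1"],
}

# Hypothetical knowledge base of version pairs already known to be incompatible.
known_bad = [{("auth", "1.2"), ("gateway", "1.0")}]

KEEP_LATEST = 2  # only consider the last N versions of each component

candidates = product(*(
    [(name, v) for v in history[-KEEP_LATEST:]]
    for name, history in versions.items()
))

viable = [
    combo for combo in candidates
    if not any(bad <= set(combo) for bad in known_bad)
]

full_space = 1
for history in versions.values():
    full_space *= len(history)

print(len(viable), "candidate assemblies instead of", full_space)
```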

One thing that might help is something I wrote about last year: the need to maintain a minimal spread between the versions deployed to clients. It is also important to note that a well-designed CI/CD process helps a lot in reducing variation. However, the current state of CI/CD is not good enough to solve the permutation problem without additional tooling for accounting and tracking components.
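For what it's worth, a "minimum spread" check could be as simple as the following sketch (the data format and the minor-version heuristic here are my own assumptions, not the article's):

```python
# Flag components whose versions, across all client deployments, diverge by
# more than the allowed number of minor versions.
MAX_SPREAD = 1  # maximum allowed distance in minor versions

def version_spread(deployments):
    """deployments maps client -> component -> 'major.minor' version string."""
    spread = {}
    components = {c for per_client in deployments.values() for c in per_client}
    for component in components:
        minors = [
            int(per_client[component].split(".")[1])
            for per_client in deployments.values()
            if component in per_client
        ]
        spread[component] = max(minors) - min(minors)
    return spread

deployments = {
    "client_a": {"auth": "1.2", "billing": "2.4"},
    "client_b": {"auth": "1.0", "billing": "2.4"},
}

for component, s in version_spread(deployments).items():
    if s > MAX_SPREAD:
        print(f"{component}: spread of {s} exceeds the allowed {MAX_SPREAD}")
```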

What we need is a system of experiments at the integration stage, where we can determine a risk factor for each component, and an automated process that updates the various components and tests them without operator intervention, to see what works and what does not.

Such a system of experiments might look like this:

  1. Developers write tests (this is a critical stage, because otherwise we have no evaluation criterion; it is like labeling data in machine learning).
  2. Each component (project) gets its own CI system. This process is well established today, and the question of building a CI system for a single component has largely been solved.
  3. "Smart Integration System" collects the results of various CI systems and assembles the projects-components into the final product, starts testing and finally calculates the shortest path to obtaining the desired functionality of the product based on the existing components and risk factors. If an update is not possible, this system notifies developers about the available components and on which of them the error occurs. Once again, the test system is of critical importance here - since the integration system uses tests as an evaluation criterion.
  4. A CD system then receives the data from the "Smart Integration System" and performs the update itself. This closes the cycle.
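To make the idea a bit more tangible, here is a minimal sketch of how such a "Smart Integration System" could choose the next assembly to try. Everything here is hypothetical: the component names, the risk scores, and the run_integration_tests stub stand in for a real test environment; this illustrates the idea rather than any existing tool.

```python
from itertools import product

# Candidate versions per component with a per-version risk factor
# (e.g. derived from the size of the change or from past failures).
components = {
    "auth":    {"1.1": 0.1, "1.2": 0.4},
    "billing": {"2.3": 0.0, "2.4": 0.2},
    "gateway": {"1.0": 0.1, "1.1": 0.3},
}

def run_integration_tests(assembly):
    """Placeholder: deploy the assembly and run the developer-written tests
    from step 1. Returns True if the assembly passes."""
    return True  # stub

def choose_assembly():
    # Enumerate assemblies from lowest to highest total risk and return the
    # first one that passes the tests, together with the failed attempts.
    candidates = product(*(
        [(name, version) for version in history]
        for name, history in components.items()
    ))
    ranked = sorted(
        candidates,
        key=lambda combo: sum(components[name][version] for name, version in combo),
    )
    failures = []
    for combo in ranked:
        assembly = dict(combo)
        if run_integration_tests(assembly):
            return assembly, failures
        failures.append(assembly)
    return None, failures  # nothing passed: report every failed assembly

assembly, failed = choose_assembly()
print("selected assembly:", assembly)
print("failed attempts reported to developers:", failed)
```

A real system would of course also prefer newer versions, not just lower risk, but the core idea stays the same: rank the permutations and let the tests act as the oracle.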

To sum up, for me one of the biggest problems right now is the lack of such a "Smart Integration System" that would bind the various components into a product and let you track how the product as a whole is put together. I would be interested in the community's thoughts on this (spoiler: I am currently working on Reliza, a project that could become such a smart integration system).

One last thing I want to mention: for me, a monolith is not acceptable for any project, even a medium-sized one. I am very skeptical of attempts to speed up development and improve quality by going back to a monolith. Firstly, the monolith has a similar component-management problem among the various libraries it consists of; it is just less visible there and manifests itself primarily in the time developers spend. The consequence of the monolith problem is the practical inability to make changes to the code and extremely slow development speed.

Microservices improve the situation, but then the microservice architecture runs into the problem of combinatorial explosion at the integration stage. Yes, in essence we have moved the same problem from the development stage to the integration stage. However, in my opinion, the microservices approach still leads to better results, and teams achieve results faster (probably mainly due to the reduction in the size of the development unit, or batch size). Still, the move from monolith to microservices has not improved the process enough yet: the combinatorial explosion of microservice versions is a huge problem, and we have a lot of potential to improve as we address it.

Source: habr.com
