🥇Transition from monolith to microservices: history and practice

In this article, I will talk about how the project I work on has evolved from a large monolith into a set of microservices.

The project began its history quite a long time ago, in early 2000. The first versions were written in Visual Basic 6. Over time, it became clear that development in this language would be difficult to support in the future, since the IDE and the language itself are developing poorly. In the late 2000s, it was decided to switch to the more promising C#. The new version was written in parallel with the revision of the old one, gradually more and more code was in .NET. The backend in C# was originally focused on the service architecture, however, common libraries with logic were used during development, and services were launched in a single process. The result was an application that we called the “service monolith”.

One of the few advantages of such a bundle was the ability for services to call each other through an external API. There were clear prerequisites for the transition to a more correct service, and in the future, microservice architecture.

We started our decomposition work around 2015. We have not yet reached the ideal state - there are parts of a large project that can hardly be called monoliths, but they do not look like microservices either. However, progress is significant.
I will talk about it in the article.

Content

Architecture and problems of the existing solution
Expectations from microservices
Transition issues
How to move from monolith to microservices
Working with the database
- Separation of existing tables
- Processing department
Working with source code
Infrastructure issues

Architecture and problems of the existing solution

Initially, the architecture looked like this: UI - a separate application, the monolithic part was written in Visual Basic 6, the .NET application was a set of related services working with a fairly large database.

Disadvantages of the previous solution

Single point of failure
We had a single point of failure: the .NET application was running in one process. If any of the modules failed, the entire application failed and had to be restarted. Since we automate a large number of processes for different users, due to a failure in one of them, all of them could not work for some time. And with a software error, redundancy did not help either.

Improvement queue
This shortcoming is more organizational. There are many customers in our application, and they all want to finalize it as soon as possible. Previously, it was impossible to do this in parallel, and all customers stood in line. This process caused a negative for the business, because they needed to prove that their task was of value. And the development team spent time organizing this queue. It took a lot of time and effort, and as a result, the product could not change as quickly as we would like it to.

Suboptimal use of resources
When hosting services in a single process, we always completely copied the configuration from server to server. We wanted to separate the most loaded services separately so as not to waste resources, and to get more flexible control over our deployment scheme.

Difficult to implement modern technologies
A problem familiar to all developers: there is a desire to introduce modern technologies into the project, but there is no way. With a large monolithic solution, any update of the current library, not to mention the transition to a new one, turns into a rather non-trivial task. It takes a long time to prove to the team leader that this will bring more bonuses than wasted nerves.

Difficulty issuing changes
This was the most serious problem - we issued releases every two months.
Each release turned into a real disaster for the bank, despite the testing and efforts of the developers. The business understood that some of the functionality would not work for it at the beginning of the week. And the developers understood that a week of serious incidents awaited them.
Everyone had a desire to change the situation.

Expectations from microservices

Issuance of components when ready. Issuance of components as they are ready by decomposing the solution and separating different processes.

Small product teams. This is important because the large team working on the old monolith was difficult to manage. Such a team was forced to work according to a strict process, but they wanted more creativity and independence. Only small teams could afford it.

Isolation of services in separate processes. Ideally, I would like to isolate it in containers, but a large number of services written in the .NET Framework run only under WindowsServices based on .NET Core are now appearing, but there are still few of them.

Deployment flexibility. I would like to combine services the way we need it, and not the way the code forces.

Use of new technologies. This is of interest to any programmer.

Transition issues

Of course, if splitting a monolith into microservices was easy, there would be no need to talk about it at conferences and write articles. There are many pitfalls in this process, I will describe the main ones that hindered us.

First problem typical for most monoliths: business logic connectivity. When we write a monolith, we want to reuse our classes so as not to write extra code. And when moving to microservices, this becomes a problem: all the code is quite tightly coupled, and it is difficult to separate services.

At the time of the start of work, the repository had more than 500 projects and more than 700 thousand lines of code. This is a big enough solution. second problem. It was not possible to simply take and divide it into microservices.

Third problem — Lack of necessary infrastructure. In fact, we were engaged in manual copying of the source code to the servers.

How to move from monolith to microservices

Allocation of microservices

Firstly, we immediately determined for ourselves that the separation of microservices is an iterative process. We have always been required to develop business tasks in parallel. How we will implement this technically is already our problem. Therefore, we prepared for an iterative process. It will not work differently if you have a large application, and it is not initially ready to be rewritten.

What methods do we use to isolate microservices?

The first method - to take out existing modules as services. In this regard, we were lucky: there were already formalized services that worked according to the WCF protocol. They were separated into separate assemblies. We ported them separately, adding a small launcher to each build. It was written using the wonderful Topshelf library, which allows you to run the application both as a service and as a console. This is useful for debugging because no additional projects are required in the solution.

The services were connected by business logic, as they used common assemblies and worked with a common database. It was difficult to call them microservices in their purest form. However, we could issue these services separately, in different processes. This already made it possible to reduce their influence on each other, reducing the problem with parallel development and a single point of failure.

The host assembly is just one line of code in the Program class. We hid the work with Topshelf in an auxiliary class.

namespace RBA.Services.Accounts.Host
{
   internal class Program
   {
      private static void Main(string[] args)
      {
        HostRunner<Accounts>.Run("RBA.Services.Accounts.Host");

       }
    }
}

The second way to isolate microservices: create them to solve new problems. If at the same time the monolith does not grow, this is already excellent, which means that we are moving in the right direction. To solve new problems, we tried to create separate services. If there was such an opportunity, then we created more “canonical” services that completely manage their data model, a separate database.

We, like many others, started with authentication and authorization services. They are perfect for this. They are independent, as a rule, they have a separate data model. They themselves do not interact with the monolith, only it refers to them to solve some problems. On these services, you can start the transition to a new architecture, debug the infrastructure on them, try some approaches related to network libraries, etc. There are no teams in our organization that could not make an authentication service.

The third way to isolate microservices, which we use, is a little specific to us. This is the removal of business logic from the UI layer. Our main UI application is desktop, it, like the backend, is written in C#. The developers periodically made mistakes and took out parts of the logic on the UI that should have existed in the backend and reused.

If you look at a real example from the code of the UI part, you can see that most of this solution contains real business logic, which is useful in other processes, not only for building a UI form.

The real UI logic there is only the last couple of lines. We transferred it to the server so that it could be reused, thereby reducing the UI and achieving the correct architecture.

Fourth, most important way to isolate microservices, which allows you to reduce the monolith, is the removal of existing services with processing. When we take out existing modules as they are, developers do not always like the result, and the business process could become outdated since the creation of the functionality. Through refactoring, we can support a new business process because business requirements are constantly changing. We can improve the source code, remove known defects, create a better data model. There are many benefits to be gained.

Decoupling services with rework goes hand in hand with the concept of bounded context. This is a concept from domain-oriented design. It means a section of the domain model in which all the terms of a single language are uniquely defined. Consider the context of insurance and bills as an example. We have a monolithic application, and it is necessary to work with the account in insurance. We expect the developer to find the existing Account class in another assembly, make a reference to it from the Insurance class, and we will get working code. The DRY principle will be observed, the task will be done faster by using existing code.

As a result, it turns out that the contexts of accounts and insurances are connected. As new requirements come along, this relationship will interfere with development, adding complexity to an already complex business logic. To solve this problem, you need to find the boundaries between contexts in the code and remove their violations. For example, in the context of insurances, it is quite possible that the 20-digit number of the Central Bank account and the date of opening the account will be sufficient.

In order to separate these bounded contexts from each other and start the process of extracting microservices from a monolithic solution, we used an approach such as creating external APIs inside the application. If we knew that some module should become a microservice, somehow change within the process, then we immediately made calls to the logic that belongs to another limited context through external calls. For example, through REST or WCF.

We firmly decided for ourselves that we would not avoid code that would require distributed transactions. In our case, it turned out to be quite easy to follow this rule. So far, we have not encountered such situations when hard distributed transactions are really needed - the final consistency between the modules is quite enough.

Let's consider a specific example. We have the concept of an orchestrator - a pipeline that processes the essence of the "application". He creates a client, an account, and a bank card in turn. If the client and the account are created successfully, but the card creation fails, the application does not go into the "successful" status and remains in the "card not created" status. In the future, the background activity will pick it up and finish it. The system has been in a state of inconsistency for some time, but we are generally satisfied with this.

In the event that a situation nevertheless arises when it will be necessary to consistently save part of the data, we will most likely go for the enlargement of the service in order to process this in one process.

Consider an example of allocating a microservice. How can it be relatively safe to bring it to production? In this example, we have a separate part of the system - a payroll service module, one of the code sections of which we would like to make microservice.

First of all, we create a microservice by rewriting the code. We improve some points that did not suit us. We implement new business requirements from the customer. We add to the bundle between the UI and the API Gateway backend, which will provide call forwarding.

Next, we release this configuration into operation, but in a pilot state. Most of our users are still working with old business processes. For new users, we are developing a new version of the monolithic application that this process no longer contains. In fact, we have a bunch of monolith and microservice working in the form of a pilot.

With a successful pilot, we understand that the new configuration really works, we can remove the old monolith from the equation and leave the new configuration in place of the old solution.

In total, we use almost all existing methods for separating the source code of a monolith. All of them allow us to reduce the size of parts of the application and translate them to new libraries, making better source code.

Working with the database

The database lends itself to separation worse than the source code, since it contains not only the current schema, but also the accumulated historical data.

Our database, like many others, had another important drawback - its huge size. This database was designed according to the intricate business logic of a monolith, and relationships have accumulated between tables of various limited contexts.

In our case, on top of all the troubles (a large database, many relationships, sometimes incomprehensible boundaries between tables), there was a problem that occurs in many large projects: the use of the shared database pattern. Data was taken from tables through views, through replication and shipped to other systems where this replication is needed. As a result, we could not move the tables into a separate schema, because they were actively used.

In the division, the very division into limited contexts in the code helps us. It usually gives us a pretty good idea of how we break up data at the database level. We understand which tables belong to one bounded context and which to another.

We have applied two global ways to split the database: split existing tables and split with processing.

Separating existing tables is a good practice to use if the data structure is good, meets the business requirements, and everyone is happy with it. In this case, we can allocate existing tables into a separate schema.

A branch with processing is needed when the business model has changed a lot, and the tables no longer satisfy us at all.

Separation of existing tables. We need to determine what we will separate. Without this knowledge, nothing will work, and separation of limited contexts in the code will help us here. As a rule, if you can understand the boundaries of the contexts in the source code, it becomes clear which tables should be included in the list for separation.

Imagine that we have a solution where two monolith modules interact with the same database. We need to make sure that only one module interacts with the section of tables to be separated, and the other one starts interacting with it through the API. To begin with, it is enough that only recording is carried out through the API. This is a necessary condition so that we can talk about the independence of microservices. Read links can remain as long as there is no big problem.

As the next step, we can already separate the section of code that works with detachable tables, with or without processing, into a separate microservice and run in a separate process, a container. This will be a separate service with a connection to the monolith database and those tables that are not directly related to it. The monolith still interacts with the detachable part for reading.

Later we will remove this connection, that is, we will also transfer the reading of monolithic application data from detached tables to the API.

Next, we select tables from the common database that only the new microservice works with. We can move the tables to a separate schema or even to a separate physical database. There remains a connection for reading between the microservice and the monolith database, but there is nothing to worry about, in this configuration it can live for a long time.

The last step is to completely remove all links. In this case, we may need to migrate data from the main database. Sometimes we want to reuse some data or directories replicated from external systems in several databases. We have this from time to time.

Processing department. This method is very similar to the first one, only it goes in reverse. We immediately have a new database and a new microservice that interacts with the monolith through the API. But this leaves a set of database tables that we want to delete in the future. We will no longer need it, we have replaced it in the new model.

For this scheme to work, we will most likely need a transitional period.

Next, there are two possible approaches.

There's a: we duplicate all data in the new and old databases. In this case, we have data redundancy, there may be problems with synchronization. But then we can take two different clients. One will work with the new version, the other with the old one.

Second: separating data according to some business attribute. For example, we had 5 products in the system, which are stored in the old database. The sixth one, within the framework of the new business task, we place in a new database. But we need an API Gateway that synchronizes this data and shows the client where and what to take.

Both approaches work, choose depending on the situation.

After we make sure that everything works, the part of the monolith that works with the old database structures can be disabled.

The last step is to remove the old data structures.

Summing up, we can say that we have problems with the database: it is difficult to work with it compared to the source code, it is more difficult to separate, but it can and should be done. We have found some ways that allow you to do this quite safely, yet it is easier to make a mistake with the data than with the source code.

Working with source code

This is what the source code diagram looked like when we started to analyze the monolithic project.

It can be conditionally divided into three layers. This is a layer of run modules, plugins, services and individual activities. In fact, these were entry points within a monolithic solution. All of them were tightly fastened with the Common layer. It had business logic that was shared between services and a lot of relationships. Each service and plugin used up to 10 or more common assemblies, depending on their size and the conscience of the developers.

We were lucky, we had infrastructure libraries that could be used separately.

Sometimes a situation arose when some Common objects did not actually belong to this layer, but were infrastructure libraries. This was resolved by renaming.

Bounded contexts were the biggest concern. It happened that 3-4 contexts were mixed in one Common assembly and used each other within the same business functions. It was necessary to understand where it can be divided and along what boundaries, and what to do next with mapping this separation to source code assemblies.

We have formulated several rules for the code splitting process.

The first: We no longer wanted to share business logic between services, activities and plugins. We wanted to make business logic independent within microservices. On the other hand, microservices, ideally, are perceived as services that exist completely independently. I think that this approach is somewhat wasteful, and difficult to achieve, because, for example, C# services will be connected by the standard library anyway. Our system is written in C#, other technologies have not yet been used. Therefore, we decided that we could afford to use common technical builds. The main thing is that they do not contain any fragments of business logic. If you have a nice wrapper around the ORM you're using, it's very expensive to copy it from service to service.

Our team is a fan of domain-oriented design, so onion architecture was a great fit for us. The basis in our services was not the data access layer, but an assembly with domain logic, which contains only business logic and is devoid of links to the infrastructure. At the same time, we can independently refine the domain assembly to solve problems associated with frameworks.

At this stage, we met the first serious problem. The service had to refer to one domain assembly, we wanted to make the logic independent, and the DRY principle interfered with us here. The developers wanted to reuse classes from neighboring assemblies to avoid duplication, and as a result, the domains began to communicate with each other again. We analyzed the results and decided that perhaps the problem also lies in the area of \uXNUMXb\uXNUMXbthe source code repository device. We had a large repository that contained all the source codes. Solution for the whole project was very difficult to build on a local machine. Therefore, separate small solutions were created for parts of the project, and no one forbade adding some Common or domain assembly to them and reusing them. The only tool that did not allow us to do this was the review code. But sometimes he messed up too.

Then we started moving to a model with separate repositories. Business logic has ceased to flow from service to service, domains have indeed become independent. Bounded contexts are more explicitly supported. How do we reuse infrastructure libraries? We separated them into a separate repository, then put them into Nuget packages, which we put in Artifactory. With any change, the assembly and publication occurs automatically.

Our services began to refer to internal infrastructure packages in the same way as to external ones. We download external libraries from Nuget. To work with Artifactory, where we put these packages, we used two package managers. In small repositories, we also used Nuget. In repositories with multiple services, we have used a Paket which provides more version consistency between modules.

Thus, by working on the source code, slightly changing the architecture and separating the repositories, we make our services more independent.

Infrastructure issues

Most of the downsides to migrating to microservices have to do with infrastructure. You will need automated deployment, you will need new libraries to run the infrastructure.

Manual installation in environments

Initially, we installed the solution on environments manually. To automate this process, we have created a CI/CD pipeline. We chose the continuous delivery process, because continuous deployment is still unacceptable for us from the point of view of business processes. Therefore, sending into operation is carried out by a button, and for testing - automatically.

We use Atlassian, Bitbucket for source storage, and Bamboo for builds. We like to write build scripts in Cake because it's the same C#. Ready-made packages come to Artifactory, and Ansible automatically gets to the test servers, after which they can be tested immediately.

Separate logging

At one time, one of the ideas of the monolith was to provide shared logging. We also needed to figure out what to do with individual logs that are on the disks. Logs are written to text files. We decided to use the standard ELK stack. We didn’t write to ELK directly through providers, but decided that we would finalize the text logs and write the trace ID in them as an identifier, adding the service name so that these logs could then be parsed.

With Filebeat we are able to collect our logs from servers, then transform them, use Kibana to build queries in the UI, and see how the call was routed between services. Trace IDs are very helpful for this.

Testing and debugging related services

Initially, we did not fully understand how we could debug the services we were developing. Everything was simple with the monolith, we ran it on the local machine. At first, they tried to do the same with microservices, but sometimes, in order to fully launch one microservice, you need to launch several others, and this is inconvenient. We realized that it is necessary to move to the model when we leave only the service or services that we want to debug on the local machine. The remaining services are used from servers that match the configuration with prod. After debugging, during testing, for each task, only changed services are issued to the test server. Thus, the solution is tested in the form in which it will be on sale in the future.

There are servers on which only production versions of services are installed. These servers are needed for incidents, for pre-deployment delivery checks, and for internal training.

We have added an automated testing process using the popular Specflow library. Tests are run automatically by NUnit as soon as they are deployed from Ansible. If task coverage is fully automatic, then there is no need for manual testing. Although sometimes additional manual testing is still required. To determine which tests to run for a particular issue, we use tags in Jira.

Additionally, the need for load testing has grown, previously it was carried out only in rare cases. We use JMeter to run the tests, InfluxDB to store them, and Grafana to plot the process.

What have we achieved?

First, we got rid of the concept of "release". Two-month monstrous releases disappeared when this colossus was deployed in a production environment, breaking business processes for a while. Now we deploy services on average every 1,5 days, grouping them, because they go into operation after approval.

There are no fatal crashes in our system. If we release a microservice with a bug, then the functionality associated with it will be broken, and all other functionality will not be affected. This greatly improves the user experience.

We can control the deployment scheme. You can separate groups of services separately from the rest of the solution, if necessary.

In addition, we have significantly reduced the problem with a large queue of improvements. We have separate product teams that work with some of the services independently. This is where the Scrum process comes in handy. A specific team may have a separate product owner who assigns tasks to it.

Summary

Microservices are well suited for decomposing complex systems. In the process, we begin to understand what is in our system, what are the limited contexts, where are their boundaries. This allows you to properly distribute improvements to modules and prevent code obfuscation.
Microservices provide organizational benefits. They are often referred to only as architecture, but any architecture is needed to solve business needs, and not in itself. Therefore, we can say that microservices are well suited for solving problems in small teams, given that Scrum is very popular right now.
Separation is an iterative process. You can’t take an application and just split it into microservices. The resulting product is unlikely to be workable. When highlighting microservices, it is beneficial to rewrite the existing legacy, that is, turn it into code that we like and better meets the needs of the business in terms of functionality and speed.
Small caveat: The costs of migrating to microservices are quite significant. It took a long time to solve the problem of infrastructure. Therefore, if you have a small application that does not require specific scaling, if there are not a large number of customers who are fighting for the attention and time of your team, then perhaps microservices are not what you need today. It's quite expensive. If you start the process with microservices, then the costs at first will be more than if the same project starts with the development of a monolith.
PS A more emotional story (and as if to you personally) - by link.
Here is the full version of the report.

Source: habr.com