History of the Dodo IS Architecture: An Early Monolith

Or every unhappy company with a monolith is unhappy in its own way.

The development of Dodo IS, like the Dodo Pizza business itself, began in 2011. It was built around the idea of complete, end-to-end digitization of business processes, implemented with our own in-house development, which even back in 2011 raised a lot of questions and skepticism. But for 9 years now we have been following this path, with our own development that started as a monolith.

This article is an "answer" to the question "Why rewrite the architecture and make such large-scale and long-term changes?" raised by the previous article, "History of the Dodo IS Architecture: The Way of the Back Office". I'll start with how the development of Dodo IS began, what the original architecture looked like, how new modules appeared, and which problems forced us to make large-scale changes.


The series of articles "What is Dodo IS?" covers:

  1. The early monolith in Dodo IS (2011-2015). (you are here)

  2. The back office path: separate databases and a bus.

  3. The client side path: a facade over the database (2016-2017). (In progress...)

  4. The history of real microservices (2018-2019). (In progress...)

  5. Finishing the split of the monolith and stabilizing the architecture. (In progress...)

Initial architecture

In 2011, the Dodo IS architecture looked like this:

[Diagram: the Dodo IS architecture in 2011]

The first module in the architecture was order acceptance. The business process was:

  • the client calls the pizzeria;

  • the manager picks up the phone;

  • takes the order over the phone;

  • and enters it in parallel into the order acceptance interface: customer information, order details, and the delivery address.

The interface of the information system looked something like this ...

First version from October 2011:

Slightly improved in January 2012


Resources for developing the first order-taking module were limited. We had to do a lot, quickly, and with a small team. A small team meant 2 developers, who laid the foundation for the entire future system.

Their first decision determined the fate of the technology stack:

  • Backend on ASP.NET MVC, in C#. The developers were .NET people; this stack was familiar and comfortable for them.

  • Frontend on Bootstrap and jQuery: user interfaces built on hand-written styles and scripts.

  • MySQL database: no license costs, easy to use.

  • Servers on Windows Server, because at the time .NET could only run under Windows (we will not discuss Mono).

Physically, all of this ran on a single dedicated server at a hosting provider.

Order Intake Application Architecture

By then everyone was already talking about microservices, and SOA had been used in large projects for about 5 years (WCF, for example, was released in 2006). But we chose a reliable and proven solution.

Here it is.

[Diagram: architecture of the order intake application]

ASP.NET MVC means Razor, which renders an HTML page on the server in response to a request from a form or from the client. On the client side, CSS and JS display the information and, when necessary, make AJAX requests through jQuery.

On the server, requests land in *Controller classes, whose methods handle the processing and generate the final HTML page. Controllers call a logic layer of *Services classes. Each service corresponded to some aspect of the business:

  • For example, DepartmentStructureService gave out information on pizzerias and departments. A department is a group of pizzerias run by a single franchisee.

  • ReceivingOrdersService accepted and calculated the composition of the order.

  • And SmsService sent SMS messages by calling the APIs of SMS providers.

Services processed data from the database and held business logic. Each service had one or more *Repositories with a corresponding name. These contained queries to stored procedures in the database and a mapping layer. There was business logic in the stored procedures too, especially a lot of it in those that produced reporting data. No ORM was used; everyone relied on hand-written SQL.
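As a sketch only, this layering might look roughly like the following in classic ASP.NET MVC with the MySQL client and hand-written SQL. The service name follows the article, but the controller, repository, model class, signatures, and SQL are assumptions for illustration, not the actual Dodo IS code.

```csharp
using System.Collections.Generic;
using System.Web.Mvc;
using MySql.Data.MySqlClient;

// Hypothetical sketch of the Controller -> Service -> Repository chain described above.
public class DepartmentController : Controller
{
    private readonly DepartmentStructureService _departments = new DepartmentStructureService();

    public ActionResult Pizzerias(int departmentId)
    {
        // Controller: take the request, ask the service, render a Razor view into HTML.
        var pizzerias = _departments.GetPizzerias(departmentId);
        return View(pizzerias);
    }
}

public class DepartmentStructureService
{
    private readonly DepartmentRepository _repository = new DepartmentRepository();

    // Service: the business-level operation ("give me the pizzerias of a department").
    public IList<Pizzeria> GetPizzerias(int departmentId)
    {
        return _repository.GetPizzerias(departmentId);
    }
}

public class DepartmentRepository
{
    private const string ConnectionString = "..."; // placeholder

    // Repository: hand-written SQL (or a CALL to a stored procedure) plus row-to-object mapping.
    public IList<Pizzeria> GetPizzerias(int departmentId)
    {
        var result = new List<Pizzeria>();
        using (var connection = new MySqlConnection(ConnectionString))
        using (var command = new MySqlCommand(
            "SELECT Id, Name FROM pizzeria WHERE DepartmentId = @departmentId", connection))
        {
            command.Parameters.AddWithValue("@departmentId", departmentId);
            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    result.Add(new Pizzeria { Id = reader.GetInt32(0), Name = reader.GetString(1) });
                }
            }
        }
        return result;
    }
}

public class Pizzeria
{
    public int Id { get; set; }
    public string Name { get; set; }
}
```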

There was also a layer of domain model classes and common helpers, for example, an Order class representing an order. In the same layer there was a helper for formatting display text according to the selected currency.

All this can be represented by such a model:

[Diagram: the layered model of the application]

The Path of an Order

Let's look at a simplified initial flow for creating an order.

[Diagram: the simplified order creation flow]

Initially, the site was static. It showed prices, and at the top there was a phone number and the message "If you want pizza, call the number and order." To take orders, we needed to implement a simple flow:

  • The client visits a static site with prices, selects products and calls the number listed on the site.

  • The customer names the products they want to add to the order.

  • Gives his address and name.

  • The operator accepts the order.

  • The order is displayed in the accepted orders interface.

It all starts with displaying the menu. A logged-in operator accepts only one order at a time, so the draft cart can be stored in their session (the user's session is kept in memory). There is a Cart object that contains the products and the customer's information.

The customer names a product, the operator clicks "+" next to it, and a request goes to the server. Information about the product is pulled from the database and added to the cart.

[Diagram: adding a product to the cart]

Note: yes, we could avoid pulling the product from the database here and pass it from the frontend instead. But for clarity, I showed the path through the database.
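A hedged sketch of that "+" click: the controller, repository, and class names below are assumptions for illustration; the point is that the draft cart lives in the operator's session while the product details come from the database.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Web.Mvc;

public class CartController : Controller
{
    private readonly ProductRepository _products = new ProductRepository();

    [HttpPost]
    public ActionResult AddProduct(int productId)
    {
        // One operator handles one order at a time, so the draft cart
        // can simply be kept in the user's session (in-process memory at this stage).
        var cart = Session["Cart"] as Cart ?? new Cart();

        var product = _products.GetById(productId); // pulled from the database
        cart.AddItem(product);

        Session["Cart"] = cart;
        return PartialView("_Cart", cart);          // re-render the cart fragment
    }
}

public class Cart
{
    public List<CartItem> Items { get; } = new List<CartItem>();
    public string ClientName { get; set; }
    public string ClientPhone { get; set; }
    public string DeliveryAddress { get; set; }

    public void AddItem(Product product)
    {
        var line = Items.FirstOrDefault(i => i.ProductId == product.Id);
        if (line == null)
            Items.Add(new CartItem { ProductId = product.Id, Name = product.Name, Price = product.Price, Count = 1 });
        else
            line.Count++;
    }
}

public class CartItem
{
    public int ProductId { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
    public int Count { get; set; }
}

public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
}

public class ProductRepository
{
    public Product GetById(int productId)
    {
        // In the real system: hand-written SQL against the product tables, as in the repository sketch above.
        return new Product { Id = productId, Name = "...", Price = 0m };
    }
}
```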

Next, the operator enters the client's address and name.

[Screenshot: entering the client's address and name]

When you click "Create Order" (a sketch follows this list):

  • The request is sent to OrderController.SaveOrder().

  • We get the Cart from the session; it holds the products in the required quantities.

  • We supplement the Cart with the client's information and pass it to the AddOrder method of the ReceivingOrderService class, which saves it to the database.

  • The database has tables for the order, the order composition, and the client, and they are all linked.

  • The order display interface pulls the latest orders and shows them.
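A hypothetical sketch of that SaveOrder flow; it reuses the Cart class from the previous sketch, and ReceivingOrderService stands in for the real service that writes to the database. Parameter names and the redirect target are assumptions.

```csharp
using System.Web.Mvc;

public class OrderController : Controller
{
    private readonly ReceivingOrderService _receivingOrderService = new ReceivingOrderService();

    [HttpPost]
    public ActionResult SaveOrder(string clientName, string phone, string address)
    {
        // 1. The draft cart with its products is taken from the session.
        var cart = (Cart)Session["Cart"];

        // 2. It is supplemented with the client's data from the form.
        cart.ClientName = clientName;
        cart.ClientPhone = phone;
        cart.DeliveryAddress = address;

        // 3. The service saves it: rows go into the linked order, order composition
        //    and client tables (hand-written SQL or stored procedure calls in its repository).
        _receivingOrderService.AddOrder(cart);

        // 4. The draft is cleared; the "accepted orders" screen will show the new order
        //    the next time it pulls the latest orders.
        Session.Remove("Cart");
        return RedirectToAction("AcceptedOrders");
    }
}

public class ReceivingOrderService
{
    public void AddOrder(Cart cart)
    {
        // Persist the order, its composition, and the client via the repository layer.
    }
}
```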

New modules

Taking orders was important and necessary: you can't run a pizza business without a way to sell an order. So the system kept acquiring functionality, roughly from 2012 to 2015. During this time, many different blocks of the system appeared, which I will call modules, as opposed to the concepts of service or product.

A module is a set of functions that are united by some common business goal. At the same time, they are physically in the same application.

Modules can be thought of as blocks of the system. For example: a reporting module, admin interfaces, the food tracker in the kitchen, authorization. These are all different user interfaces, some even with different visual styles, yet everything runs within one application, one running process.

Technically, the modules were implemented as Areas (the concept even survives in ASP.NET Core). They had their own frontend files, models, and controller classes. As a result, the system was transformed from this ...

[Diagram: the application before the split into modules]

...into this:

[Diagram: the application split into modules]
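As a sketch, a module packaged as a classic ASP.NET MVC Area registers its own routes and keeps its own controllers, views, and assets. The ShiftManager name is borrowed from the module list further down; the routing details are assumptions.

```csharp
using System.Web.Mvc;

public class ShiftManagerAreaRegistration : AreaRegistration
{
    public override string AreaName
    {
        get { return "ShiftManager"; }
    }

    public override void RegisterArea(AreaRegistrationContext context)
    {
        // Each Area gets its own URL prefix, controllers, views and front-end assets,
        // but all Areas still run inside the same application and the same process.
        context.MapRoute(
            "ShiftManager_default",
            "ShiftManager/{controller}/{action}/{id}",
            new { action = "Index", id = UrlParameter.Optional });
    }
}
```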

Some modules were implemented as separate sites (separate executable projects), because their functionality was completely distinct and partly because they were developed somewhat separately and with more focus. These were:

  • Site - the first version of the dodopizza.ru site.

  • Export - uploading reports from Dodo IS to 1C.

  • Our Team - the employee's personal account. It was developed separately and has its own entry point and separate design.

  • fs - a project for hosting static files. Later we moved away from it, moving all statics to the Akamai CDN.

The rest of the blocks were in the BackOffice application. 

[Diagram: the BackOffice application and its modules]

An explanation of the names:

  • Cashier - the restaurant cash register.

  • ShiftManager - interfaces for the "Shift Manager" role: operational statistics on pizzeria sales, the ability to put products on the stop list, and change orders.

  • OfficeManager - interfaces for the "Pizzeria Manager" and "Franchisee" roles. These collect the functions for configuring a pizzeria and its bonus promotions, for hiring and working with employees, and for reports.

  • PublicScreens - interfaces for the TVs and tablets hanging in pizzerias. The TVs display menus, advertising information, and order status for delivery.

They used a common service layer, a common Dodo.Core block of domain classes, and a common database. Sometimes they even linked to each other through UI transitions. The individual sites, such as dodopizza.ru or personal.dodopizza.ru, also called the shared services.

When new modules appeared, we tried to reuse the existing services, stored procedures, and database tables as much as possible.

To give a sense of the scale of the modules built into the system, here is a 2012 diagram with development plans:

[Diagram: the 2012 map of modules and development plans]

By 2015, everything on this map, and more, was in production.

  • Order acceptance grew into a separate Contact Center block, where orders are accepted by an operator.

  • There were public screens with menus and information hanging in pizzerias.

  • The kitchen got a module that automatically plays the voice message "New pizza" when a new order arrives and prints a receipt for the courier. This greatly simplifies kitchen processes and keeps employees from being distracted by a large number of simple operations.

  • Delivery became a separate Delivery Checkout block, where the order is handed to the courier who has previously clocked in for the shift. The courier's working time is tracked for payroll.

In parallel, from 2012 to 2015, more than 10 developers joined, 35 pizzerias opened, the system was deployed to Romania, and we prepared for the opening of outlets in the United States. The developers no longer worked on every task; they were divided into teams, each specializing in its own part of the system.

Problems

Problems appeared, partly because of the architecture, but not only because of it.

Chaos in the database

A single database is convenient. Consistency can be achieved in it using the tools built into relational databases. Working with it is familiar and comfortable, especially while there are few tables and little data.

But over 4 years of development, the database had grown to about 600 tables and 1500 stored procedures, many of which also contained logic. Alas, stored procedures bring little advantage when working with MySQL: they are not cached by the database, and storing logic in them complicates development and debugging. Code reuse is also difficult.

Many tables lacked suitable indexes; others, on the contrary, had so many indexes that inserts became slow. Creating an order meant modifying about 20 tables, and the transaction could take 3-5 seconds.

The data in the tables was not always in the most suitable form. In some places denormalization was needed. Some regularly retrieved data was stored in a column as an XML structure, which increased execution time, lengthened queries, and complicated development.

Very heterogeneous queries were made against the same tables. Popular tables suffered especially, such as the already-mentioned orders table or the pizzeria table. They were used to render operational interfaces in the kitchen and for analytics. The website (dodopizza.ru) hit them too, and at any moment a burst of requests could suddenly arrive.

The data was not aggregated, and many calculations happened on the fly in the database. This created unnecessary computation and extra load.

The code often went to the database when it could have avoided doing so. In some places bulk operations were missing; in others, a single query should have been split into several in code to make it faster and more reliable.

Coupling and tangle in the code

The modules that were supposed to be responsible for their own part of the business did not do so honestly. Some of them duplicated functions across roles. For example, a local marketer responsible for the network's marketing activity in their city had to use both the "Admin" interface (to create promotions) and the "Office Manager" interface (to view the impact of promotions on the business). Of course, under the hood both modules used the same service that handled bonus promotions.

Services (classes within one monolithic large project) could call each other to enrich their data.

The model classes that hold data were handled inconsistently in the code. In some places there were constructors through which the required fields had to be set; in other places this was done through public properties. Retrieving and transforming data from the database was, of course, just as varied.
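An illustrative contrast of the two coexisting styles (both classes are assumptions, not actual Dodo IS code):

```csharp
// Style 1: required fields are set through the constructor and cannot be changed afterwards.
public class Client
{
    public Client(string name, string phone)
    {
        Name = name;
        Phone = phone;
    }

    public string Name { get; private set; }
    public string Phone { get; private set; }
}

// Style 2: everything is a settable public property, filled in "by convention"
// by whoever maps the database rows or the request to the object.
public class DeliveryAddress
{
    public string City { get; set; }
    public string Street { get; set; }
    public string House { get; set; }
}
```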

The logic was either in the controllers or in the service classes. 

These seem to be minor issues, but they greatly slowed down development and reduced quality, leading to instability and bugs. 

The complexity of large-scale development

Difficulties arose in development itself. Different blocks of the system had to be built in parallel, and fitting the needs of each component into a single codebase became increasingly difficult. It was not easy to agree on everything and satisfy all the components at once. On top of this came limitations in technology, especially around the database and the frontend. We needed to move away from jQuery toward higher-level frameworks, especially for customer-facing services (the website).

In some parts of the system, databases better suited to the task could have been used. For example, later we had the case of moving the order basket from Redis to CosmosDB.

Teams and developers working on their own areas clearly wanted more autonomy for their services, both in development and in rollout: merge conflicts, release issues. For 5 developers this problem is insignificant, but with 10, and even more so with the planned growth, everything would get much worse. And ahead lay the development of the mobile application (it started in 2017, and in 2018 there was the big fall).

Different parts of the system required different levels of stability, but because of the system's tight coupling we could not provide it. An error while developing a new feature in the admin panel could easily break order acceptance on the website, because the code is shared and reused, and the database and data are the same too.

It would probably have been possible to avoid these mistakes and problems within such a monolithic-modular architecture: divide responsibility, refactor both the code and the database, clearly separate the layers from each other, monitor quality every day. But the chosen architectural solutions and the focus on rapidly expanding the system's functionality led to problems with stability.

How the "Power of Mind" blog took down the cash registers in restaurants

If the pizzeria network (and the load) had kept growing at the same pace, after a while the crashes would have been ones the system could not recover from. The following story illustrates well the problems we began to face by 2015.

The "Power of Mind" blog had a widget showing revenue data for the year across the entire network. The widget used the Dodo public API, which provides this data. This statistic is currently available at http://dodopizzastory.com/. The widget was shown on every page and made requests on a timer every 20 seconds. The request went to api.dodopizza.ru and asked for:

  • the number of pizzerias in the network;

  • total network revenue since the beginning of the year;

  • revenue for today.

The revenue statistics request went straight to the database, querying order data, aggregating it on the fly, and returning the total.

The cash registers in the restaurants hit the same orders table: they loaded the list of orders received that day, and new orders were added to it. The cash registers made their requests every 5 seconds or on page refresh.
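A hedged sketch of why this hurt: the API method below aggregates the hot orders table on the fly for every widget refresh, competing with the cash registers that read the same table every few seconds. The controller name and the stored procedure are assumptions for illustration.

```csharp
using System;
using System.Data;
using System.Web.Mvc;
using MySql.Data.MySqlClient;

public class RevenueApiController : Controller
{
    private const string ConnectionString = "..."; // placeholder

    public ActionResult NetworkRevenue()
    {
        using (var connection = new MySqlConnection(ConnectionString))
        using (var command = new MySqlCommand("sp_GetNetworkRevenueSinceYearStart", connection))
        {
            command.CommandType = CommandType.StoredProcedure;
            connection.Open();

            // Inside the procedure: effectively SELECT SUM(...) over all orders since
            // January 1st, with no cache and no pre-aggregated totals, on every call.
            var revenue = Convert.ToDecimal(command.ExecuteScalar());
            return Json(new { revenue }, JsonRequestBehavior.AllowGet);
        }
    }
}
```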

The diagram looked like this:

[Diagram: the blog widget and the restaurant cash registers querying the same orders table]

One autumn, Fyodor Ovchinnikov wrote a long and popular article on his blog. A lot of people came to the blog and started reading everything carefully. And while each visitor was reading the article, the revenue widget dutifully polled the API every 20 seconds.

The API called a stored procedure that calculated the total of all orders since the beginning of the year across all pizzerias in the network. The aggregation ran over the orders table, which is extremely popular: all the cash registers of every restaurant open at that moment use it. The cash registers stopped responding and orders were not accepted. They were also not accepted from the website, did not appear on the tracker, and the shift manager could not see them in their interface.

This is not the only such story. By the autumn of 2015, the load on the system was critical every Friday. Several times we turned off the public API, and once we even had to turn off the website, because nothing else helped. There was even a list of services with the order in which to shut them down under heavy load.

From this point began our struggle with load and for the stabilization of the system, which lasted from autumn 2015 to autumn 2018. That is when the "great fall" happened. Failures still occurred occasionally afterwards, some of them very painful, but the general period of instability can now be considered behind us.

Rapid business growth

Why couldn't we have done it right from the start? Just look at the following charts.

[Charts: growth of the pizzeria network]

In 2014-2015 we also opened in Romania and were preparing to open in the USA.

The network grew very quickly: new countries opened, new pizzeria formats appeared, for example, a pizzeria opened in a food court. All this demanded significant attention specifically to expanding Dodo IS functionality. Without all these functions, without tracking in the kitchen, without accounting for products and losses in the system, without displaying order pickup in the food court hall, we would hardly be talking now about the "correct" architecture and the "correct" approach to development.

Another obstacle to a timely revision of the architecture, and to paying attention to technical problems in general, was the crisis of 2014. Things like that hit a team's opportunities to grow hard, especially for a young business like Dodo Pizza.

Quick Solutions That Helped

The problems needed solutions. Roughly, the solutions can be divided into 2 groups:

  • Quick ones that put out the fire, give a small margin of safety, and buy us time for change.

  • Systemic and therefore long-term ones: reengineering a number of modules and splitting the monolithic architecture into separate services (most of them not micro at all, but rather macroservices; Andrey Morevskiy's report has more on this).

The dry list of quick changes is as follows:

Scaling up the database master

Of course, the first thing you do to deal with load is to increase the capacity of the server. This was done for the master database and for the web servers. Alas, this is possible only up to a certain limit, after which it becomes too expensive.

We had been on Azure since 2014; we wrote about it at the time in the article "How Dodo Pizza delivers pizza using the Microsoft Azure cloud". But after a series of upgrades of the database server, we ran up against its cost.

Read replicas of the database

We created two replicas of the database:

A ReadReplica for reference queries. It is used for reading directories: cities, streets, pizzerias, products (a slowly changing domain), and in interfaces where a small delay is acceptable. There were 2 of these replicas, and we ensured their availability in the same way as for the master.

A ReadReplica for report queries. This database had lower availability, but all reports went to it. However heavy their queries and however huge the data recalculations, they no longer affected the main database and the operational interfaces.
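A minimal sketch of routing reads to the replicas described above while writes keep going to the master; the connection string names and the factory itself are assumptions for illustration.

```csharp
using System.Configuration;
using MySql.Data.MySqlClient;

public enum DatabasePurpose
{
    Master,            // writes and reads that must be strictly up to date
    ReferenceReplica,  // slowly changing directories: cities, streets, pizzerias, products
    ReportingReplica   // heavy report queries that must not touch the master
}

public static class ConnectionFactory
{
    public static MySqlConnection Create(DatabasePurpose purpose)
    {
        string name;
        switch (purpose)
        {
            case DatabasePurpose.ReferenceReplica:
                name = "ReadReplica";
                break;
            case DatabasePurpose.ReportingReplica:
                name = "ReportReplica";
                break;
            default:
                name = "Master";
                break;
        }
        return new MySqlConnection(ConfigurationManager.ConnectionStrings[name].ConnectionString);
    }
}

// Usage: directories tolerate a small replication lag, reports go to their own replica.
// using (var connection = ConnectionFactory.Create(DatabasePurpose.ReferenceReplica)) { ... }
```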

Caches in code

There were no caches anywhere in the code (at all). This led to extra, not always necessary, requests to the already loaded database. The first caches were both in-memory and in an external cache service, which was Redis. Everything was invalidated by time, and the settings were specified in the code.
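A minimal cache-aside sketch with purely time-based invalidation, as described above; the helper and key names are assumptions, and the same pattern applies whether the values sit in memory or in an external Redis cache.

```csharp
using System;
using System.Runtime.Caching;

public static class TimedCache
{
    private static readonly MemoryCache Cache = MemoryCache.Default;

    public static T GetOrAdd<T>(string key, TimeSpan ttl, Func<T> load) where T : class
    {
        var cached = Cache.Get(key) as T;
        if (cached != null)
            return cached;

        var value = load();                                     // cache miss: go to the database once
        Cache.Set(key, value, DateTimeOffset.UtcNow.Add(ttl));  // the TTL is hard-coded by the caller
        return value;
    }
}

// Usage (illustrative): a reference directory cached for 10 minutes.
// var products = TimedCache.GetOrAdd("products:all", TimeSpan.FromMinutes(10),
//     () => productRepository.GetAllProducts());
```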

Multiple backend servers

The application backend also needed to scale to handle the increased load. The single IIS server had to become a cluster. We moved the application session from memory to Redis, which made it possible to put several servers behind a simple round-robin load balancer. At first the same Redis was used as for the caches; later it was split into several instances.
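For classic ASP.NET, moving the session out of process means plugging in a custom session-state provider. Below is a hedged configuration sketch using Microsoft's Redis provider; the host, names, and keys are placeholders, and this is not necessarily the exact provider or settings used at the time.

```xml
<!-- web.config: store ASP.NET session state in Redis instead of in-process memory,
     so that any backend server behind the load balancer can serve the next request. -->
<system.web>
  <sessionState mode="Custom" customProvider="RedisSessionStateProvider">
    <providers>
      <add name="RedisSessionStateProvider"
           type="Microsoft.Web.Redis.RedisSessionStateProvider"
           host="redis.internal.example"
           port="6379"
           accessKey=""
           ssl="false" />
    </providers>
  </sessionState>
</system.web>
```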

As a result, the architecture became more complicated ...

[Diagram: the architecture with read replicas, caches, and multiple backend servers]

… but some of the tension was removed.

After that, the most heavily loaded components had to be reworked, which is what we undertook. We will talk about this in the next part.

Source: habr.com
