In-memory architecture for web services: basic technology and principles

In-Memory is a set of concepts for storing data in the application's RAM, with the disk used for backup. In classical approaches, data is stored on disk and memory serves as a cache. For example, a web application whose backend processes data requests it from storage: it receives the data and transforms it, and in the process a lot of data is transferred over the network. With In-Memory, computations are sent to the data, that is, to the storage, where they are executed, so the network is less loaded.
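The difference between pulling data to the application and sending the computation to the data can be sketched roughly as follows. The `StorageNode` class and its methods are hypothetical stand-ins for a storage server and do not belong to any real product's API.

```python
class StorageNode:
    """Hypothetical storage node that keeps its rows in RAM."""

    def __init__(self, rows):
        self._rows = list(rows)

    def fetch_all(self):
        # Classical approach: every row is shipped to the caller.
        return list(self._rows)

    def execute(self, task):
        # In-Memory approach: the task runs next to the data,
        # and only its result is returned.
        return task(self._rows)


# One row per day of the year (made-up amounts, for illustration only).
node = StorageNode({"day": d, "amount": 100 * d} for d in range(1, 366))

# Classical: 365 rows cross the "network", then the application aggregates.
rows = node.fetch_all()
total_classic = sum(r["amount"] for r in rows)

# Collocated: a single number crosses the "network".
total_collocated = node.execute(lambda data: sum(r["amount"] for r in data))
```

Both variants produce the same total; what changes is how much data has to travel to the application.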

Due to its architecture, In-Memory provides data access that is many times, and sometimes orders of magnitude, faster. For example, bank analysts want an analytical application to show a day-by-day report on loans issued over the past year. On a classic DBMS this takes minutes; with In-Memory the report appears almost immediately. This is because the approach lets you cache much more information, and it is kept in RAM, "at hand". The application does not need to request data from the hard disk, access to which is limited by the speed of the network and the disk.

Vladimir Pligin, an engineer at GridGain, explains what else In-Memory makes possible and what kind of approach it is. This overview will be useful for web application backend developers who have not worked with In-Memory and want to try it out, or who are interested in current trends in software development and architecture design.

Note. The article is based on the transcript of Vladimir's talk at #GetIT Conf. Before self-isolation was introduced, we regularly held meetups and conferences for developers in Moscow and St. Petersburg: we discussed trends, current development issues, problems and their solutions. We cannot hold a conference right now, but it is a good time to share useful materials from past ones.

Who uses In-Memory and how

In-Memory is used most often where fast user interaction or processing of large amounts of data is required.

  • Banks use In-Memory, for example, to reduce delays when customers use applications or to analyze a customer before issuing a loan.
  • Fintech uses In-Memory to improve the performance of services and applications for banks that outsource data processing and analysis. 
  • Insurance companies: to calculate risks, for example by analyzing customer data over several years.
  • Logistics companies. They process a lot of data, for example, in order to calculate the optimal routes for freight and passenger traffic with thousands of parameters, and track the status of shipments.
  • Retail. In-Memory solutions help to serve customers faster and process large amounts of information: shipments, invoices, transactions, availability of thousands of goods in warehouses, and prepare analytical reports.
  • In IoT, In-Memory replaces traditional databases.
  • Pharmaceutical companies use In-Memory, for example, to enumerate combinations of drug formulations. 

I'll share some examples of how our clients use In-Memory solutions and how you can implement them in your own systems.

In-Memory as primary storage

One of our clients is a major US supplier of medical and scientific equipment. They use an In-Memory solution as their main data storage. All data is stored on disk, and the subset of data that is actively used is kept in RAM. Storage access methods are standard: JDBC (Java Database Connectivity) and the SQL query language.


Collectively, this is called an In-Memory Database (IMDB) or Memory-Centric Storage. This class of solutions has many names; these are only two of them.

IMDB Features:

  • Data stored in In-Memory and accessed via SQL is the same data that other access methods see. It stays synchronized; only the representation and the way it is accessed differ. Transactions work across the data.

  • IMDB is faster than relational databases because getting information from RAM is faster than from disk. 
  • Internal optimization algorithms are simpler and execute fewer instructions.
  • IMDBs are suitable for managing data, events, and transactions in applications.

IMDBs partially support ACID: atomicity, consistency, and isolation. But on their own they do not provide durability: when the power is turned off, all data disappears. To solve the problem, you can use snapshots (a "snapshot" of the database is analogous to a database backup on a hard disk) or record transactions in a log, so that data can be restored after a reboot.
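The snapshot and transaction log ideas can be combined in a small sketch. It assumes a single-process key-value store with JSON-serializable values; the `DurableStore` class and its file names are hypothetical and are not taken from any specific IMDB.

```python
import json
import os
import tempfile


class DurableStore:
    """In-memory dict made durable with a snapshot file plus an append-only log."""

    def __init__(self, directory):
        self.snapshot_path = os.path.join(directory, "snapshot.json")
        self.log_path = os.path.join(directory, "transactions.log")
        self.data = {}
        self._recover()

    def _recover(self):
        # 1. Load the last snapshot, if there is one.
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path) as f:
                self.data = json.load(f)
        # 2. Replay the writes logged after that snapshot.
        if os.path.exists(self.log_path):
            with open(self.log_path) as f:
                for line in f:
                    key, value = json.loads(line)
                    self.data[key] = value

    def put(self, key, value):
        # Log first, then change memory, so an acknowledged write survives a crash.
        with open(self.log_path, "a") as f:
            f.write(json.dumps([key, value]) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.data[key] = value

    def snapshot(self):
        # Write to a temporary file and rename it, so a crash cannot leave
        # a half-written snapshot; then start a fresh log.
        tmp_path = self.snapshot_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(self.data, f)
        os.replace(tmp_path, self.snapshot_path)
        open(self.log_path, "w").close()


workdir = tempfile.mkdtemp()
store = DurableStore(workdir)
store.put("a", 1)
store.snapshot()
store.put("b", 2)  # exists only in the log, not in the snapshot

# Simulate a restart: a new instance rebuilds its state from disk.
restored = DurableStore(workdir)
```

After the simulated restart, `restored.data` contains both keys: one comes from the snapshot and the other from replaying the log.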

To build fault-tolerant applications

Imagine the classic architecture of a fault-tolerant web application. It works like this: all requests are distributed by the web balancer between servers. This system is resilient because the servers duplicate each other and back each other up in case of incidents.


The balancer directs all requests from one session strictly to one server. This is a sticky-session mechanism: each session is tied to a server where it is locally stored and processed. 

What happens when one of the servers fails?


The service as a whole will not suffer because the architecture is redundant. But we will lose the sessions of the failed server, and with them the users who are tied to those sessions. For example, a client is placing an order and is suddenly logged out of their account. They will be unhappy when they log in again and find that everything has to be entered again.
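The session loss described above can be reproduced in a toy model. This is only a sketch: the `WebServer` class and the hash-based `route` function are hypothetical, and real balancers usually implement stickiness with cookies or similar mechanisms rather than with this hashing scheme.

```python
import hashlib


class WebServer:
    """Hypothetical web server that keeps sessions in its own local memory."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.sessions = {}


def route(session_id, servers):
    """Sticky routing: the same session id always maps to the same live server."""
    live = [s for s in servers if s.alive]
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    return live[int.from_bytes(digest[:8], "big") % len(live)]


servers = [WebServer("web-1"), WebServer("web-2"), WebServer("web-3")]

# The first request creates the session on whichever server the balancer picks.
first = route("session-42", servers)
first.sessions["session-42"] = {"cart": ["item-1"]}

# While that server is alive, later requests land on it again.
sticky = route("session-42", servers) is first

# The server fails; the balancer falls back to another one.
first.alive = False
fallback = route("session-42", servers)
cart_after_failover = fallback.sessions.get("session-42")
```

Here `cart_after_failover` is `None`: the fallback server has never seen the session, so the user has to start over.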

A web application is required to support a large number of users without "slowing down", so that they can work comfortably. But after a failure, with each subsequent request, the time to communicate with the session storage will increase. This increases the average latency for other users, and they do not want to wait any longer than they are used to.

This problem can be solved the way another client of ours, a large PaaS provider from the USA, did. It uses In-Memory to cluster web sessions: it stores them not locally but centrally, in an In-Memory cluster. The sessions are then available much faster, because they are already in RAM.


When a server goes down, the load balancer sends requests from the downed server to other servers, just like in the classical architecture. But there is an important difference: sessions are stored in the In-Memory cluster and the servers have access to the sessions of the downed server.
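A minimal sketch of this setup, assuming a hypothetical `SessionCluster` class that stands in for the In-Memory cluster (here it is just a dictionary in the same process, not a real distributed store):

```python
import copy


class SessionCluster:
    """Stand-in for an In-Memory cluster shared by all web servers."""

    def __init__(self):
        self._sessions = {}

    def load(self, session_id):
        # Return a copy, as a remote store would, so callers must save explicitly.
        return copy.deepcopy(self._sessions.get(session_id, {}))

    def save(self, session_id, session):
        self._sessions[session_id] = copy.deepcopy(session)


class WebServer:
    """Hypothetical stateless web server: no session data is kept locally."""

    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster

    def add_to_cart(self, session_id, item):
        session = self.cluster.load(session_id)
        session.setdefault("cart", []).append(item)
        self.cluster.save(session_id, session)
        return session


cluster = SessionCluster()
web1 = WebServer("web-1", cluster)
web2 = WebServer("web-2", cluster)

# The first request is served by web-1.
web1.add_to_cart("session-42", "item-1")

# web-1 goes down; the balancer sends the next request to web-2,
# which finds the same session in the shared cluster.
session = web2.add_to_cart("session-42", "item-2")
```

Because no server owns a session, any server can continue it, which is why sticky routing becomes optional in this design.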

This architecture increases the fault tolerance of the entire system. Moreover, it is possible to completely abandon the mechanism of sticky sessions.

Hybrid transactional-analytical processing (HTAP)

Usually transactional and analytical systems are kept separate. When they are separate, the main database takes the transactional load. For analytical processing, data is copied to a replica so that analytics does not interfere with transactional processing. But copying comes with a lag: asynchronous replication is impossible without one. If we replicate synchronously, it will slow down the main database as well, and we gain nothing.

In HTAP, things work differently: the same data store serves both the transactional load from applications and analytical queries that can take a long time to complete. When the data is in RAM, analytical queries are executed faster, and the server with the database is loaded less (on average).


The hybrid approach breaks the wall between transaction processing and analytics. If we perform analytics on the same storage, analytical queries run on data from RAM. Because they see current data rather than a lagging copy, their results are more accurate, easier to interpret, and more relevant.
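The idea of one store serving both kinds of load can be illustrated with Python's standard `sqlite3` module and an in-memory database. SQLite is used here only as a convenient single-process stand-in, not as an HTAP system; the `loans` table and its columns are made up for the example.

```python
import sqlite3

# One in-memory database serves both kinds of load.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE loans (id INTEGER PRIMARY KEY, issued_on TEXT, amount REAL)"
)


def issue_loan(connection, issued_on, amount):
    """Transactional write: committed on success, rolled back on an exception."""
    with connection:
        connection.execute(
            "INSERT INTO loans (issued_on, amount) VALUES (?, ?)",
            (issued_on, amount),
        )


for day, amount in [
    ("2020-01-01", 1000.0),
    ("2020-01-01", 500.0),
    ("2020-01-02", 750.0),
]:
    issue_loan(conn, day, amount)

# Analytical query on the same store: it sees every committed write
# immediately, with no replica and no replication lag.
report = conn.execute(
    "SELECT issued_on, COUNT(*), SUM(amount) "
    "FROM loans GROUP BY issued_on ORDER BY issued_on"
).fetchall()
```

The per-day report is computed directly on the data the transactions have just written, which is the property the hybrid approach relies on.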

Integration of In-Memory Solutions

The (relatively) simple way is to develop everything from scratch. We keep the data on disk, and hot data is also kept in memory. This helps to survive server reboots or shutdowns.

There are two main scenarios here for storing data on disk. In the first, we want to survive crashes or planned reboots of the cluster or its parts, that is, we want to use it as an ordinary database. In the second, there is too much data to fit in RAM, so only part of it is kept in memory.
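Both scenarios can be sketched with one small class: every write goes to disk, and a bounded least-recently-used (LRU) set of hot entries stays in memory. The `HotColdStore` class is hypothetical, stores one JSON file per key, and assumes keys that are safe to use as file names.

```python
import json
import os
import tempfile
from collections import OrderedDict


class HotColdStore:
    """All data lives on disk; only the most recently used entries stay in RAM."""

    def __init__(self, directory, hot_capacity):
        self.directory = directory
        self.hot_capacity = hot_capacity
        self.hot = OrderedDict()  # key -> value, least recently used first

    def _path(self, key):
        return os.path.join(self.directory, key + ".json")

    def _remember(self, key, value):
        self.hot[key] = value
        self.hot.move_to_end(key)
        while len(self.hot) > self.hot_capacity:
            self.hot.popitem(last=False)  # evict the coldest entry from RAM only

    def put(self, key, value):
        with open(self._path(key), "w") as f:
            json.dump(value, f)  # the disk copy survives restarts
        self._remember(key, value)

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)
            return self.hot[key]
        with open(self._path(key)) as f:  # cold read from disk
            value = json.load(f)
        self._remember(key, value)
        return value


store = HotColdStore(tempfile.mkdtemp(), hot_capacity=2)
for key in ("k1", "k2", "k3"):
    store.put(key, {"name": key})

hot_before = list(store.hot)  # k1 has been evicted from memory
value = store.get("k1")       # still readable, loaded back from disk
hot_after = list(store.hot)
```

Evicting an entry from memory never loses data here, because the disk copy is written before the entry becomes hot.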

If it is not possible to build everything from scratch, it is possible to integrate In-Memory into an already existing architecture. But not all In-Memory solutions are suitable for this. There are three prerequisites. An In-Memory solution must support:

  • a standard way to connect to the database that will be under it (for example, MySQL);
  • a standard query language so as not to rewrite and change the logic of interaction with the repository;
  • transactions, to preserve the semantics of the interaction.

If all three conditions are met, then integration is possible. We place the In-Memory Data Grid between the application and the database. Now write requests are delegated to the underlying database, and read requests are delegated to it only if the data is not in the cache.
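This read and write path can be sketched roughly as a read-through, write-through cache. The `DataGrid` class is hypothetical, and an in-memory SQLite database from Python's standard `sqlite3` module plays the role of the existing underlying database; the `kv` table is made up for the example.

```python
import sqlite3


class DataGrid:
    """Hypothetical cache layer placed between the application and a database."""

    def __init__(self, db):
        self.db = db
        self.cache = {}
        self.db_reads = 0  # counts how often a read had to reach the database

    def put(self, key, value):
        # Write-through: the underlying database is updated in a transaction,
        # then the cached copy is refreshed.
        with self.db:
            self.db.execute(
                "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)",
                (key, value),
            )
        self.cache[key] = value

    def get(self, key):
        # Read-through: the database is queried only on a cache miss.
        if key in self.cache:
            return self.cache[key]
        self.db_reads += 1
        row = self.db.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            return None
        self.cache[key] = row[0]
        return row[0]


# The "existing" database, with a row written before the grid was introduced.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)")
db.execute("INSERT INTO kv (key, value) VALUES ('user:1', 'Alice')")
db.commit()

grid = DataGrid(db)
first_read = grid.get("user:1")   # miss: loaded from the database
second_read = grid.get("user:1")  # hit: served from memory
grid.put("user:2", "Bob")         # written to the database and to the cache
stored = db.execute("SELECT value FROM kv WHERE key = 'user:2'").fetchone()
```

The application talks only to the grid, so its interaction logic does not change, while repeated reads stop reaching the database.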


If fast access to data and its processing is important to you, for example, for business intelligence, you can think about implementing In-Memory. For the implementation, you can use either method: building a new architecture from scratch or integrating In-Memory into an existing one.

Source: habr.com
