Towards serverless databases - how and why

Hi all! My name is Nikolay Golov. For six years I worked at Avito, where I ran the Data Platform and was therefore involved with all of its databases: analytical (Vertica, ClickHouse) as well as streaming and OLTP (Redis, Tarantool, VoltDB, MongoDB, PostgreSQL). During that time I dealt with a large number of databases - very different and unusual ones - and with non-standard ways of using them.

I currently work at ManyChat. In essence, it is a startup: new, ambitious, and rapidly growing. When I first joined the company, the classic question arose: “What should a young startup pick from today's DBMS and database market?”

In this article, based on my talk at the RIT++ 2020 online festival, I will answer this question. A video version of the talk is available on YouTube.


Well Known Databases 2020

It was 2020; I looked around and saw three types of databases.

The first type is classic OLTP databases: PostgreSQL, SQL Server, Oracle, MySQL. They were written a long time ago, but are still relevant because they are well known to the development community.

The second type is databases built from scratch. They tried to get away from the classic templates by abandoning SQL, traditional structures, and ACID, and by adding built-in sharding and other attractive features: for example, Cassandra, MongoDB, Redis, or Tarantool. All these solutions wanted to offer the market something fundamentally new, and they occupied their niche because they turned out to be extremely convenient for certain tasks. I will denote these databases by the umbrella term NoSQL.

The from-scratch era is over: everyone has gotten used to NoSQL databases, and the world, from my point of view, has taken the next step - to managed databases. These databases have the same core as classic OLTP databases or newer NoSQL ones, but they need no DBA or DevOps: they run on managed hardware in the clouds. For a developer, this is “just a database” that works somewhere; nobody cares how it is installed on the server, who configured the server, or who updates it.

Examples of such databases:

  • AWS RDS - a managed wrapper over PostgreSQL/MySQL.
  • DynamoDB - an AWS analogue of a document-oriented database, similar to Redis and MongoDB.
  • Amazon Redshift - a managed analytical database.

Basically, these are the old databases, just grown up in a managed environment, with no need to deal with hardware.

Note: the examples use the AWS environment, but counterparts also exist in Microsoft Azure, Google Cloud, and Yandex.Cloud.


So what is new here? In 2020, nothing of the above.

Serverless concept

The really new thing on the market in 2020 is serverless solutions.

I will try to explain what this means using the example of a regular backend service.
To deploy a regular backend application, we buy or rent a server, copy the code onto it, publish an endpoint to the outside world, and regularly pay for rent, electricity, and data center services. This is the standard layout.

Can it be done differently? Yes - with serverless services.

The trick of this approach: there is no server, not even a leased virtual instance in the cloud. To deploy the service, we copy the code (functions) into a repository and publish it behind an endpoint. Then we simply pay for each call to this function, completely ignoring the hardware it runs on.
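To make this concrete, here is a minimal sketch of such a function in the style of an AWS Lambda handler in Python; the handler name and payload fields are illustrative:

    # A minimal sketch of a serverless function in the AWS Lambda style.
    # We define only what happens per call; where it runs is the provider's job.
    import json

    def handler(event, context):
        # 'event' carries the request payload, 'context' carries runtime metadata
        name = event.get("name", "world")
        return {
            "statusCode": 200,
            "body": json.dumps({"message": f"Hello, {name}!"}),
        }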

I will try to illustrate this approach with pictures.
[Illustration: classic deployment - two instances handling external requests]

Classic deployment. We have a service with a certain load. We spin up two instances: physical servers or AWS instances. External requests are sent to these instances and processed there.

As you can see in the picture, the servers are utilized unevenly: one is 100% utilized, handling two requests, while the other handles only one and sits half idle at 50%. If 30 requests arrive instead of three, the whole system will fail to cope with the load and start to slow down.

[Illustration: serverless deployment - one container per request]

Serverless deployment. In a serverless environment, such a service has no instances or servers. There is a pool of pre-warmed resources - small prepared Docker containers with the function code deployed. The system receives external requests, and for each of them the serverless framework spins up a small container with the code, processes exactly that request, and kills the container.

One request means one container spun up; 1,000 requests mean 1,000 containers. Deployment onto physical servers is the cloud provider's job, completely hidden by the serverless framework. In this model we pay per call: one call a day, we pay for one call; a million calls a minute, we pay for a million. Or per second - that happens too.

The concept of publishing serverless functions suits a stateless service. If you need a stateful service, you add a database to it. In that case, whenever state is involved, each stateful function simply writes to and reads from the database - a database of any of the three types described at the beginning of the article.
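For example, here is a hedged sketch of the same kind of handler made stateful by keeping its state in DynamoDB via boto3; the table name and its key schema are assumptions:

    # A sketch of a stateful serverless function: the function itself stays
    # stateless, all state lives in a database (DynamoDB here).
    import boto3

    # The "user_state" table and its "user_id" key are hypothetical.
    table = boto3.resource("dynamodb").Table("user_state")

    def handler(event, context):
        user_id = event["user_id"]
        # Read the current state for this user from the database...
        item = table.get_item(Key={"user_id": user_id}).get("Item", {})
        counter = int(item.get("counter", 0)) + 1
        # ...and write the updated state back; the container may die right after.
        table.put_item(Item={"user_id": user_id, "counter": counter})
        return {"statusCode": 200, "body": str(counter)}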

What is the common limitation of all these databases? The cost of a permanently running cloud or physical server (or several servers). Whether we use a classic database or a managed one, whether we have DevOps and an admin or not, we still pay 24/7 for hardware, electricity, and data center rent. With a classic database, we pay for a master and a slave; with a highly loaded sharded database, for 10, 20, or 30 servers - and we pay constantly.

The presence of permanently reserved servers in the cost structure used to be perceived as a necessary evil. Conventional databases have other difficulties too, such as limits on the number of connections, scaling limits, and geo-distributed consensus - individual databases solve some of them, but not all at once and not perfectly.

Serverless Database - Theory

The question of 2020: can a database be made serverless too? Everyone has heard of the serverless backend... so let's try to make the database serverless as well.

This sounds strange, because a database is a stateful service, not well suited to serverless infrastructure. Moreover, a database's state is very large: gigabytes, terabytes, even petabytes in analytical databases. Spinning that up in lightweight Docker containers is not easy.

On the other hand, almost every modern database is a huge pile of logic and components: transactions, integrity enforcement, procedures, relational dependencies, and much more. For a fairly large part of that logic, a small state is sufficient. The gigabytes and terabytes are directly touched by only a small part of the database logic - the part that actually executes queries.

Hence the idea: if part of the logic allows stateless execution, why not split the database into stateful and stateless parts?

Serverless for OLAP solutions

Let's see what a database split into stateful and stateless parts might look like, using practical examples.

[Illustration: classic data warehouse scheme - external data, ETL, analyst]

For example, we have an analytical database: external data (the red cylinder on the left), an ETL process that loads data into the database, and an analyst who sends SQL queries to the database. This is a classic data warehouse scheme.

In this scheme, roughly speaking, ETL is performed once. But then you pay continuously for the servers where the database with the ETL-loaded data runs, so that there is something to send queries to.

Let's look at the alternative approach implemented in AWS Athena Serverless. There is no permanently dedicated hardware storing the loaded data. Instead:

  • The user sends an SQL query to Athena. The Athena optimizer parses the query and searches the metadata store for the specific data needed to execute it.
  • Based on the collected information, the optimizer unloads the necessary data from external sources into temporary storage (a temporary database).
  • The user's SQL query is executed in the temporary storage, and the result is returned to the user.
  • The temporary storage is cleared and the resources are released.

In this architecture, we pay only for query execution. No queries - no costs.
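As a sketch of this flow, here is roughly how a query can be submitted to Athena with boto3; the database name, query, and S3 output location are assumptions for illustration:

    # A sketch of the Athena flow: submit a query, wait for the temporary
    # resources to do their work, fetch the result. Names are illustrative.
    import time
    import boto3

    athena = boto3.client("athena")

    qid = athena.start_query_execution(
        QueryString="SELECT status, COUNT(*) FROM events GROUP BY status",
        QueryExecutionContext={"Database": "my_datalake"},
        ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
    )["QueryExecutionId"]

    # Poll until the query finishes and the temporary storage is released.
    while True:
        status = athena.get_query_execution(QueryExecutionId=qid)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(1)

    rows = athena.get_query_results(QueryExecutionId=qid)["ResultSet"]["Rows"]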

[Illustration: AWS Athena Serverless query flow]

This approach works, and it is implemented not only in Athena Serverless but also in Redshift Spectrum (on AWS).

The Athena example shows that a serverless database handles real queries over tens and hundreds of terabytes of data. Hundreds of terabytes would require hundreds of servers, but we don't have to pay for them - we pay per query. Each query is (much) slower than in specialized analytical databases like Vertica, but we don't pay for idle periods.

Such a database suits occasional ad-hoc analytical queries - for example, when we spontaneously decide to test a hypothesis against some gigantic amount of data. Athena fits those cases perfectly. For regular queries, such a system is expensive; in that case, cache the data in a specialized solution.

Serverless for OLTP solutions

The previous example covered OLAP (analytical) tasks. Now let's consider OLTP tasks.

Imagine a scalable PostgreSQL or MySQL. We spin up a regular managed PostgreSQL or MySQL instance with minimal resources. When the load on the instance grows, we connect additional replicas and route part of the read load to them. When there are no queries and no load, we turn the replicas off. The first instance is the master; the rest are replicas.

This idea is implemented in a database called AWS Aurora Serverless. The principle is simple: requests from external applications are accepted by a proxy fleet. When it sees the load increasing, it allocates computing resources from pre-warmed minimal instances - the connection happens as quickly as possible. Detaching instances works the same way.

Within Aurora there is the concept of an Aurora Capacity Unit (ACU). Roughly speaking, an ACU is an instance (a server). Each specific ACU can be a master or a slave; each capacity unit has its own RAM, CPU, and a minimal disk. Accordingly, one is the master, and the rest are read-only replicas.

The number of running Aurora Capacity Units is configurable. The minimum can be one or even zero (in which case the database does nothing while there are no requests).
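As an illustration, here is a sketch of setting those capacity bounds when creating an Aurora Serverless (v1) cluster with boto3; the identifiers and credentials are placeholders:

    # A sketch of creating an Aurora Serverless (v1) cluster with capacity
    # bounds. Identifiers and credentials are placeholders.
    import boto3

    rds = boto3.client("rds")
    rds.create_db_cluster(
        DBClusterIdentifier="demo-serverless-cluster",
        Engine="aurora-postgresql",
        EngineMode="serverless",
        MasterUsername="demo_admin",
        MasterUserPassword="replace-me",
        ScalingConfiguration={
            "MinCapacity": 2,              # fewest ACUs to keep attached
            "MaxCapacity": 16,             # ceiling the proxy fleet can scale to
            "AutoPause": True,             # drop to zero ACUs when idle...
            "SecondsUntilAutoPause": 300,  # ...after 5 minutes without requests
        },
    )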

[Illustration: Aurora Serverless - proxy fleet and Aurora Capacity Units]

When the database receives requests, the proxy fleet spins up Aurora Capacity Units, increasing the system's processing resources. The ability to grow and shrink resources lets the system “juggle” them: automatically retire individual ACUs (replacing them with new ones) and roll all pending updates onto the retired resources.

The Aurora Serverless database can scale the read load. The documentation doesn't say this directly, and you might get the impression that it can bring up a multi-master setup. There is no such magic.

This database is well suited for avoiding huge spending on systems with unpredictable access patterns. For example, when building MVPs or marketing landing pages, we usually don't expect a steady load. While there is no traffic, we don't pay for instances. When the load suddenly spikes - say, after a conference or an advertising campaign, when crowds of people visit the site - Aurora Serverless automatically absorbs it and quickly attaches the missing resources (ACUs). Then the conference passes, everyone forgets about the prototype, the servers (ACUs) wind down, and the costs drop to zero. Convenient.

This solution is not suitable for a stable, heavy highload, because it cannot scale the write load. All the attaching and detaching of resources happens at a so-called “scale point” - a moment when no transaction or temporary table is holding the database. For example, a scale point may not occur for a whole week, and the database keeps running on the same resources, simply unable to expand or shrink.

There is no magic here - it's just regular PostgreSQL. But the process of adding and removing machines is partially automated.

Serverless by design

Aurora Serverless is an old database rewritten for the cloud to gain some of the benefits of serverless. Now I'll tell you about a database that was originally written for the cloud and for the serverless approach - serverless by design. It was developed from the start without the assumption that it would run on physical servers.

This database is called Snowflake. It has three key building blocks.

[Illustration: the three key blocks of Snowflake]

The first is the metadata block: a fast in-memory service that handles security, metadata, transactions, and query optimization (on the left in the illustration).

The second block is a set of virtual compute clusters (the set of blue circles in the illustration).

The third block is a storage system based on S3, AWS's virtually unlimited object storage - sort of like a bottomless Dropbox for business.

Let's see how Snowflake works, starting from a cold start: the database exists, the data has been loaded into it, and no queries are running. While there are no queries, only the fast in-memory metadata service (the first block) is up, plus the S3 storage where the table data lives, split into so-called micro-partitions. For simplicity: if the table holds deals, then the micro-partitions are days of deals - each day is a separate micro-partition, a separate file. When the database runs in this mode, you pay only for the space the data occupies, and the storage rate is very low (especially given the significant compression). The metadata service also runs all the time, but it doesn't need many resources to optimize queries, so it can be considered essentially free.

Now imagine a user comes to our database and submits an SQL query. The query is immediately sent to the metadata service for processing. On receiving it, this service analyzes the query, the available data, and the user's permissions and, if all is well, draws up a plan for executing it.

Next, the service initiates the launch of a compute cluster - a cluster of servers that perform the calculations. It can contain 1, 2, 4, 8, 16, or 32 servers - as many as you want. You submit a query, and the cluster starts up underneath it immediately; it really does take seconds.

[Illustration: Snowflake - compute cluster pulling micro-partitions from S3]

Once the cluster is up, the micro-partitions needed for your query start being copied to it from S3. Imagine an SQL query that needs two partitions from one table and one from another: only those three partitions will be copied to the cluster, not the tables as a whole. And because everything sits within one data center connected by very fast links, the whole transfer is very quick: seconds, very rarely minutes, unless we're talking about some monstrous query. So the micro-partitions are copied to the compute cluster and, once that's done, the SQL query is executed on it. The result - a single row, several rows, or a table - is sent back to the user, who can download it, display it in a BI tool, or use it in some other way.

An SQL query can not only read aggregates over previously loaded data, but also load or produce new data in the database. For example, a query may insert new records into another table, which creates a new partition on the compute cluster, which in turn is automatically saved back to the shared S3 storage.

The scenario described above - a user arrives, the cluster spins up, data is loaded, the query runs, results come back - is billed at a per-minute rate for the running virtual compute cluster (virtual warehouse). Rates vary by AWS zone and cluster size, but the average is a few dollars per hour. A four-machine cluster costs twice as much as a two-machine one, an eight-machine one twice as much again; options with 16 or 32 machines are available, depending on the complexity of the queries. But you pay only for the minutes when the cluster is actually working: when there are no queries, you take your hands off, and after 5-10 minutes of idling (a configurable parameter) it shuts itself down, frees the resources, and becomes free of charge.

It's a perfectly realistic scenario: you send a query, the cluster spins up in, say, a minute, computes for another minute, then takes five minutes to shut down - and you end up paying for seven minutes of that cluster, not for months and years.
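Here is a sketch of this setup using the snowflake-connector-python package; the account, credentials, warehouse, and table names are placeholders:

    # A sketch of the pay-per-minutes-of-work pattern with Snowflake's Python
    # connector. Account, credentials, and object names are placeholders.
    import snowflake.connector

    conn = snowflake.connector.connect(
        account="my_account", user="analyst", password="replace-me",
    )
    cur = conn.cursor()

    # A virtual warehouse that resumes on demand and suspends itself after
    # 5 minutes of inactivity, so idle periods cost nothing.
    cur.execute("""
        CREATE WAREHOUSE IF NOT EXISTS adhoc_wh
          WAREHOUSE_SIZE = 'SMALL'
          AUTO_SUSPEND = 300
          AUTO_RESUME = TRUE
    """)

    cur.execute("USE WAREHOUSE adhoc_wh")  # wakes the warehouse if suspended
    cur.execute("SELECT deal_day, SUM(amount) FROM deals GROUP BY deal_day")
    print(cur.fetchall())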

The first scenario described Snowflake in single-user mode. Now let's imagine many users, which is closer to reality.

Suppose we have a lot of analysts and Tableau reports constantly bombarding our database with simple analytical SQL queries.

In addition, let's say we have inventive Data Scientists trying to do monstrous things with the data: operating on tens of terabytes, analyzing billions and trillions of rows.

For these two kinds of workload, Snowflake lets you bring up several independent compute clusters of different capacities. The clusters work independently, but over shared, consistent data.

For the large number of light queries, you can bring up two or three small clusters of, say, 2 machines each. This behavior can be configured to happen automatically. In effect you say: “Snowflake, bring up a small cluster. If its load grows beyond a certain threshold, bring up a second and a third just like it. When the load starts to subside, shut the extra ones down.” That way, no matter how many analysts show up and start looking at reports, everyone has enough resources.

And if the analysts are asleep and nobody is looking at reports, the clusters can shut down completely, and you stop paying for them.

Meanwhile, for the heavy queries (from the Data Scientists), you can bring up one very large cluster of, say, 32 machines. That cluster, too, is billed only for the minutes and hours when your giant query is actually running on it.

This capability lets you split not just two but many more workload types across clusters (ETL, monitoring, report materialization, ...).
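As a sketch of such a split, with illustrative names and sizes, one might define a small auto-scaling multi-cluster warehouse for the analysts and a large one for Data Science (multi-cluster warehouses are available in the corresponding Snowflake edition); 'cur' is a cursor from snowflake.connector, as in the previous sketch:

    # A sketch of splitting workloads across independent warehouses.

    # Small multi-cluster warehouse for the many light analyst/Tableau queries:
    # it adds twin clusters under load and drops them as the load subsides.
    cur.execute("""
        CREATE WAREHOUSE IF NOT EXISTS bi_wh
          WAREHOUSE_SIZE = 'XSMALL'
          MIN_CLUSTER_COUNT = 1
          MAX_CLUSTER_COUNT = 3
          AUTO_SUSPEND = 300
          AUTO_RESUME = TRUE
    """)

    # Big warehouse for heavy Data Science queries, billed only while running.
    cur.execute("""
        CREATE WAREHOUSE IF NOT EXISTS ds_wh
          WAREHOUSE_SIZE = '4X-LARGE'
          AUTO_SUSPEND = 600
          AUTO_RESUME = TRUE
    """)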

Let's sum up Snowflake. It combines a beautiful idea with a workable implementation. At ManyChat, we use Snowflake to analyze all the data we have - not three clusters as in the example, but 5 to 9 of different sizes: a nominal 16-machine one, 2-machine ones, and a super-small 1-machine one for certain tasks. They distribute the load well and save us a lot of money.

The database scales both the read and the write load. This is a huge difference and a huge breakthrough compared to Aurora, which handled only the read load: Snowflake's compute clusters scale writes as well. As I mentioned, at ManyChat we use several clusters: the small and super-small ones mostly handle ETL, i.e. data loading, while the analysts live on medium-sized clusters that are completely unaffected by the ETL load and therefore run very fast.

So the database is well suited for OLAP tasks but, unfortunately, not yet applicable to OLTP loads. First, it is columnar, with all the ensuing consequences. Second, the approach itself - spinning up a compute cluster per query, when needed, and flooding it with data - is still not fast enough for OLTP. Waiting seconds is fine for OLAP tasks; for OLTP it is unacceptable: 100 ms would be better, and 10 ms better still.

Conclusion

A serverless database becomes possible by splitting the database into stateless and stateful parts. You may have noticed that in all the examples above, the stateful part is, roughly speaking, the storage of micro-partitions in S3, while the stateless parts - the optimizer, metadata handling, security checks - can be run as independent lightweight stateless services.

SQL query execution itself can also be seen as a light-state service that can appear in serverless fashion, like Snowflake's compute clusters: download only the necessary data, execute the query, and “go out”.

Production-grade serverless databases are already available, and they work. They are ready to handle OLAP tasks. For OLTP tasks, unfortunately, they apply... with caveats, since there are limitations. On the one hand that's a minus; on the other, it's an opportunity. Perhaps one of the readers will find a way to make an OLTP database fully serverless, without Aurora's limitations.

I hope you found this interesting. Serverless is the future 🙂

Source: habr.com
