Yandex has open-sourced YDB, a distributed DBMS with SQL support

Yandex has published the source code of YDB, a distributed DBMS that supports an SQL dialect and ACID transactions. The DBMS was written from scratch and designed from the outset for fault tolerance, automatic recovery from failures, and scalability. Yandex reports running production YDB clusters of more than 10,000 nodes that store hundreds of petabytes of data and serve millions of distributed transactions per second. YDB is used in Yandex projects such as Market, Cloud, Smart Home, Alice, Metrika and Auto.ru. The code is written in C/C++ and distributed under the Apache 2.0 license. A ready-made Docker container is available for evaluation and a quick start.

Features of the project:

  • A relational data model with tables. YQL (YDB Query Language), an SQL dialect adapted to large distributed databases, is used for queries and for schema definition. Tables in a schema can be grouped into a tree that resembles directories in a file system. An API is provided for working with data in JSON format.
  • Scan queries for ad hoc analytical queries against the database; they run in read-only mode and return a gRPC stream.
  • Clients interact with the DBMS and send queries through the command-line interface, the built-in web interface, or the YDB SDK, which provides libraries for C++, C# (.NET), Go, Java, Node.js, PHP and Python.
  • Fault-tolerant configurations that keep working when individual disks, nodes, racks, or even entire data centers fail. YDB supports deployment and synchronous replication across three availability zones, and the cluster stays operational if one of the zones fails.
  • Automatic recovery from failures with minimal delay for applications, and automatic maintenance of the configured redundancy level for stored data.
  • Automatic creation of an index on the primary key, and the ability to define secondary indexes for more efficient access by arbitrary columns.
  • Horizontal scalability. As the load and the volume of stored data grow, the cluster can be expanded simply by adding new nodes. The compute and storage layers are separated, so each can be scaled independently. The DBMS itself keeps data and load evenly distributed, taking the available hardware resources into account. Geographically distributed configurations spanning several data centers in different parts of the world are possible.
  • Support for a strong consistency model and ACID transactions for queries spanning multiple nodes and tables. To improve performance, consistency control can be selectively disabled.
  • Automatic data replication, automatic partitioning (sharding) as data size or load grows, and automatic balancing of load and data between nodes.
  • Data is stored directly on block devices using the native PDisk component and the VDisk layer. DSProxy runs on top of VDisk and analyzes the availability and performance of disks so that problematic ones can be excluded.
  • A flexible architecture that allows various services to be built on top of YDB, up to virtual block devices and persistent queues. Suitability for different workload types, both OLTP and OLAP (analytical queries).
  • Support for multitenant and serverless configurations. Client authentication is supported. Users can create their own virtual clusters and databases in a shared infrastructure, with resource consumption accounted for by number of requests and data size, or by renting or reserving specific compute resources and storage space.
  • Configurable record lifetime (TTL) for automatic deletion of obsolete data.
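
To make the automatic partitioning item in the list above more concrete, here is a minimal Python sketch of size-triggered range splitting. It is a toy model and not YDB code: the class `RangePartitionedTable`, the `max_rows_per_partition` threshold and the split-in-half policy are all assumptions made for illustration. It only shows the general idea that rows ordered by primary key can be kept in contiguous key ranges, and that a range which grows too large is split in two.

```python
import bisect


class RangePartitionedTable:
    """Toy model (not YDB's implementation): rows are ordered by primary
    key and stored in contiguous key-range partitions."""

    def __init__(self, max_rows_per_partition=4):
        # Hypothetical split threshold; a real system would use size or load.
        self.max_rows = max_rows_per_partition
        self.partitions = [[]]  # each partition: sorted list of (key, value)
        self.boundaries = []    # boundaries[i] = smallest key of partitions[i + 1]

    def _find_partition(self, key):
        # bisect_right: a key equal to a boundary belongs to the right-hand partition.
        return bisect.bisect_right(self.boundaries, key)

    def upsert(self, key, value):
        idx = self._find_partition(key)
        part = self.partitions[idx]
        pos = bisect.bisect_left([k for k, _ in part], key)
        if pos < len(part) and part[pos][0] == key:
            part[pos] = (key, value)       # overwrite an existing row
        else:
            part.insert(pos, (key, value)) # keep the partition sorted by key
        if len(part) > self.max_rows:
            self._split(idx)

    def _split(self, idx):
        # Split an oversized partition in half and record the new boundary key.
        part = self.partitions[idx]
        mid = len(part) // 2
        left, right = part[:mid], part[mid:]
        self.partitions[idx] = left
        self.partitions.insert(idx + 1, right)
        self.boundaries.insert(idx, right[0][0])

    def get(self, key):
        part = self.partitions[self._find_partition(key)]
        pos = bisect.bisect_left([k for k, _ in part], key)
        if pos < len(part) and part[pos][0] == key:
            return part[pos][1]
        return None


if __name__ == "__main__":
    table = RangePartitionedTable(max_rows_per_partition=4)
    for k in range(1, 11):
        table.upsert(k, f"row-{k}")
    print(len(table.partitions), table.boundaries)
```

In this sketch a lookup first finds the partition by binary search over the boundary keys and then searches inside it, so the caller never needs to know how many partitions exist. A production system would also merge underloaded partitions and move them between nodes, which the sketch leaves out.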
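
The record-lifetime item in the list above can be illustrated with a similarly small sketch. Again, this is a hypothetical model and not the YDB API: `Row`, `TtlTable`, the `created_at` column and `purge_expired` are names invented for this example. The idea shown is simply that each row carries a timestamp, and a periodic pass deletes rows whose timestamp plus the configured lifetime has passed.

```python
from dataclasses import dataclass


@dataclass
class Row:
    key: int
    payload: str
    created_at: float  # seconds; plays the role of a hypothetical TTL column


class TtlTable:
    """Toy model (not YDB's implementation) of lifetime-based row expiry."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self.rows = {}

    def upsert(self, row):
        self.rows[row.key] = row

    def purge_expired(self, now):
        # A row expires once created_at + ttl_seconds is not later than `now`.
        expired = [k for k, r in self.rows.items()
                   if r.created_at + self.ttl_seconds <= now]
        for k in expired:
            del self.rows[k]
        return len(expired)


if __name__ == "__main__":
    table = TtlTable(ttl_seconds=60)
    table.upsert(Row(1, "old", created_at=0))
    table.upsert(Row(2, "older than a minute", created_at=30))
    table.upsert(Row(3, "fresh", created_at=100))
    removed = table.purge_expired(now=95)
    print(removed, sorted(table.rows))
```

Passing `now` explicitly keeps the sketch deterministic; in a real database the expiry pass would run in the background against the current time, without the application calling it.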

Source: opennet.ru
