Release of libmdbx 0.10, a high-performance embedded database

After three months of development, libmdbx 0.10.0 (MDBX) has been released. The library implements a high-performance, compact embedded key-value database. The libmdbx code is distributed under the OpenLDAP Public License. libmdbx is a deep rework of the LMDB DBMS and, according to its developers, surpasses its progenitor in reliability, feature set, and performance. libmdbx is claimed to be up to 20% faster than LMDB in CRUD scenarios, and up to 30% faster when libmdbx is built with its internal checks reduced to a level comparable to LMDB's.

libmdbx provides ACID guarantees, strict serialization of changes, and non-blocking reads that scale linearly across CPU cores. The project puts a lot of emphasis on code quality, API stability, testing, and automated checks. It supports automatic compaction, automatic database size management, a single database format for 32-bit and 64-bit builds, and range query estimation. A utility for checking the integrity of the database structure is included, along with some recovery options. The project has been funded by Positive Technologies since 2016 and has been used in the company's products since 2017; the sanctions imposed by the US government against Positive Technologies have not had any noticeable effect.

The main innovations, improvements and fixes added since the last release:

  • Ruby bindings by Mahlon E. Smith and a trial version of Python bindings by Noel Kuntze are now available, and the Go bindings by Alexey Sharov have been updated.
  • For the "MDBX_WRITEMAP" mode, in which database data is modified directly in RAM, "transparent spilling" of modified database pages to disk has been implemented. After each operation completes, such pages are now fully ready to be written to disk, so the OS kernel can flush them on its own, and committing the transaction no longer requires modifying them. As a result, in heavily loaded scenarios with insufficient RAM, the volume of disk operations can be reduced by up to a factor of two.
  • Implemented eviction of long-unused shadow copies of modified pages, preferring pages that hold large/long values, which in the vast majority of scenarios are modified only once per transaction. This reduces disk traffic and improves performance in scenarios with very large transactions.
  • Implemented a "smart" page-splitting mode for key insertion. When ordered sequences are inserted, pages are now automatically filled completely; in other cases, the tree is balanced better. As a result, database pages are on average filled more efficiently and the B-tree is better balanced, which has a positive effect on performance.
  • Added page-operation statistics, which make it possible to accurately estimate the cost of operations that modify the database.
  • Fixed over a dozen issues and bugs, including MinGW build issues, use of `std::filesystem::path` on iOS <= 13.0, builds targeting older versions of Windows, etc.
  • In total, more than 200 changes were made to 66 files: ~6500 lines were added and ~4500 were deleted.
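
The effect of the "smart" mode on ordered inserts, described in the list above, can be illustrated with a toy model. The sketch below is not libmdbx code and does not reproduce its actual algorithm. It assumes a flat list of sorted leaf pages with a hypothetical capacity of 64 keys, and compares a plain midpoint split with a split that, when the new key lands at the tail of the rightmost page, moves only that key to the new page. The names `PAGE_CAPACITY` and `fill_factor` are made up for this illustration.

```python
import bisect
import random

PAGE_CAPACITY = 64  # hypothetical number of keys that fit in one leaf page


def fill_factor(keys, smart):
    """Insert keys into sorted leaf pages; return the average page fill (0..1)."""
    pages = [[]]  # leaf pages, ordered left to right
    lows = []     # lows[i] is the first key of pages[i + 1]
    total = 0
    for key in keys:
        idx = bisect.bisect_right(lows, key)  # page whose key range covers `key`
        page = pages[idx]
        pos = bisect.bisect_right(page, key)
        page.insert(pos, key)
        total += 1
        if len(page) <= PAGE_CAPACITY:
            continue
        # The page overflowed, so decide where to cut it.
        at_tail = idx == len(pages) - 1 and pos == len(page) - 1
        if smart and at_tail:
            cut = len(page) - 1   # keep the left page full, move only the new key
        else:
            cut = len(page) // 2  # classic midpoint split
        left, right = page[:cut], page[cut:]
        pages[idx] = left
        pages.insert(idx + 1, right)
        lows.insert(idx, right[0])
    return total / (len(pages) * PAGE_CAPACITY)


if __name__ == "__main__":
    ordered = list(range(10_000))
    shuffled = ordered[:]
    random.Random(0).shuffle(shuffled)
    for name, keys in (("ordered", ordered), ("shuffled", shuffled)):
        plain = fill_factor(keys, smart=False)
        tuned = fill_factor(keys, smart=True)
        print(f"{name}: midpoint split {plain:.3f}, tail-aware split {tuned:.3f}")
```

In this model, ordered input leaves pages about half full with the midpoint split and nearly full with the tail-aware split, while shuffled input should behave much the same under both policies. A real B-tree implementation has to account for variable-size keys and values, branch pages, and descending sequences, none of which are modelled here.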

Separately, I would like to note that the Turbo-Geth project (a turbo fork of Go-Ethereum) has chosen libmdbx as its new storage backend, and to thank the project team (especially Alexey Sharov, Artyom Vorotnikov and Alexey Akhunov) for their great help in testing extreme use cases. In particular, this testing uncovered a defect in read-ahead/caching management, now fixed, that had caused performance degradation in hard-to-reproduce scenarios with large databases.

Source: opennet.ru
