Release of the libmdbx 0.11.7 DBMS. Development Moves to GitFlic After the Project Was Blocked on GitHub

Version 0.11.7 of the libmdbx (MDBX) library, which implements a high-performance, compact embedded key-value database, has been released. The libmdbx code is distributed under the OpenLDAP Public License. All current operating systems and architectures are supported, including the Russian Elbrus 2000.

The release is notable for the project's migration to the GitFlic service after the GitHub administration removed libmdbx, along with a host of other projects, on April 15, 2022 without any warning or explanation, while simultaneously blocking access for many developers associated with companies that fell under US sanctions. From the user's point of view, all pages, the repository and the forks of the project suddenly turned into "404" pages, with no way to contact anyone or find out the reasons.

Unfortunately, almost all of the issues have been lost, and with them many questions with detailed answers and a great deal of discussion. The loss of this information is the only objective damage that the GitHub administration managed to inflict on the project. Partial copies of the discussions remain available on archive.org.

The loss of the built-in CI scripts and infrastructure (available free of charge to open source projects) forced us to review and unify the build setup and to eliminate a small amount of technical debt. CI has now been restored to almost the same extent, with the exception of builds and test runs for all BSD and Solaris variants. Tellingly, after GitHub's actions, no clarifications or notifications were received, apart from a payment reminder and attempts to charge money.

Since the last news item, which covered the release of libmdbx v0.11.3, and in addition to recovering from GitHub's actions, the following improvements and fixes are worth noting:

  • Added a workaround for a detected incoherence effect/defect in the unified page and buffer cache of the Linux kernel. On systems where the page and buffer caches are truly unified, it makes no sense for the kernel to waste memory on two copies of the data when writing to a file that is already memory-mapped. The data being written therefore becomes visible through the memory mapping before the write() system call completes, even if it has not yet been written to disk.

    Any other behavior would be irrational overall, because a delayed merge would still require taking locks on page lists, copying data, or adjusting PTEs. For this reason, an unspoken rule of coherence has been in effect since 1989, when the unified buffer cache appeared in SVR4. Because everyone relies on that rule, tracking down the strange failures seen in heavily loaded libmdbx production scenarios required a lot of work: first reproducing the problem, then verifying hypotheses and checking the fixes.

    We can now say with confidence that the problem has been identified, localized and reliably eliminated, despite the complexity and specificity of the reproduction scenario. In addition, the workaround was confirmed to work by one of the developers of Erigon (Ethereum): in his case, on a debug build, the protection surfaced as a regression because of an extra assert check.

    It should be noted that, given the widespread use of libmdbx in production projects, ensuring reliable operation is fundamentally more important than settling whether this is "a bug or a feature" and whether such coherence can be relied upon, let alone hunting down the causes of the incoherence inside the Linux kernel. What matters here is that a problem that could affect users has been fixed.

  • Fixed a regression that caused the EXDEV (Cross-device link) error when hot-copying a database without compaction to another file system, both through the API and with the mdbx_copy utility.
  • Kris Zyp has implemented support for libmdbx in Deno. Kai Wetlesen has packaged RPMs for Fedora. David Bouyssié implemented bindings for Scala.
  • Fixed processing of the value set by the MDBX_opt_rp_augment_limit option when processing huge transactions in large databases. Previously, due to a bug, unnecessary actions could be performed, which sometimes affected performance in Ethereum implementations (Erigon/Akula/Silkworm) and Binance Chain projects.
  • A lot of bugs have been fixed, including those in the C++ API. Fixed many build issues in rare and exotic configurations. A complete list of all significant improvements is available in the ChangeLog.
  • In total, 185 changes were made to 89 files; ≈3300 lines were added and ≈4100 were deleted. More was removed than added mainly because of the purge of now-useless technical files tied to GitHub and dependent services.
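
The coherence rule described in the first item above (data passed to write() becomes visible through an existing memory mapping of the same file) can be probed with a small experiment. What follows is only a minimal Python sketch, not libmdbx code and not the project's workaround: the helper name write_is_visible_via_mmap and the page count are hypothetical choices made for illustration. A clean result in such a simple single-process run says nothing about the rare incoherence mentioned above, which showed up only in heavily loaded production scenarios.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def write_is_visible_via_mmap(pages: int = 4) -> bool:
    """Return True if data written through the file descriptor is
    immediately readable through a shared mapping of the same file."""
    fd, path = tempfile.mkstemp()
    try:
        # buffering=0 makes every f.write() go straight to the write() syscall.
        with os.fdopen(fd, "r+b", buffering=0) as f:
            f.write(b"\x00" * (PAGE * pages))
            # ACCESS_READ creates a read-only shared mapping of the file.
            with mmap.mmap(f.fileno(), PAGE * pages,
                           access=mmap.ACCESS_READ) as view:
                coherent = True
                for i in range(pages):
                    payload = bytes([i + 1]) * PAGE
                    f.seek(i * PAGE)
                    f.write(payload)
                    # No fsync/msync here: the check relies purely on the
                    # page cache being shared between write() and mmap.
                    if view[i * PAGE:(i + 1) * PAGE] != payload:
                        coherent = False
                return coherent
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print("write() visible via mmap:", write_is_visible_via_mmap())
```

The check deliberately avoids fsync and msync, so it exercises only the assumption that both access paths share the same cached pages.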

Historically, libmdbx is a deep redesign of the LMDB DBMS and surpasses its progenitor in reliability, feature set and performance. Compared to LMDB, libmdbx places strong emphasis on code quality, API stability, testing and automated checks. A utility for checking the integrity of the database structure, with some recovery capabilities, is included.

Technologically, libmdbx offers ACID, strict serialization of changes, and non-blocking reads that scale linearly across CPU cores. Automatic compaction, automatic database size management, and range query estimation are supported. The project has been funded by Positive Technologies since 2016 and has been used in the company's products since 2017.

libmdbx offers a full-featured C++ API, as well as enthusiast-supported bindings for Rust, Haskell, Python, NodeJS, Ruby, Go, Nim, Deno and Scala.

Source: opennet.ru
