Micron releases HSE 3.0 storage engine optimized for SSDs

Micron Technology, a manufacturer of DRAM and flash memory, has released HSE 3.0 (Heterogeneous-memory Storage Engine), a storage engine designed specifically for SSDs and persistent memory (NVDIMM). The engine is implemented as a library that can be embedded in other applications and handles data in key-value format. The HSE code is written in C and distributed under the Apache 2.0 license.

HSE is optimized not only for maximum performance but also for drive longevity across different classes of SSDs. High speed is achieved through a hybrid storage model: the most recent data is cached in RAM, which reduces the number of accesses to the drive. The engine can be used for low-level data storage in NoSQL DBMSs, software-defined storage (SDS) systems such as Ceph and Scality RING, platforms for processing large amounts of data (Big Data), high-performance computing (HPC) systems, Internet of Things (IoT) devices, and machine learning systems. As an example of integrating the engine into third-party projects, a version of the document-oriented DBMS MongoDB has been ported to use HSE.
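
The effect of the hybrid model can be illustrated with a toy sketch. This is not HSE code and does not use its API: the `HybridStore` class, its `buffer_limit` parameter, and the `drive_reads` counter are hypothetical names chosen for illustration, and a plain dictionary stands in for the SSD.

```python
class HybridStore:
    """Toy key-value store: a RAM buffer for recent writes in front of a drive."""

    def __init__(self, buffer_limit=2):
        self.buffer = {}        # most recent writes, held in RAM
        self.drive = {}         # stands in for the SSD (hypothetical)
        self.drive_reads = 0    # number of reads that had to go to the drive
        self.buffer_limit = buffer_limit

    def put(self, key, value):
        self.buffer[key] = value
        if len(self.buffer) > self.buffer_limit:
            self.flush()

    def flush(self):
        # Move all buffered entries to the drive in one batch, then empty RAM.
        self.drive.update(self.buffer)
        self.buffer.clear()

    def get(self, key):
        if key in self.buffer:  # recent data is served from RAM
            return self.buffer[key]
        self.drive_reads += 1   # older data requires a drive access
        return self.drive.get(key)


store = HybridStore(buffer_limit=2)
store.put("a", 1)
store.put("b", 2)
recent = store.get("b")   # still in the RAM buffer, no drive access
store.put("c", 3)         # buffer exceeds its limit and is flushed
old = store.get("a")      # now read from the drive
```

In this sketch only the read of the flushed key touches the drive; a real engine adds durability, indexing, and compaction on top of the same basic idea.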

Main features of HSE:

  • Support for standard and extended operations on data in key-value format;
  • Full support for transactions, with the ability to isolate views of the storage by creating snapshots (snapshots can also be used to maintain independent collections in a single storage);
  • Ability to use cursors to iterate over data in snapshot-based views;
  • Data model optimized for mixed load types;
  • Flexible storage reliability management mechanisms;
  • Customizable data orchestration schemes (distribution across different types of memory present in the storage);
  • A library with a C API that can be dynamically linked into any application, plus bindings for Python and Java;
  • Support for storing keys and data in compressed form;
  • Ability to scale up to terabytes of data and hundreds of billions of keys in storage;
  • Efficient processing of thousands of parallel operations;
  • The ability to use different classes of SSD drives in the same storage to optimize performance and extend the life of the drive.
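
The snapshot-based cursors mentioned in the list can be sketched in a few lines. The example below is a conceptual illustration only, not HSE's API: `ToyKVS` and its `put` and `cursor` methods are hypothetical names, and the snapshot is simply a sorted copy of the data taken when the cursor is created.

```python
import bisect


class ToyKVS:
    """Toy key-value store whose cursors iterate over a frozen snapshot."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def cursor(self, prefix=""):
        # Freeze the current contents: later writes do not affect this view.
        snapshot = sorted(self._data.items())
        keys = [k for k, _ in snapshot]
        # Binary search for the first key that is >= the prefix.
        start = bisect.bisect_left(keys, prefix)
        matched = []
        for key, value in snapshot[start:]:
            if not key.startswith(prefix):
                break  # keys are sorted, so no later key can match
            matched.append((key, value))
        return iter(matched)


kvs = ToyKVS()
kvs.put("sensor/1", 10)
kvs.put("sensor/2", 20)
kvs.put("user/1", "alice")

cur = kvs.cursor(prefix="sensor/")
kvs.put("sensor/3", 30)   # written after the cursor's snapshot was taken
seen = list(cur)
```

Here `seen` contains only the two `sensor/` keys that existed when the cursor was created; the later write is invisible to it, which is the isolation property a snapshot-based view provides.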

The major version number change in HSE 3.0 reflects changes to the API, CLI, configuration options, REST interface, and storage format that break backward compatibility. Work on the new release focused on optimizing storage to improve performance under certain critical workloads. Among the most notable improvements:

  • The performance of cursor operations no longer depends on the length of the filter, so a cursor with an arbitrary filter can iterate over keys without a loss of throughput.
  • Read and write performance has been improved for monotonically increasing keys, for example, when storing series of parameter values recorded at certain intervals in monitoring systems, financial platforms, and systems for polling sensor states.
  • The API provides the ability to control compression at the level of individual values, which allows you to keep both compressed and uncompressed records in the same storage.
  • New KVDB open modes have been added that make it possible to query a database located on read-only storage.
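
Per-value compression control, as described in the list above, can be pictured as a flag stored with each value. The sketch below is an assumption-laden illustration rather than HSE's actual storage format or API: the one-byte tags `RAW` and `ZLIB` and the helper functions `encode_value` and `decode_value` are hypothetical, and zlib is used only because it ships with Python.

```python
import zlib

# Hypothetical one-byte tags marking how each stored value is encoded.
RAW = b"\x00"
ZLIB = b"\x01"


def encode_value(value: bytes, compress: bool) -> bytes:
    """Prefix the value with a tag so compression is decided per record."""
    if compress:
        return ZLIB + zlib.compress(value)
    return RAW + value


def decode_value(stored: bytes) -> bytes:
    """Read the tag and return the original bytes."""
    tag, payload = stored[:1], stored[1:]
    if tag == ZLIB:
        return zlib.decompress(payload)
    return payload


# Compressed and uncompressed records coexist in the same store.
store = {}
store[b"log"] = encode_value(b"A" * 1000, compress=True)
store[b"id"] = encode_value(b"42", compress=False)
```

The caller chooses compression per write, which is useful when some values (long, repetitive logs) compress well and others (short identifiers, already-compressed blobs) do not.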

Source: opennet.ru
