LizardFS 3.13.0-rc2 cluster file system update

After a year-long hiatus, development of the fault-tolerant distributed file system LizardFS has resumed on a new 3.13 branch, and the second release candidate has been published. The company developing LizardFS recently changed ownership, brought in new management, and replaced its developers. For the past two years the project had drifted away from the community and paid it little attention, but the new team intends to revive the old relationship with the community and establish close interaction with it. The project code is written in C and C++ and is distributed under the GPLv3 license.

LizardFS is a distributed cluster file system that spreads data across multiple servers while presenting it as a single large volume, used in the same way as a traditional disk partition. A mounted LizardFS partition supports POSIX file attributes, ACLs, locks, sockets, pipes, device files, and symbolic and hard links. The system has no single point of failure: all components are redundant. Parallel data operations are supported, so several clients can access files simultaneously.
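Because a mounted LizardFS volume behaves like an ordinary POSIX filesystem, regular system calls work on it unchanged. A minimal C++ sketch illustrating that point (the mount point /mnt/lizardfs and file names are assumptions made for the example, not anything prescribed by LizardFS):

```cpp
// Sketch: ordinary POSIX calls against a hypothetical LizardFS mount point.
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>
#include <cstdio>

int main() {
    const char* path = "/mnt/lizardfs/example.txt";      // assumed mount point
    int fd = open(path, O_CREAT | O_WRONLY, 0644);        // create a file on the cluster
    if (fd < 0) { perror("open"); return 1; }

    flock(fd, LOCK_EX);                                    // advisory lock, as on a local FS
    const char msg[] = "hello cluster\n";
    write(fd, msg, sizeof msg - 1);                        // plain write()
    flock(fd, LOCK_UN);
    close(fd);

    symlink(path, "/mnt/lizardfs/example-link");           // symbolic links work as well
    return 0;
}
```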

Fault tolerance is achieved by replicating data across different nodes with redundancy (several copies are kept on different nodes). If nodes or drives fail, the system keeps running without losing information and automatically redistributes data across the remaining nodes. To expand the storage, it is enough to attach new nodes without stopping the cluster for maintenance: the system itself replicates part of the data to the new servers and rebalances the storage to take them into account. The cluster can be shrunk in the same way: obsolete equipment being decommissioned can simply be switched off.

Data and metadata are stored separately. A deployment is recommended to include two metadata servers operating in master/slave mode and at least two data storage servers (chunkservers). In addition, metadata log servers can be used to back up metadata; they record metadata changes and make it possible to restore operation if all metadata servers are lost. Each file is split into chunks of up to 64 MB, which are distributed across the storage servers according to the selected replication mode: standard (an explicitly defined number of copies placed on different nodes, configurable down to individual directories, so the copy count can be raised for important data and lowered for non-essential data), XOR (RAID5-like) and EC (RAID6-like).
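As a rough illustration of how these modes differ in on-disk overhead, the sketch below applies generic RAID-style arithmetic to a single 64 MB chunk. The specific parameters (3 full copies, XOR with 4 data parts, EC(4,2)) are assumptions chosen for the example, not LizardFS defaults:

```cpp
// Sketch: on-disk footprint of one 64 MB chunk under the three modes above.
#include <cstdio>

int main() {
    const double chunk_mb = 64.0;

    // Standard goal: N full copies (here N = 3, survives loss of 2 copies).
    double copies = 3 * chunk_mb;

    // XOR (RAID5-like): k data parts + 1 parity part, survives 1 failure.
    int k_xor = 4;
    double xor_mode = chunk_mb * (k_xor + 1) / k_xor;

    // EC (RAID6-like), e.g. EC(4,2): 4 data + 2 parity parts, survives 2 failures.
    int k = 4, m = 2;
    double ec = chunk_mb * (k + m) / k;

    std::printf("3 copies: %.0f MB, XOR(%d+1): %.0f MB, EC(%d,%d): %.0f MB\n",
                copies, k_xor, xor_mode, k, m, ec);
    return 0;
}
```

With these assumed parameters, EC gives two-failure tolerance for 96 MB on disk versus 192 MB for three full copies, which is the motivation for the Reed-Solomon mode mentioned later in the article.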

Storage can scale to petabyte sizes. The areas of application mentioned include archiving, storage of virtual machine images, multimedia data and backups, use as a DRC (Disaster Recovery Center), and storage in high-performance computing clusters. LizardFS provides very high read speeds for files of any size. Write performance is good for whole large and medium-sized files, when there is no constant modification, no intensive work with open files, and no one-off operations on large numbers of small files.


Other notable features include support for snapshots, which capture the state of files at a particular point in time, and a built-in "recycle bin" (deleted files are not removed immediately and remain recoverable for some time). Access to a partition can be restricted by IP address or password (similar to NFS). Quota and QoS mechanisms are available to limit size and bandwidth for particular categories of users. It is also possible to build geographically distributed storage whose segments are located in different data centers.

The LizardFS project was founded in 2013 as a fork of MooseFS. It differs mainly in its replication mode based on Reed-Solomon error-correcting codes (analogous to raidzN), extended ACL support, a client for the Windows platform, additional optimizations (for example, when a client and a storage server are combined on one machine, chunks are served from the local node and metadata is cached in memory), a more flexible configuration system, read-ahead support, directory quotas, and internal reworking.

LizardFS 3.13.0 is scheduled for release at the end of December. The main innovation of LizardFS 3.13 is the use of the Raft consensus algorithm for fault tolerance (switching between master servers in case of failure), based on the project's own uRaft implementation, which was previously used in commercial products. uRaft simplifies setup and reduces failover delays, but requires at least three working nodes, one of which is used for quorum.
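The three-node requirement follows from the usual Raft majority rule: a cluster of n nodes needs floor(n/2) + 1 of them to agree before electing a master. A small sketch of that arithmetic (illustrative only, not LizardFS or uRaft code):

```cpp
// Sketch: Raft quorum size and tolerated failures for small cluster sizes.
#include <cstdio>

int main() {
    for (int n = 1; n <= 5; ++n) {
        int quorum = n / 2 + 1;                // majority needed to elect a master
        std::printf("nodes: %d  quorum: %d  tolerated failures: %d\n",
                    n, quorum, n - quorum);
    }
    return 0;
}
```

With three nodes the quorum is two, so the cluster survives the loss of exactly one node; with only two nodes no failure could be tolerated, which is why three is the minimum.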

Other changes include a new client based on the FUSE3 subsystem, fixes for problems with error correction, and a rewrite of the nfs-ganesha plugin in C. The 3.13.0-rc2 update fixes several critical bugs that made previous test releases of the 3.13 branch unusable (fixes for the 3.12 branch have not yet been published, and upgrading from 3.12 to 3.13 still results in complete data loss).

In 2020, work will focus on the development of Agama, a new, completely rewritten LizardFS core that the developers claim will deliver a threefold performance increase over the 3.12 branch. Agama will move to an event-driven architecture with asynchronous I/O based on asio, running primarily in user space (to reduce dependence on the kernel's caching mechanisms). In addition, a new debugging subsystem and a network activity analyzer with support for automatic performance tuning will be offered.
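As a sketch of what an event-driven, asio-based model looks like in practice (a generic standalone-Asio example, not Agama code; Boost.Asio is identical apart from the boost:: prefix), work is queued on a single io_context and completion handlers run as events finish, without threads blocking on I/O:

```cpp
// Sketch: queuing asynchronous work on one Asio event loop.
#include <asio.hpp>
#include <chrono>
#include <iostream>

int main() {
    asio::io_context io;                                   // the event loop

    asio::steady_timer timer(io, std::chrono::milliseconds(50));
    timer.async_wait([](const std::error_code& ec) {       // completion handler
        if (!ec) std::cout << "timer event completed\n";
    });

    asio::post(io, [] {                                     // an immediately queued task
        std::cout << "posted task executed\n";
    });

    io.run();   // run handlers until no pending work remains
    return 0;
}
```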

Full support for write versioning will be added to the LizardFS client, which should improve failover reliability, resolve problems that arise when different clients access the same data, and enable significant performance gains. The client will also move to its own network subsystem running in user space. The first working LizardFS prototype based on Agama is planned for the second quarter of 2020, and tools for integrating LizardFS with the Kubernetes platform are promised at the same time.

Source: opennet.ru
