Release of the graph-oriented DBMS Nebula Graph 3.2

Nebula Graph 3.2, an open source DBMS, has been released. It is designed to efficiently store large sets of interconnected data forming a graph that may contain billions of vertices and trillions of edges. The project is written in C++ and distributed under the Apache 2.0 license. Client libraries for accessing the DBMS are available for Go, Python, and Java.

The DBMS uses a distributed shared-nothing architecture, in which independent, self-sufficient graphd processes handle queries and storaged processes handle storage. A meta-service orchestrates data movement and provides meta-information about the graph. Data consistency is ensured by a protocol based on the Raft algorithm.
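
The majority rule that Raft-style replication relies on can be sketched in a few lines of Python. This is not Nebula Graph code: the function names are hypothetical, and the replica counts used with them are illustrative assumptions rather than Nebula Graph defaults.

```python
def quorum_size(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    if replicas < 1:
        raise ValueError("at least one replica is required")
    return replicas // 2 + 1


def is_committed(acks: int, replicas: int) -> bool:
    """A log entry counts as committed once a majority has acknowledged it."""
    return acks >= quorum_size(replicas)


def tolerated_failures(replicas: int) -> int:
    """How many replicas may fail while a majority is still reachable."""
    return replicas - quorum_size(replicas)
```

With three replicas, for example, two acknowledgements are enough to commit a write, and one replica may be lost without blocking further writes.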

Main features of Nebula Graph:

  • Enforce security by granting access only to authenticated users whose permissions are set through a role-based access control (RBAC) system.
  • Ability to plug in different types of storage engines. Support for extending the query language with new algorithms.
  • Minimal latency for reads and writes while maintaining high throughput. In a test on a cluster of one graphd node and three storaged nodes, with a 632 GB database holding a graph of 1.2 billion vertices and 8.4 billion edges, latency was on the order of several milliseconds and throughput reached 140 thousand requests per second.
  • Linear scalability.
  • SQL-like query language that is powerful and easy to understand. Supported operations include GO (bidirectional traversal of graph vertices), GROUP BY, ORDER BY, LIMIT, UNION, UNION DISTINCT, INTERSECT, MINUS, and PIPE (passing the result of a previous query to the next one). Indexes and user-defined variables are supported.
  • Ensuring high availability and fault tolerance.
  • Support for creating snapshots of the database state to simplify backups.
  • Ready for industrial use (already used in the infrastructure of JD, Meituan and Xiaohongshu).
  • Ability to change the storage schema and update data without stopping or affecting ongoing operations.
  • TTL support to limit data lifetime.
  • Commands to manage settings and storage hosts.
  • Tools for managing and scheduling jobs (COMPACT and FLUSH jobs are currently supported).
  • Operations for finding the full path and the shortest path between given vertices.
  • OLAP interface for integration with third-party analytics platforms.
  • Utilities for importing data from CSV files or from Spark.
  • Export metrics for monitoring with Prometheus and Grafana.
  • Nebula Graph Studio web interface for visualizing graph operations, navigating the graph, designing the storage schema, and loading data.
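
To make the shortest-path operation from the feature list concrete, here is a minimal Python sketch using breadth-first search, the textbook technique for unweighted graphs. How Nebula Graph implements the operation internally is not described here; the `shortest_path` function and the edge-list representation are assumptions made purely for illustration.

```python
from collections import deque


def shortest_path(edges, src, dst):
    """Return one shortest path from src to dst as a list of vertices, or None."""
    # Build an adjacency list from directed (source, destination) pairs.
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, []).append(b)

    # parents doubles as the visited set and as the record used to rebuild the path.
    parents = {src: None}
    queue = deque([src])
    while queue:
        vertex = queue.popleft()
        if vertex == dst:
            # Walk back through the parents to reconstruct the path.
            path = []
            while vertex is not None:
                path.append(vertex)
                vertex = parents[vertex]
            return path[::-1]
        for neighbour in adjacency.get(vertex, []):
            if neighbour not in parents:
                parents[neighbour] = vertex
                queue.append(neighbour)
    return None
```

Because breadth-first search visits vertices in order of distance from the source, the first time the destination is dequeued the reconstructed path has the minimum number of edges.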

In the new release:

  • Added support for the extract() function to extract a substring that matches a given expression.
  • Optimized settings in the configuration file.
  • Added optimization rules to remove the useless AppendVertices operator and disable edge and vertex filters.
  • The amount of data copied for the JOIN operation, as well as for the Traverse and AppendVertices operators, has been reduced.
  • Optimized SHORTEST PATH and SUBGRAPH performance.
  • Improved memory allocation (using Arena Allocator).
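
The idea behind a function such as extract() can be illustrated with a rough Python analogue. The announcement says only that the function extracts a substring matching a given expression; the sketch below assumes the expression is a regular expression and that all matches are returned as a list, which may differ from the actual behavior in Nebula Graph.

```python
import re


def extract(text: str, pattern: str) -> list:
    """Return every non-overlapping substring of text that matches pattern."""
    # Assumption: pattern is a regular expression in Python's re syntax.
    return [match.group(0) for match in re.finditer(pattern, text)]
```

For instance, calling this sketch with the text "order-42 and order-7" and the pattern `\d+` yields the two digit runs found in the string.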

Source: opennet.ru
