Release of the document-oriented DBMS Apache CouchDB 3.0

Apache CouchDB 3.0, a distributed document-oriented database belonging to the class of NoSQL systems, has been released. The project's source code is distributed under the Apache 2.0 license.

Improvements implemented in Apache CouchDB 3.0:

  • Hardened security in the default configuration.
    An admin user must now be defined at startup; otherwise the server exits with an error (this addresses the problem of servers being started in a state that inadvertently allows access without authentication). Calling "/_all_dbs" now requires administrator rights, and all databases are by default created accessible only to the admin user (access parameters can be changed through the "_security" object). By default, editing objects in the _users database is prohibited;

  • Added the ability to create user-defined segmented (partitioned) databases, which make it possible to define your own rules for distributing documents across segments (shard ranges). Special optimizations for partitioned databases have been added to views and Mango indexes;
  • Implemented an automatic splitting mode for segments (shard splitting). Data in a database can now be redistributed across segments to account for an increase in the q-factor, which determines the degree of partitioning;
  • Added the ken subsystem for automatic background indexing, which keeps secondary indexes (JavaScript, Mango, text search indexes) up to date without explicitly launching index builds;
  • The smoosh process used for automatic database compaction has been completely rewritten;
  • A new IO Queue subsystem has been introduced, used to change the I/O priority of certain operations;
  • Implemented a regression testing system;
  • Added official support for arm64v8 (aarch64) and ppc64le (ppc64el) platforms;
  • Added support for linking to the SpiderMonkey 60 JavaScript engine (ESR branch of Firefox 60) with improved support for ES5, ES6 and ES2016+;
  • The Lucene-based Dreyfus search engine is now included, which greatly simplifies the deployment of a search engine based on CouchDB;
  • Added backend for logging using systemd-journald;
  • Added the "[couchdb] single_node" setting; when it is set, CouchDB automatically creates the system databases if they are missing;
  • Optimized the performance of the couch_server process;
  • Significantly improved the installer for the Windows platform;
  • Views are limited to 2^28 (268435456) results. This limit can be separately configured for regular and segmented views using the query_limit and partition_query_limit options in the "[query_server_config]" section;
  • Removed the separate HTTP interface for local node management that ran on network port 5986; its functionality is now available through the common cluster management interface;
  • The maximum document size has been reduced to 8 MB, which may cause data replication issues from older servers after upgrading to CouchDB 3.0. You can use the "[couchdb] max_document_size" setting to increase the limit;
  • Many deprecated features have been removed, such as the _replicator and _external calls, the disk_size and data_size fields, and the delayed_commits option;
  • CouchDB now requires Erlang/OTP 20.3.8.11+, 21.2.3+ or 22.0.5 to run. In theory it still works with the Erlang/OTP 19 branch, but that branch is not covered by tests.
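
The configuration options named in the list above could be collected into a local override file. What follows is a minimal Python sketch, not an official tool. It assumes the INI layout of CouchDB's local.ini, an "[admins]" section for the admin account, and that max_document_size is given in bytes. The helper name build_local_ini, the admin name, the password and the raised size limit are all made up for illustration.

```python
import configparser
import io


def build_local_ini(admin_name, admin_password, max_document_size=64 * 1024 * 1024):
    """Return the text of an INI fragment with the settings named in the article.

    Hypothetical helper: the default size limit is an illustrative value,
    not a recommendation.
    """
    # interpolation=None keeps '%' characters in passwords from being
    # treated as configparser interpolation markers.
    # Note: configparser lowercases option names, including the admin name.
    config = configparser.ConfigParser(interpolation=None)

    # CouchDB 3.0 refuses to start unless an admin user is defined.
    config["admins"] = {admin_name: admin_password}

    config["couchdb"] = {
        # Create missing system databases automatically on a single node.
        "single_node": "true",
        # Raise the document size limit above the new 8 MB default.
        "max_document_size": str(max_document_size),
    }

    config["query_server_config"] = {
        # View result limits; 2^28 is the limit quoted in the article.
        "query_limit": str(2 ** 28),
        "partition_query_limit": str(2 ** 28),
    }

    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_local_ini("admin", "change-me"))
```

The sketch only produces text; where the file goes and whether the server picks it up depend on the installation.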

Recall that CouchDB stores data in an ordered list format and allows partial replication of data between several databases in "master-master" mode with simultaneous detection and resolution of conflicts. Each server keeps its own local data set, synchronized with other servers, and can be taken offline and replicate its changes periodically. In particular, this feature makes CouchDB an attractive solution for synchronizing program settings between different computers. CouchDB-based solutions have been deployed by companies such as the BBC, Apple and CERN.

CouchDB queries and data indexing can follow the MapReduce paradigm, using JavaScript to define the data selection logic. The core of the system is written in Erlang, a language optimized for building distributed systems that serve many parallel requests. The view server is written in C and is based on a JavaScript engine from the Mozilla project. The database is accessed over HTTP through a RESTful JSON API, which makes the data reachable from, among other things, web applications running in the browser.
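
To illustrate the HTTP access described above, here is a small Python sketch that builds, but does not send, a request storing a JSON document. It assumes a node listening at http://localhost:5984 and HTTP Basic authentication. The database name, document ID, credentials and the helper name build_put_document are hypothetical.

```python
import base64
import json
import urllib.parse
import urllib.request

# Assumption: a local CouchDB node on its default port.
BASE_URL = "http://localhost:5984"


def build_put_document(db, doc_id, doc, user, password):
    """Build (but do not send) a PUT request that stores `doc` as JSON."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    # Percent-encode the path segments so unusual IDs stay valid in a URL.
    path = f"/{urllib.parse.quote(db, safe='')}/{urllib.parse.quote(doc_id, safe='')}"
    return urllib.request.Request(
        url=BASE_URL + path,
        data=json.dumps(doc).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="PUT",
    )


if __name__ == "__main__":
    request = build_put_document(
        "settings", "editor", {"theme": "dark"}, "admin", "change-me"
    )
    print(request.get_method(), request.full_url)
    # Sending requires a running server, so it is left commented out:
    # with urllib.request.urlopen(request) as response:
    #     print(json.load(response))
```

Keeping request construction separate from sending makes the sketch usable without a live server.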

The unit of data storage is a document that has a unique identifier and a version, and contains an arbitrary set of named fields in key/value format. To organize a pseudo-structured data set from arbitrary documents (aggregation and selection), the concept of views is used; views are defined in JavaScript. JavaScript can also be used to define functions that validate data when new documents are added within a particular view.
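
The idea of a view can be emulated in a few lines. Real CouchDB views are JavaScript functions, so the Python below is only a conceptual sketch of the map and reduce steps over in-memory documents. The "_id" and "_rev" fields stand for the identifier and version; the "type" and "author" fields and all function names are hypothetical.

```python
from itertools import groupby

# Sample documents: each has an identifier, a version and arbitrary fields.
DOCS = [
    {"_id": "a", "_rev": "1-x", "type": "post", "author": "alice"},
    {"_id": "b", "_rev": "1-y", "type": "post", "author": "bob"},
    {"_id": "c", "_rev": "2-z", "type": "post", "author": "alice"},
    {"_id": "d", "_rev": "1-w", "type": "comment", "author": "alice"},
]


def map_posts_by_author(doc):
    """Map step: emit one (key, value) row per matching document."""
    if doc.get("type") == "post":
        yield doc["author"], 1


def build_view(docs, map_fn):
    """Apply the map function to every document and sort the rows by key."""
    rows = [row for doc in docs for row in map_fn(doc)]
    return sorted(rows, key=lambda row: row[0])


def reduce_by_key(rows):
    """Reduce step: sum the values of rows that share a key.

    Expects rows already sorted by key, as build_view returns them.
    """
    return {
        key: sum(value for _, value in group)
        for key, group in groupby(rows, key=lambda row: row[0])
    }


if __name__ == "__main__":
    rows = build_view(DOCS, map_posts_by_author)
    print(rows)
    print(reduce_by_key(rows))
```

Documents that do not match the map function, such as the comment above, simply produce no rows, which is how a view selects a structured subset from arbitrary documents.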

Source: opennet.ru
