Release of the Distributed DBMS TiDB 4.0

TiDB 4.0, a distributed DBMS developed under the influence of Google's Spanner and F1 technologies, has been released. TiDB belongs to the category of HTAP (Hybrid Transactional/Analytical Processing) systems, capable of both handling real-time transactions (OLTP) and processing analytical queries. The project is written in Go and distributed under the Apache 2.0 license.

TiDB Features:

  • Support for SQL and a client interface compatible with the MySQL protocol, which makes it easy to adapt existing MySQL applications to TiDB and allows common client libraries to be used. In addition to the MySQL protocol, a JSON-based API and a connector for Spark can be used to access the DBMS.
  • Supported SQL features include indexes, aggregate functions, GROUP BY, ORDER BY, DISTINCT expressions, joins (LEFT JOIN / RIGHT JOIN / CROSS JOIN), views, window functions and subqueries. These capabilities are sufficient to run web applications such as phpMyAdmin, Gogs and WordPress on TiDB;
  • Scale-out and resiliency: Storage and processing power can be scaled up simply by adding new nodes. Data is distributed across nodes with redundancy to allow operations to continue if individual nodes fail. Failures are handled automatically.
  • The system guarantees consistency and appears to client software as one large DBMS, even though data from many nodes is actually involved in completing a transaction.
  • Different backends can be used for physical data storage on nodes, for example the local storage engines GoLevelDB and BoltDB, or the native distributed storage engines TiKV and TiFlash. TiKV stores data row by row as key/value pairs and is better suited to transaction processing (OLTP). TiFlash stores data by column and achieves higher performance on analytical workloads (OLAP).
  • Asynchronous schema changes, which allow columns and indexes to be added on the fly without stopping the processing of ongoing operations.
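
The difference between the row-based layout attributed to TiKV and the column-based layout attributed to TiFlash can be illustrated with a toy sketch. This is not TiDB code: the sample table and the names row_store, column_store, point_lookup and total_amount are hypothetical and only show why one layout favors single-row access and the other favors aggregates.

```python
# Toy illustration only: not TiKV or TiFlash code. All names are hypothetical.

rows = [
    {"id": 1, "user": "ann", "amount": 30},
    {"id": 2, "user": "bob", "amount": 45},
    {"id": 3, "user": "ann", "amount": 25},
]

# Row-oriented layout: one key/value pair per row, keyed by primary key.
# Reading or updating a single row touches exactly one entry (OLTP-friendly).
row_store = {row["id"]: row for row in rows}

# Column-oriented layout: one list per column.
# An aggregate over one column scans only that column (OLAP-friendly).
column_store = {name: [row[name] for row in rows] for name in rows[0]}


def point_lookup(pk):
    """Fetch a whole row by primary key from the row store."""
    return row_store[pk]


def total_amount():
    """Sum a single column without reading the other columns."""
    return sum(column_store["amount"])
```

In this sketch a transactional read such as point_lookup(2) needs one key access, while an analytical query such as total_amount() reads only the "amount" column and never touches "user".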

In the new release:

  • By default, the distributed garbage collector Green GC is enabled, which can significantly increase the speed of garbage collection in large clusters and improve stability;
  • Added support for large transactions, whose size is limited practically only by the amount of physical memory. The single-transaction size limit has been raised from 100 MB to 10 GB;
  • Added support for the BACKUP and RESTORE commands for creating and restoring backups;
  • Added the ability to set a lock on tables;
  • Added a MySQL-compatible READ COMMITTED transaction isolation level;
  • Support for LIKE and WHERE expressions has been added to the "ADMIN SHOW DDL JOBS" command;
  • Added the oom-use-tmp-storage parameter, which allows temporary files to be used for caching intermediate results when RAM is insufficient;
  • Added the Random keyword to assign random values to attributes;
  • The LOAD DATA command now has the ability to use hexadecimal and binary expressions;
  • Added 15 parameters to control optimizer behavior;
  • Added tools for diagnosing the performance of SQL queries. A slow query log is now available through the SLOW_QUERY / CLUSTER_SLOW_QUERY system tables;
  • Added support for functions for working with sequences;
  • Added the ability to dynamically change the configuration parameters read from the PD (Placement Driver, cluster management server). Added the ability to use the "SET CONFIG" statement to change the settings of PD/TiKV nodes.
  • Added the max-server-connections setting to limit the maximum number of simultaneous connections to the server (4096 by default);
  • Improved performance in situations where the requested columns are completely covered by indexes;
  • Added query optimization based on merging indexes;
  • Improved performance of operations with ranges of values;
  • Reduced CPU load by caching the results of accessing indexes and filtering out duplicates;
  • Added support for a new row storage format that improves the performance of tables with a large number of columns;
  • The GROUP_CONCAT function now supports the "ORDER BY" expression;
  • Added the ability to extract data from the TiFlash log via SQL;
  • The "RECOVER TABLE" command implements support for recovering truncated tables;
  • Added the DDLJobs system table for querying details about running DDL jobs;
  • Added the ability to use the SHOW CONFIG command to show PD and TiKV settings;
  • The coprocessor cache is now enabled by default;
  • The number of goroutines in the commit retry phase can now be controlled using the committer-concurrency setting;
  • Added the ability to display the regions of the table partition;
  • Added the ability to limit the size of temporary storage used by tidb-server;
  • Added support for "insert into tbl_name partition(partition_name_list)" and "replace into tbl_name partition(partition_name_list)" operations;
  • For hash partitioning, support has been added for filtering partitions on the basis of "is null";
  • For partitioned tables, support for checking, cleaning, and restoring indexes has been added.
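
The "is null" filtering for hash partitioning mentioned above can be sketched as follows. This is a toy model, not TiDB's planner: it assumes MySQL-compatible behavior, where a NULL partitioning value is treated as 0, and non-negative integer keys. The names NUM_PARTITIONS, partition_for and partitions_to_scan are hypothetical.

```python
# Toy model of partition filtering for a hash-partitioned table.
# Assumption: MySQL-style hash partitioning, where NULL is treated as 0.
# Assumption: keys are non-negative integers. All names are hypothetical.

NUM_PARTITIONS = 4


def partition_for(value):
    """Return the partition index that a key value is stored in."""
    key = 0 if value is None else value
    return key % NUM_PARTITIONS


def partitions_to_scan(predicate, value=None):
    """Return the partitions a query must read for a simple predicate."""
    if predicate == "is_null":
        # All NULL keys land in one partition, so the rest can be skipped.
        return [partition_for(None)]
    if predicate == "eq":
        # An equality match can only live in the partition the value hashes to.
        return [partition_for(value)]
    # Any other predicate falls back to scanning every partition.
    return list(range(NUM_PARTITIONS))
```

Under these assumptions a query with "WHERE col IS NULL" reads a single partition instead of all four, which is the kind of saving such filtering is meant to provide.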

Source: opennet.ru
