Using ClickHouse as a replacement for ELK, BigQuery and TimescaleDB

ClickHouse is an open-source columnar database management system for online analytical processing (OLAP) created by Yandex. It is used by Yandex, Cloudflare, VK.com, Badoo and other services around the world to store really large amounts of data (insertion of thousands of rows per second, or petabytes of data stored on disk).

In a conventional row-oriented DBMS, such as MySQL, PostgreSQL or MS SQL Server, data is stored in this order:

[Figure: row-oriented data layout]

In this case, the values related to one row are physically stored side by side. In a columnar DBMS, values from different columns are stored separately, and the data of one column is stored together:

[Figure: column-oriented data layout]

Examples of columnar DBMSs are Vertica, Paraccel (Actian Matrix, Amazon Redshift), Sybase IQ, Exasol, Infobright, InfiniDB, MonetDB (VectorWise, Actian Vector), LucidDB, SAP HANA, Google Dremel, Google PowerDrill, Druid, kdb+.

At Qwintry, a mail-forwarding company, I started using ClickHouse in 2018 for reporting and was very impressed with its simplicity, scalability, SQL support, and speed. The speed of this DBMS bordered on magic.

Simplicity

ClickHouse installs on Ubuntu with a single command. If you know SQL, you can start using ClickHouse for your needs right away. However, that does not mean you can run SHOW CREATE TABLE in MySQL and copy-paste the resulting SQL into ClickHouse.

Compared to MySQL, this DBMS has important data type differences in its table schema definitions, so you still need some time to adapt the schemas and learn the table engines to get comfortable.
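
As an illustration, here is a minimal sketch of a ClickHouse table definition; the table and columns are hypothetical, but the ClickHouse-specific types and the mandatory engine clause with PARTITION BY and ORDER BY are exactly the parts that have no direct MySQL equivalent:

    -- Hypothetical events table; note the Date/DateTime/UInt32 types and the MergeTree
    -- engine clause instead of MySQL-style AUTO_INCREMENT columns and secondary indexes.
    CREATE TABLE events
    (
        event_date  Date,
        event_time  DateTime,
        user_id     UInt32,
        event_type  String,
        url         String
    )
    ENGINE = MergeTree()
    PARTITION BY toYYYYMM(event_date)
    ORDER BY (event_type, event_time);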

ClickHouse works great without any additional software, but if you want to use replication you will need to install ZooKeeper. Query performance analysis shows excellent results: the system tables contain all the information, and all of it can be retrieved with old and boring SQL.
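
For example, assuming query logging is enabled (the log_queries setting), a plain SQL query against system.query_log is enough to find the slowest recent queries; the column list below is only a sketch:

    -- Top 10 slowest finished queries from the query log.
    -- On older versions the type column is a numeric enum; adjust the filter if needed.
    SELECT
        query_duration_ms,
        read_rows,
        formatReadableSize(read_bytes) AS read_size,
        query
    FROM system.query_log
    WHERE type = 'QueryFinish'
    ORDER BY query_duration_ms DESC
    LIMIT 10;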

Performance

  • A benchmark comparing ClickHouse with Vertica and MySQL on a server with two Intel® Xeon® CPU E5-2650 v2 @ 2.60GHz sockets, 128 GiB of RAM, and md RAID-5 over eight 6 TB SATA HDDs with ext4.
  • A benchmark comparing ClickHouse with the Amazon Redshift cloud data warehouse.
  • Excerpts from a Cloudflare blog post about ClickHouse performance:

The ClickHouse database has a very simple design: all nodes in the cluster have the same functionality and use only ZooKeeper for coordination. We built a small cluster of several nodes and ran tests, during which we found that the system has quite impressive performance, matching the advantages claimed in analytical DBMS benchmarks. We decided to take a closer look at the concept behind ClickHouse. The first obstacle was the lack of tools and the small ClickHouse community, so we dug into the design of this DBMS to understand how it works.

ClickHouse does not support ingesting data directly from Kafka, as it is just a database, so we wrote our own adapter service in Go. It read Cap'n Proto encoded messages from Kafka, converted them to TSV, and inserted them into ClickHouse in batches over the HTTP interface. We later rewrote this service to use a Go library together with our own ClickHouse interface to improve performance. While evaluating ingestion performance, we discovered an important thing: for ClickHouse, this performance strongly depends on the batch size, that is, the number of rows inserted at once. To understand why, we studied how ClickHouse stores data.

The main engine, or rather the family of table engines that ClickHouse uses for storing data, is MergeTree. This engine is conceptually similar to the LSM approach used in Google BigTable or Apache Cassandra, but it avoids building an intermediate in-memory table and writes data directly to disk. This gives it excellent write throughput, since each inserted batch is only sorted by the primary key, compressed, and written to disk to form a segment (a part).
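
A rough way to observe this behavior is to look at system.parts, where each insert into a MergeTree table (here the hypothetical events table from the earlier sketch) shows up as a separate part until background merges combine them:

    -- List the active on-disk parts of the hypothetical events table.
    SELECT partition, name, rows, bytes_on_disk
    FROM system.parts
    WHERE table = 'events' AND active
    ORDER BY modification_time DESC;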

The absence of an in-memory table or any notion of data "freshness" also means that data can only be appended; the system supports neither modifying nor deleting it. As of today, the only way to delete data is to drop it by calendar month, since segments never cross a month boundary (see the sketch after this paragraph). The ClickHouse team is actively working on making this configurable. On the other hand, it makes writing and merging segments contention-free, so insert throughput scales linearly with the number of parallel inserts until the I/O or the cores are saturated.
However, this also means that the system is not suited for small batches, which is why Kafka services and inserters are used for buffering. In the background, ClickHouse continuously merges segments, so many small pieces of data are combined and rewritten several times, which increases write amplification. Too many unmerged parts, however, cause aggressive throttling of inserts until the merges catch up. We have found that the best compromise between real-time ingestion and ingestion performance is to accept a limited number of inserts per second into the table.
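
As a concrete sketch of the month-based deletion mentioned above, dropping a whole partition from a table partitioned by toYYYYMM (the hypothetical events table again) looks like this:

    -- Drop all data for December 2018 in a single cheap operation.
    ALTER TABLE events DROP PARTITION 201812;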

The key to table read performance is the indexing and the layout of the data on disk. No matter how fast the processing is, when the engine has to scan terabytes of data from disk and use only a fraction of it, it takes time. ClickHouse is a column store, so each segment contains a file per column with sorted values for each row. This way, entire columns absent from the query can be skipped first, and then many cells can be processed in parallel with vectorized execution. To avoid a full scan, each segment has a small index file.

Since all columns are sorted by the primary key, the index file only contains the marks (captured rows) of every Nth row, so that it can be kept in memory even for very large tables. For example, with the default setting of "mark every 8192nd row", the sparse index of a table with one billion rows takes only about 122,000 marks (1,000,000,000 / 8192 ≈ 122,070) and easily fits in memory.
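
As a sketch, the granularity is a per-table setting chosen at creation time; the access_log table below is hypothetical and only shows where the setting lives:

    -- Hypothetical table with an explicit sparse-index granularity (8192 is the default).
    CREATE TABLE access_log
    (
        event_time DateTime,
        status     UInt16,
        url        String
    )
    ENGINE = MergeTree()
    PARTITION BY toYYYYMM(event_time)
    ORDER BY event_time
    SETTINGS index_granularity = 8192;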

System development

The development and improvement of ClickHouse can be followed in the GitHub repository, where you can see that the "growing up" is happening at an impressive pace.

Popularity

It seems that ClickHouse's popularity is growing exponentially, especially in the Russian-speaking community. Last year's HighLoad++ 2018 conference (Moscow, November 8-9, 2018) showed that monsters like VK.com and Badoo use ClickHouse to ingest data (for example, logs) from tens of thousands of servers simultaneously. In a 40-minute talk, Yuri Nasretdinov from the VKontakte team explains how it is done. We will soon post a transcript on Habr to make the material easier to work with.

Areas of use

After spending some time on research, I think there are areas where ClickHouse can be useful or can completely replace other, more traditional and popular solutions such as MySQL, PostgreSQL, ELK, Google BigQuery, Amazon Redshift, TimescaleDB, Hadoop, MapReduce, Pinot and Druid. Below are the details of using ClickHouse to augment or completely replace the DBMSs listed above.

Extending MySQL and PostgreSQL

Just recently, we partially replaced MySQL with ClickHouse for the Mautic newsletter platform. The problem was that, due to an ill-conceived design, MySQL logged every email sent and every link in that email with a base64 hash, creating a huge email_stats table. After sending only 10 million emails to the service's subscribers, this table occupied 150 GB of disk space, and MySQL began to choke on simple queries. To fix the disk space issue, we successfully used InnoDB table compression, which reduced it by a factor of 4. However, it still makes no sense to store more than 20-30 million emails in MySQL just for the sake of reading history, since any simple query that for some reason has to do a full scan results in swapping and heavy I/O overhead, about which we regularly received Zabbix alerts.
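
For context, the ClickHouse side of such a migration can be sketched roughly as follows; the column set is an assumption made for illustration, not the actual Mautic schema:

    -- Hypothetical append-only email_stats table, partitioned by month.
    CREATE TABLE email_stats
    (
        sent_date   Date,
        sent_time   DateTime,
        email_id    UInt64,
        contact_id  UInt64,
        url_hash    String
    )
    ENGINE = MergeTree()
    PARTITION BY toYYYYMM(sent_date)
    ORDER BY (contact_id, sent_time);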

ClickHouse uses two compression algorithms that typically reduce the amount of data by 3-4 times, but in this particular case the data was especially "compressible".
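
The actual compression ratio is easy to check from the system tables; a query along these lines sums the compressed and uncompressed sizes of the active parts per table:

    -- Per-table compression ratio from system.parts.
    SELECT
        table,
        formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed,
        formatReadableSize(sum(data_compressed_bytes))   AS compressed,
        round(sum(data_uncompressed_bytes) / sum(data_compressed_bytes), 2) AS ratio
    FROM system.parts
    WHERE active
    GROUP BY table;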

ELK Replacement

In my own experience, the ELK stack (ElasticSearch, Logstash and Kibana; in this particular case ElasticSearch) requires far more resources to run than is needed to store logs. ElasticSearch is a great engine if you want good full-text log search (and I don't think you really need it), but I wonder why it has become the de facto standard logging engine. Its ingestion performance, combined with Logstash, gave us problems even at fairly light workloads and required adding more and more RAM and disk space. As a database, ClickHouse is better than ElasticSearch for the following reasons:

  • SQL dialect support;
  • better compression of stored data;
  • regex search instead of full-text search (see the sketch after this list);
  • improved query scheduling and better overall performance.
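
As a sketch of what the regex search mentioned above looks like in practice, here is a hypothetical query against an assumed logs table with event_date, event_time, host and message columns; match() applies an RE2 regular expression:

    -- Hypothetical log-search query; the table and columns are assumptions, not a real schema.
    SELECT event_time, host, message
    FROM logs
    WHERE event_date = today()
      AND match(message, 'timeout|connection refused')
    ORDER BY event_time DESC
    LIMIT 100;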

Currently, the biggest problem when comparing ClickHouse with ELK is the lack of log-shipping solutions, as well as the lack of documentation and tutorials on the topic. Meanwhile, any user can set up ELK following the DigitalOcean manual, which matters a lot for the quick adoption of such technologies. The database engine is there, but there is no Filebeat for ClickHouse yet. Yes, there is fluentd and the LogHouse log management system, and there is clicktail, a tool for loading log file data into ClickHouse, but all of this takes more time. ClickHouse still leads in simplicity, though, so even beginners can easily install it and start using it productively in about 10 minutes.

Preferring minimalist solutions, I tried using FluentBit, a log shipper with a very small memory footprint, together with ClickHouse, while trying to avoid Kafka. However, minor incompatibilities, such as date format issues, need to be resolved before this can be done without a proxy layer that converts the data from FluentBit for ClickHouse.

As an alternative to Kibana, you can use Grafana with ClickHouse as a backend. As far as I understand, this can cause performance issues when rendering a huge number of data points, especially with older versions of Grafana. At Qwintry we have not tried this yet, but complaints about it appear from time to time in the ClickHouse support channel on Telegram.

Replacing Google BigQuery and Amazon Redshift (a solution for large companies)

The ideal use case for BigQuery is loading 1 TB of JSON data and running analytical queries on it. BigQuery is a great product whose scalability is hard to overestimate. It is much more complex software than ClickHouse running on an in-house cluster, but from the client's point of view it has a lot in common with ClickHouse. BigQuery can quickly get expensive once you start paying for each SELECT; it is a true SaaS solution with all its pros and cons.

ClickHouse is the best choice when you run a lot of computationally expensive queries. The more SELECT queries you run every day, the more sense it makes to replace BigQuery with ClickHouse, because such a replacement can save you thousands of dollars when many terabytes of data are processed. This does not apply to stored data, which is quite cheap to keep in BigQuery.

The benefits of such a migration are described in "Moving to ClickHouse", an article by Alexander Zaitsev, co-founder of Altinity.

TimescaleDB Replacement

TimescaleDB is a PostgreSQL extension that optimizes working with timeseries in a regular database (https://docs.timescale.com/v1.0/introduction, https://habr.com/ru/company/zabbix/blog/458530/).

Although ClickHouse does not seriously compete in the time-series niche as such, its columnar structure and vectorized query execution make it much faster than TimescaleDB in most analytical query cases. At the same time, its batch ingestion performance is about 3 times higher, and it uses 20 times less disk space, which really matters when processing large volumes of historical data:
https://www.altinity.com/blog/ClickHouse-for-time-series.
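
As an illustration of the kind of query where the columnar layout and vectorized execution pay off, here is a sketch of a per-minute aggregation over a hypothetical metrics table (the metric, ts and value columns are assumptions):

    -- Average CPU load per minute over the last day.
    SELECT
        toStartOfMinute(ts) AS minute,
        avg(value) AS avg_value
    FROM metrics
    WHERE metric = 'cpu_load' AND ts >= now() - INTERVAL 1 DAY
    GROUP BY minute
    ORDER BY minute;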

Unlike ClickHouse, the only way to save some disk space in TimescaleDB is to use ZFS or similar file systems.

Upcoming updates to ClickHouse will likely introduce delta compression, which will make it even more suitable for processing and storing time series data. TimescaleDB may be a better choice than bare ClickHouse in the following cases:

  • small installations with very little RAM (<3 GB);
  • a large number of small INSERTs that you do not want to buffer into large fragments;
  • stricter consistency, uniformity and ACID requirements;
  • PostGIS support;
  • integration with existing PostgreSQL tables, since TimescaleDB is essentially PostgreSQL.

Competition with Hadoop and MapReduce systems

Hadoop and other MapReduce products can perform a lot of complex calculations, but they tend to run with huge latency. ClickHouse fixes this problem by processing terabytes of data and producing results almost instantly. Thus, ClickHouse is much more efficient for fast, interactive analytical research, which should be of interest to data scientists.

Competition with Pinot and Druid

ClickHouse's closest competitors are the columnar, linearly scalable open-source products Pinot and Druid. An excellent job of comparing these systems was done in an article by Roman Leventov published on February 1, 2018.

That article needs updating: it states that ClickHouse does not support UPDATE and DELETE operations, which is no longer entirely true for recent versions.
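
For reference, recent versions express UPDATE and DELETE as asynchronous "mutations" via ALTER TABLE; the table and columns below are hypothetical:

    -- Mutations rewrite the affected parts in the background rather than changing rows in place.
    ALTER TABLE events DELETE WHERE event_date < '2018-01-01';
    -- Columns that belong to the primary or partition key cannot be updated.
    ALTER TABLE events UPDATE user_id = 0 WHERE url = 'https://example.com/unsubscribe';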

We don't have much experience with these DBMSs, but I don't like the complexity of the underlying infrastructure that is required to run Druid and Pinot - it's a whole bunch of "moving parts" surrounded by Java from all sides.

Druid and Pinot are Apache Incubator projects, whose progress is covered in detail by Apache on their GitHub project pages. Pinot entered the incubator in October 2018, and Druid was born 8 months earlier, in February.

The lack of information about how the ASF works raises some, perhaps silly, questions for me. I wonder whether the authors of Pinot noticed that the Apache Foundation is more favorably disposed towards Druid, and whether such an attitude towards a competitor caused a feeling of envy? Will the development of Druid slow down and that of Pinot speed up if the sponsors backing the former suddenly become interested in the latter?

Disadvantages of ClickHouse

Immaturity: this is clearly not yet a "boring", settled technology, although nothing of the kind can be observed in other columnar DBMSs anyway.

Small inserts perform poorly at high rates: inserts must be batched into large chunks because the performance of small inserts degrades in proportion to the number of columns in each row. This is simply how ClickHouse stores data on disk: each column means one file or more, so inserting one row with 100 columns requires opening and writing at least 100 files. This is why buffering inserts requires an intermediary (unless the client itself provides buffering), usually Kafka or some kind of queueing system. You can also use the Buffer table engine to later copy large chunks of data into MergeTree tables.
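
Here is a minimal sketch of the Buffer approach mentioned above, assuming an underlying MergeTree table named events in the default database; the numeric thresholds follow the shape of the documented engine parameters and are purely illustrative:

    -- Writes go to events_buffer; ClickHouse flushes them to default.events in larger chunks
    -- once the time/row/byte thresholds are reached.
    CREATE TABLE events_buffer AS events
    ENGINE = Buffer(default, events, 16, 10, 100, 10000, 1000000, 10000000, 100000000);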

Table joins are limited by the server's RAM, but at least they exist! For example, Druid and Pinot have no such joins at all, since they are hard to implement directly in distributed systems that do not support moving large chunks of data between nodes.
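
For illustration, a typical join looks like the sketch below; the users table and its columns are assumptions. The right-hand table is the one materialized in RAM, so it should be the smaller of the two:

    -- Enrich events with a small users dimension table held in memory.
    SELECT country, count() AS events_count
    FROM events
    ANY LEFT JOIN users USING (user_id)
    GROUP BY country
    ORDER BY events_count DESC;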

Conclusions

In the coming years, we plan to make extensive use of ClickHouse in Qwintry, as this DBMS provides an excellent balance of performance, low overhead, scalability, and simplicity. I'm pretty sure it will spread quickly once the ClickHouse community comes up with more ways to use it in small and medium installations.

Source: habr.com
