1.1 Billion Taxi Rides: 108-core ClickHouse Cluster

This translation of the article was prepared specifically for students of the "Data Engineer" course.

ClickHouse is an open source columnar database. It is a great environment where hundreds of analysts can quickly query detailed data, even as tens of billions of new records arrive per day. The infrastructure cost of supporting such a system can come to as much as $100K per year, and potentially half that depending on usage. At one point, the ClickHouse installation behind Yandex Metrica held 10 trillion records. In addition to Yandex, ClickHouse has also found success at Bloomberg and Cloudflare.

Two years ago I ran a comparative benchmark of databases using a single machine, and ClickHouse turned out to be the fastest free database software I had ever seen. Since then, its developers have not stopped adding features, including support for Kafka, HDFS and ZStandard compression. Last year they added support for cascading compression methods, and delta-of-delta encoding became possible. When compressing time series data, gauge values can be compressed well with delta encoding, but counters are better served by delta-of-delta encoding. Good compression has been key to ClickHouse's performance.
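To make the difference between the two encodings concrete, here is a minimal sketch in shell. It is not ClickHouse's codec implementation; the `delta` helper and the sample counter values are made up for illustration. A steadily growing counter leaves constant deltas, and a second pass (delta-of-delta) turns those into mostly zeros, which a general-purpose compressor handles very well.

```shell
# delta: print the first value unchanged, then each value minus its predecessor.
delta() {
    printf '%s\n' "$@" |
        awk 'NR == 1 { print; prev = $1; next }
             { print $1 - prev; prev = $1 }' |
        paste -sd' ' -
}

counter="1000 1010 1020 1030 1040"   # hypothetical counter samples
once=$(delta $counter)               # first pass: plain delta encoding
twice=$(delta $once)                 # second pass: delta of the deltas

echo "delta:          $once"
echo "delta-of-delta: $twice"
```

The first pass still carries the step size in every position, while the second pass only carries it once.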

ClickHouse consists of 170 thousand lines of C++ code, excluding third-party libraries, and is one of the smaller distributed database codebases. By comparison, SQLite does not support distribution and consists of 235 thousand lines of C code. As of this writing, 207 engineers have contributed to ClickHouse, and the rate of commits has been increasing lately.

In March 2017, ClickHouse began keeping a change log as an easy way to keep track of development. They also broke up the monolithic documentation file into a Markdown-based file hierarchy. Issues and features are tracked via GitHub, and overall the software has become much more approachable in recent years.

In this article, I'll take a look at the performance of a ClickHouse cluster on AWS EC2 using 36-core processors and NVMe storage.

UPDATE: A week after originally publishing this post, I re-ran the benchmark with an improved configuration and achieved much better results. This post has been updated to reflect these changes.

Launching an AWS EC2 Cluster

I'll use three c5d.9xlarge EC2 instances for this post. Each of them has 36 virtual CPUs, 72 GB of RAM, 900 GB of NVMe SSD storage and supports 10 Gigabit networking. They cost $1.962/hour each in the eu-west-1 region when run on demand. I'll use Ubuntu Server 16.04 LTS as the operating system.
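For a sense of what the whole cluster costs, the per-instance price above can simply be multiplied out. The sketch below assumes all three instances run for the same number of hours and ignores any charges other than the on-demand instance price; the `cluster_cost` helper is only for illustration.

```shell
rate_per_instance=1.962   # USD per hour, the on-demand price quoted above
instances=3

# cluster_cost HOURS -> total on-demand cost in USD for all instances
cluster_cost() {
    awk -v r="$rate_per_instance" -v n="$instances" -v h="$1" \
        'BEGIN { printf "%.3f\n", r * n * h }'
}

echo "1 hour:   \$$(cluster_cost 1)"
echo "24 hours: \$$(cluster_cost 24)"
```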

The firewall is configured so that each machine can communicate with every other without restrictions, and only my IPv4 address is whitelisted for SSH access to the cluster.

Getting the NVMe Drive Up and Running

For ClickHouse to work, I'll create an EXT4-formatted file system on the NVMe drive on each of the servers.

$ sudo mkfs -t ext4 /dev/nvme1n1
$ sudo mkdir /ch
$ sudo mount /dev/nvme1n1 /ch

Once everything is configured, you can see the mount point and the 783 GB of space available on each system.

$ lsblk

NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
loop0         7:0    0  87.9M  1 loop /snap/core/5742
loop1         7:1    0  16.5M  1 loop /snap/amazon-ssm-agent/784
nvme0n1     259:1    0     8G  0 disk
└─nvme0n1p1 259:2    0     8G  0 part /
nvme1n1     259:0    0 838.2G  0 disk /ch

$ df -h

Filesystem      Size  Used Avail Use% Mounted on
udev             35G     0   35G   0% /dev
tmpfs           6.9G  8.8M  6.9G   1% /run
/dev/nvme0n1p1  7.7G  967M  6.8G  13% /
tmpfs            35G     0   35G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            35G     0   35G   0% /sys/fs/cgroup
/dev/loop0       88M   88M     0 100% /snap/core/5742
/dev/loop1       17M   17M     0 100% /snap/amazon-ssm-agent/784
tmpfs           6.9G     0  6.9G   0% /run/user/1000
/dev/nvme1n1    825G   73M  783G   1% /ch

The dataset I'll use in this benchmark is a data dump I generated from 1.1 billion taxi rides taken in New York City over six years. The blog post "A Billion Taxi Rides in Redshift" details how I put this dataset together. The files are stored on AWS S3, so I'll configure the AWS CLI with my access and secret keys.

$ sudo apt update
$ sudo apt install awscli
$ aws configure

I'll set the client's concurrent request limit to 100 so that files download faster than they would with the default settings.

$ aws configure set \
    default.s3.max_concurrent_requests \
    100

I'll download the taxi rides dataset from AWS S3 and store it on the NVMe drive on the first server. The dataset is ~104 GB in GZIP-compressed CSV format.

$ sudo mkdir -p /ch/csv
$ sudo chown -R ubuntu /ch/csv
$ aws s3 sync s3://<bucket>/csv /ch/csv

Installing ClickHouse

I'll install the OpenJDK distribution of Java 8 on all three machines, as it is required to run Apache ZooKeeper, which in turn is required for a distributed ClickHouse installation.

$ sudo apt update
$ sudo apt install \
    openjdk-8-jre \
    openjdk-8-jdk-headless

I'll then set the JAVA_HOME environment variable.

$ sudo vi /etc/profile
 
export JAVA_HOME=/usr
 
$ source /etc/profile

I'll then use Ubuntu's package management system to install ClickHouse 18.16.1, glances and ZooKeeper on all three machines.

$ sudo apt-key adv \
    --keyserver hkp://keyserver.ubuntu.com:80 \
    --recv E0C56BD4
$ echo "deb http://repo.yandex.ru/clickhouse/deb/stable/ main/" | \
    sudo tee /etc/apt/sources.list.d/clickhouse.list
$ sudo apt-get update

$ sudo apt install \
    clickhouse-client \
    clickhouse-server \
    glances \
    zookeeperd

I'll create a directory for ClickHouse and also apply a few configuration overrides on all three servers.

$ sudo mkdir /ch/clickhouse
$ sudo chown -R clickhouse /ch/clickhouse

$ sudo mkdir -p /etc/clickhouse-server/conf.d
$ sudo vi /etc/clickhouse-server/conf.d/taxis.conf

These are the configuration overrides I'll be using.

<?xml version="1.0"?>
<yandex>
    <listen_host>0.0.0.0</listen_host>
    <path>/ch/clickhouse/</path>

    <remote_servers>
        <perftest_3shards>
            <shard>
                <replica>
                    <host>172.30.2.192</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <replica>
                    <host>172.30.2.162</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <replica>
                    <host>172.30.2.36</host>
                    <port>9000</port>
                </replica>
            </shard>
        </perftest_3shards>
    </remote_servers>

    <zookeeper-servers>
        <node>
            <host>172.30.2.192</host>
            <port>2181</port>
        </node>
        <node>
            <host>172.30.2.162</host>
            <port>2181</port>
        </node>
        <node>
            <host>172.30.2.36</host>
            <port>2181</port>
        </node>
    </zookeeper-servers>

    <macros>
        <shard>03</shard>
        <replica>01</replica>
    </macros>
</yandex>

I'll launch ZooKeeper and the ClickHouse server on all three machines.

$ sudo /etc/init.d/zookeeper start
$ sudo service clickhouse-server start

Loading Data into ClickHouse

On the first server I'll create a table (trips) that will hold the taxi trips dataset using the Log engine.

$ clickhouse-client --host=0.0.0.0
 
CREATE TABLE trips (
    trip_id                 UInt32,
    vendor_id               String,

    pickup_datetime         DateTime,
    dropoff_datetime        Nullable(DateTime),

    store_and_fwd_flag      Nullable(FixedString(1)),
    rate_code_id            Nullable(UInt8),
    pickup_longitude        Nullable(Float64),
    pickup_latitude         Nullable(Float64),
    dropoff_longitude       Nullable(Float64),
    dropoff_latitude        Nullable(Float64),
    passenger_count         Nullable(UInt8),
    trip_distance           Nullable(Float64),
    fare_amount             Nullable(Float32),
    extra                   Nullable(Float32),
    mta_tax                 Nullable(Float32),
    tip_amount              Nullable(Float32),
    tolls_amount            Nullable(Float32),
    ehail_fee               Nullable(Float32),
    improvement_surcharge   Nullable(Float32),
    total_amount            Nullable(Float32),
    payment_type            Nullable(String),
    trip_type               Nullable(UInt8),
    pickup                  Nullable(String),
    dropoff                 Nullable(String),

    cab_type                Nullable(String),

    precipitation           Nullable(Int8),
    snow_depth              Nullable(Int8),
    snowfall                Nullable(Int8),
    max_temperature         Nullable(Int8),
    min_temperature         Nullable(Int8),
    average_wind_speed      Nullable(Int8),

    pickup_nyct2010_gid     Nullable(Int8),
    pickup_ctlabel          Nullable(String),
    pickup_borocode         Nullable(Int8),
    pickup_boroname         Nullable(String),
    pickup_ct2010           Nullable(String),
    pickup_boroct2010       Nullable(String),
    pickup_cdeligibil       Nullable(FixedString(1)),
    pickup_ntacode          Nullable(String),
    pickup_ntaname          Nullable(String),
    pickup_puma             Nullable(String),

    dropoff_nyct2010_gid    Nullable(UInt8),
    dropoff_ctlabel         Nullable(String),
    dropoff_borocode        Nullable(UInt8),
    dropoff_boroname        Nullable(String),
    dropoff_ct2010          Nullable(String),
    dropoff_boroct2010      Nullable(String),
    dropoff_cdeligibil      Nullable(String),
    dropoff_ntacode         Nullable(String),
    dropoff_ntaname         Nullable(String),
    dropoff_puma            Nullable(String)
) ENGINE = Log;

I then decompress and load each of the CSV files into the trips table. The following completed in 55 minutes and 10 seconds. After this operation, the data directory was 134 GB in size.

$ time (for FILENAME in /ch/csv/trips_x*.csv.gz; do
            echo "$FILENAME"
            gunzip -c "$FILENAME" | \
                clickhouse-client \
                    --host=0.0.0.0 \
                    --query="INSERT INTO trips FORMAT CSV"
        done)

The import rate was 155 MB of uncompressed CSV content per second. I suspect this was due to a bottleneck in GZIP decompression. It might have been faster to decompress all the gzipped files in parallel using xargs and then load the decompressed data. Below is what glances reported during the CSV import process.
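The parallel decompression idea mentioned above was not part of this benchmark, but a minimal sketch of it could look like the following. The `parallel_gunzip` helper, its default of four jobs, and the reliance on GNU `find` and `xargs` options are assumptions for illustration. Note that `gunzip` replaces each `.gz` file with its decompressed form, so the drive needs room for the uncompressed data.

```shell
# Decompress every .csv.gz file in a directory, several at a time.
# $1 = directory, $2 = number of parallel gunzip processes (default: 4).
parallel_gunzip() {
    local dir="$1" jobs="${2:-4}"
    # -print0 / -0 keep unusual file names intact; -r skips the run
    # entirely when no files match; -P sets the number of parallel jobs.
    find "$dir" -name '*.csv.gz' -print0 |
        xargs -0 -r -n 1 -P "$jobs" gunzip
}

# Example with the paths used in this post (one job per vCPU):
#   parallel_gunzip /ch/csv 36
```

The decompressed files could then be fed to clickhouse-client one by one, in the same way as the loop above but without the gunzip step.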

$ sudo glances

ip-172-30-2-200 (Ubuntu 16.04 64bit / Linux 4.4.0-1072-aws)                                                                                                 Uptime: 0:11:42
CPU       8.2%  nice:     0.0%                           LOAD    36-core                           MEM      9.8%  active:    5.20G                           SWAP      0.0%
user:     6.0%  irq:      0.0%                           1 min:    2.24                            total:  68.7G  inactive:  61.0G                           total:       0
system:   0.9%  iowait:   1.3%                           5 min:    1.83                            used:   6.71G  buffers:   66.4M                           used:        0
idle:    91.8%  steal:    0.0%                           15 min:   1.01                            free:   62.0G  cached:    61.6G                           free:        0

NETWORK     Rx/s   Tx/s   TASKS 370 (507 thr), 2 run, 368 slp, 0 oth sorted automatically by cpu_percent, flat view
ens5        136b    2Kb
lo         343Mb  343Mb     CPU%  MEM%  VIRT   RES   PID USER        NI S    TIME+ IOR/s IOW/s Command
                           100.4   1.5 1.65G 1.06G  9909 ubuntu       0 S  1:01.33     0     0 clickhouse-client --host=0.0.0.0 --query=INSERT INTO trips FORMAT CSV
DISK I/O     R/s    W/s     85.1   0.0 4.65M  708K  9908 ubuntu       0 R  0:50.60   32M     0 gzip -d -c /ch/csv/trips_xac.csv.gz
loop0          0      0     54.9   5.1 8.14G 3.49G  8091 clickhous    0 S  1:44.23     0   45M /usr/bin/clickhouse-server --config=/etc/clickhouse-server/config.xml
loop1          0      0      4.5   0.0     0     0   319 root         0 S  0:07.50    1K     0 kworker/u72:2
nvme0n1        0     3K      2.3   0.0 91.1M 28.9M  9912 root         0 R  0:01.56     0     0 /usr/bin/python3 /usr/bin/glances
nvme0n1p1      0     3K      0.3   0.0     0     0   960 root       -20 S  0:00.10     0     0 kworker/28:1H
nvme1n1    32.1M   495M      0.3   0.0     0     0  1058 root       -20 S  0:00.90     0     0 kworker/23:1H

I'll free up space on the NVMe drive by deleting the original CSV files before continuing.

$ sudo rm -fr /ch/csv

Converting to Columnar Form

ClickHouse's Log engine stores data in a row-oriented format. To query the data faster, I'll convert it to a columnar format using the MergeTree engine.

$ clickhouse-client --host=0.0.0.0

The following completed in 34 minutes and 50 seconds. After this operation, the data directory was 237 GB in size.

CREATE TABLE trips_mergetree
    ENGINE = MergeTree(pickup_date, pickup_datetime, 8192)
    AS SELECT
        trip_id,
        CAST(vendor_id AS Enum8('1' = 1,
                                '2' = 2,
                                'CMT' = 3,
                                'VTS' = 4,
                                'DDS' = 5,
                                'B02512' = 10,
                                'B02598' = 11,
                                'B02617' = 12,
                                'B02682' = 13,
                                'B02764' = 14)) AS vendor_id,
        toDate(pickup_datetime)                 AS pickup_date,
        ifNull(pickup_datetime, toDateTime(0))  AS pickup_datetime,
        toDate(dropoff_datetime)                AS dropoff_date,
        ifNull(dropoff_datetime, toDateTime(0)) AS dropoff_datetime,
        assumeNotNull(store_and_fwd_flag)       AS store_and_fwd_flag,
        assumeNotNull(rate_code_id)             AS rate_code_id,

        assumeNotNull(pickup_longitude)         AS pickup_longitude,
        assumeNotNull(pickup_latitude)          AS pickup_latitude,
        assumeNotNull(dropoff_longitude)        AS dropoff_longitude,
        assumeNotNull(dropoff_latitude)         AS dropoff_latitude,
        assumeNotNull(passenger_count)          AS passenger_count,
        assumeNotNull(trip_distance)            AS trip_distance,
        assumeNotNull(fare_amount)              AS fare_amount,
        assumeNotNull(extra)                    AS extra,
        assumeNotNull(mta_tax)                  AS mta_tax,
        assumeNotNull(tip_amount)               AS tip_amount,
        assumeNotNull(tolls_amount)             AS tolls_amount,
        assumeNotNull(ehail_fee)                AS ehail_fee,
        assumeNotNull(improvement_surcharge)    AS improvement_surcharge,
        assumeNotNull(total_amount)             AS total_amount,
        assumeNotNull(payment_type)             AS payment_type_,
        assumeNotNull(trip_type)                AS trip_type,

        pickup AS pickup,
        dropoff AS dropoff,

        CAST(assumeNotNull(cab_type)
            AS Enum8('yellow' = 1, 'green' = 2))
                                AS cab_type,

        precipitation           AS precipitation,
        snow_depth              AS snow_depth,
        snowfall                AS snowfall,
        max_temperature         AS max_temperature,
        min_temperature         AS min_temperature,
        average_wind_speed      AS average_wind_speed,

        pickup_nyct2010_gid     AS pickup_nyct2010_gid,
        pickup_ctlabel          AS pickup_ctlabel,
        pickup_borocode         AS pickup_borocode,
        pickup_boroname         AS pickup_boroname,
        pickup_ct2010           AS pickup_ct2010,
        pickup_boroct2010       AS pickup_boroct2010,
        pickup_cdeligibil       AS pickup_cdeligibil,
        pickup_ntacode          AS pickup_ntacode,
        pickup_ntaname          AS pickup_ntaname,
        pickup_puma             AS pickup_puma,

        dropoff_nyct2010_gid    AS dropoff_nyct2010_gid,
        dropoff_ctlabel         AS dropoff_ctlabel,
        dropoff_borocode        AS dropoff_borocode,
        dropoff_boroname        AS dropoff_boroname,
        dropoff_ct2010          AS dropoff_ct2010,
        dropoff_boroct2010      AS dropoff_boroct2010,
        dropoff_cdeligibil      AS dropoff_cdeligibil,
        dropoff_ntacode         AS dropoff_ntacode,
        dropoff_ntaname         AS dropoff_ntaname,
        dropoff_puma            AS dropoff_puma
    FROM trips;

This is what the glances output looked like during the operation:

ip-172-30-2-200 (Ubuntu 16.04 64bit / Linux 4.4.0-1072-aws)                                                                                                 Uptime: 1:06:09
CPU      10.3%  nice:     0.0%                           LOAD    36-core                           MEM     16.1%  active:    13.3G                           SWAP      0.0%
user:     7.9%  irq:      0.0%                           1 min:    1.87                            total:  68.7G  inactive:  52.8G                           total:       0
system:   1.6%  iowait:   0.8%                           5 min:    1.76                            used:   11.1G  buffers:   71.8M                           used:        0
idle:    89.7%  steal:    0.0%                           15 min:   1.95                            free:   57.6G  cached:    57.2G                           free:        0

NETWORK     Rx/s   Tx/s   TASKS 367 (523 thr), 1 run, 366 slp, 0 oth sorted automatically by cpu_percent, flat view
ens5         1Kb    8Kb
lo           2Kb    2Kb     CPU%  MEM%  VIRT   RES   PID USER        NI S    TIME+ IOR/s IOW/s Command
                           241.9  12.8 20.7G 8.78G  8091 clickhous    0 S 30:36.73   34M  125M /usr/bin/clickhouse-server --config=/etc/clickhouse-server/config.xml
DISK I/O     R/s    W/s      2.6   0.0 90.4M 28.3M  9948 root         0 R  1:18.53     0     0 /usr/bin/python3 /usr/bin/glances
loop0          0      0      1.3   0.0     0     0   203 root         0 S  0:09.82     0     0 kswapd0
loop1          0      0      0.3   0.1  315M 61.3M 15701 ubuntu       0 S  0:00.40     0     0 clickhouse-client --host=0.0.0.0
nvme0n1        0     3K      0.3   0.0     0     0     7 root         0 S  0:00.83     0     0 rcu_sched
nvme0n1p1      0     3K      0.0   0.0     0     0   142 root         0 S  0:00.22     0     0 migration/27
nvme1n1    25.8M   330M      0.0   0.0 59.7M 1.79M  2764 ubuntu       0 S  0:00.00     0     0 (sd-pam)

In the last benchmark, several columns were converted and recalculated. I found that some of those functions no longer work as expected on this dataset. To get around the problem, I removed the offending functions and loaded the data without converting to more granular types.

Distributing Data Across the Cluster

I'll distribute the data across all three cluster nodes. To start, I'll create the table below on all three machines.

$ clickhouse-client --host=0.0.0.0

CREATE TABLE trips_mergetree_third (
    trip_id                 UInt32,
    vendor_id               String,
    pickup_date             Date,
    pickup_datetime         DateTime,
    dropoff_date            Date,
    dropoff_datetime        Nullable(DateTime),
    store_and_fwd_flag      Nullable(FixedString(1)),
    rate_code_id            Nullable(UInt8),
    pickup_longitude        Nullable(Float64),
    pickup_latitude         Nullable(Float64),
    dropoff_longitude       Nullable(Float64),
    dropoff_latitude        Nullable(Float64),
    passenger_count         Nullable(UInt8),
    trip_distance           Nullable(Float64),
    fare_amount             Nullable(Float32),
    extra                   Nullable(Float32),
    mta_tax                 Nullable(Float32),
    tip_amount              Nullable(Float32),
    tolls_amount            Nullable(Float32),
    ehail_fee               Nullable(Float32),
    improvement_surcharge   Nullable(Float32),
    total_amount            Nullable(Float32),
    payment_type            Nullable(String),
    trip_type               Nullable(UInt8),
    pickup                  Nullable(String),
    dropoff                 Nullable(String),

    cab_type                Nullable(String),

    precipitation           Nullable(Int8),
    snow_depth              Nullable(Int8),
    snowfall                Nullable(Int8),
    max_temperature         Nullable(Int8),
    min_temperature         Nullable(Int8),
    average_wind_speed      Nullable(Int8),

    pickup_nyct2010_gid     Nullable(Int8),
    pickup_ctlabel          Nullable(String),
    pickup_borocode         Nullable(Int8),
    pickup_boroname         Nullable(String),
    pickup_ct2010           Nullable(String),
    pickup_boroct2010       Nullable(String),
    pickup_cdeligibil       Nullable(FixedString(1)),
    pickup_ntacode          Nullable(String),
    pickup_ntaname          Nullable(String),
    pickup_puma             Nullable(String),

    dropoff_nyct2010_gid    Nullable(UInt8),
    dropoff_ctlabel         Nullable(String),
    dropoff_borocode        Nullable(UInt8),
    dropoff_boroname        Nullable(String),
    dropoff_ct2010          Nullable(String),
    dropoff_boroct2010      Nullable(String),
    dropoff_cdeligibil      Nullable(String),
    dropoff_ntacode         Nullable(String),
    dropoff_ntaname         Nullable(String),
    dropoff_puma            Nullable(String)
) ENGINE = MergeTree(pickup_date, pickup_datetime, 8192);
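Since the statement above has to be run on every machine, one way to avoid repeating it by hand is a small loop over the hosts. This is only a sketch: the file name `trips_mergetree_third.sql` and the `DRY_RUN` switch are hypothetical, and by default the script just prints the commands it would run. The host addresses are the ones from the cluster configuration shown earlier.

```shell
# Private IPs of the three nodes, as listed in the cluster configuration.
HOSTS="172.30.2.192 172.30.2.162 172.30.2.36"

# Hypothetical file holding the CREATE TABLE statement shown above.
DDL_FILE="trips_mergetree_third.sql"

# run_ddl HOST FILE: print the command (default) or, with DRY_RUN=0,
# send the file's contents to that host's ClickHouse server.
run_ddl() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "clickhouse-client --host=$1 < $2"
    else
        clickhouse-client --host="$1" < "$2"
    fi
}

create_on_all() {
    for host in $HOSTS; do
        run_ddl "$host" "$DDL_FILE"
    done
}

create_on_all
```

Setting `DRY_RUN=0` before calling `create_on_all` would execute the statement on each node instead of printing it.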

I'll then make sure the first server can see all three nodes in the cluster.

SELECT *
FROM system.clusters
WHERE cluster = 'perftest_3shards'
FORMAT Vertical;

Row 1:
──────
cluster:          perftest_3shards
shard_num:        1
shard_weight:     1
replica_num:      1
host_name:        172.30.2.192
host_address:     172.30.2.192
port:             9000
is_local:         1
user:             default
default_database:

Row 2:
──────
cluster:          perftest_3shards
shard_num:        2
shard_weight:     1
replica_num:      1
host_name:        172.30.2.162
host_address:     172.30.2.162
port:             9000
is_local:         0
user:             default
default_database:

Row 3:
──────
cluster:          perftest_3shards
shard_num:        3
shard_weight:     1
replica_num:      1
host_name:        172.30.2.36
host_address:     172.30.2.36
port:             9000
is_local:         0
user:             default
default_database:

I'll then define a new table on the first server that is based on the trips_mergetree_third schema and uses the Distributed engine.

CREATE TABLE trips_mergetree_x3
    AS trips_mergetree_third
    ENGINE = Distributed(perftest_3shards,
                         default,
                         trips_mergetree_third,
                         rand());
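The last argument, rand(), is the sharding key. With three shards of equal weight, each row goes to the shard picked by the key's remainder after division by the total weight, so a random key should spread rows roughly evenly. The simulation below is an illustration of that idea only, not ClickHouse code; the `simulate_sharding` helper, the row count and the seed are made up.

```shell
# Simulate assigning rows to shards with a random sharding key:
# draw a pseudo-random integer per row and take it modulo the shard count.
# $1 = number of rows, $2 = number of shards, $3 = seed (all illustrative).
simulate_sharding() {
    awk -v rows="$1" -v shards="$2" -v seed="$3" 'BEGIN {
        srand(seed)
        for (i = 0; i < rows; i++) {
            key = int(rand() * 4294967296)   # stand-in for a 32-bit rand()
            count[key % shards + 1]++
        }
        for (s = 1; s <= shards; s++)
            print "shard " s ": " count[s] " rows"
    }'
}

simulate_sharding 300000 3 42
```

Each shard should end up with close to a third of the rows, with small differences from run to run depending on the seed and the awk implementation.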

I'll then copy the data from the MergeTree-based table to all three servers. The following completed in 34 minutes and 44 seconds.

INSERT INTO trips_mergetree_x3
    SELECT * FROM trips_mergetree;

After the above operation, I gave ClickHouse 15 minutes to recede from its storage high-water mark. The data directories ended up at 264 GB, 34 GB and 33 GB respectively on the three servers.
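As a rough cross-check of the three load steps timed above, their durations can be converted into average rows per second. The sketch assumes the dataset holds exactly 1.1 billion rows, which is only the rounded figure used throughout this post, so the results are approximate; `rows_per_second` is a helper made up for this calculation.

```shell
ROWS=1100000000   # approximate row count ("1.1 billion"); an assumption

# rows_per_second MINUTES SECONDS -> average rows per second, rounded
rows_per_second() {
    awk -v rows="$ROWS" -v m="$1" -v s="$2" \
        'BEGIN { printf "%.0f\n", rows / (m * 60 + s) }'
}

echo "CSV import into the Log table: $(rows_per_second 55 10) rows/s"
echo "Log to MergeTree conversion:   $(rows_per_second 34 50) rows/s"
echo "MergeTree to Distributed copy: $(rows_per_second 34 44) rows/s"
```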

ClickHouse Cluster Benchmark

The following were the fastest times I saw after running each query multiple times on the trips_mergetree_x3 table.

$ clickhouse-client --host=0.0.0.0

The following completed in 2.449 seconds.

SELECT cab_type, count(*)
FROM trips_mergetree_x3
GROUP BY cab_type;

The following completed in 0.691 seconds.

SELECT passenger_count,
       avg(total_amount)
FROM trips_mergetree_x3
GROUP BY passenger_count;

The following completed in 0 seconds.

SELECT passenger_count,
       toYear(pickup_date) AS year,
       count(*)
FROM trips_mergetree_x3
GROUP BY passenger_count,
         year;

The following completed in 0.983 seconds.

SELECT passenger_count,
       toYear(pickup_date) AS year,
       round(trip_distance) AS distance,
       count(*)
FROM trips_mergetree_x3
GROUP BY passenger_count,
         year,
         distance
ORDER BY year,
         count(*) DESC;

For comparison, I ran the same queries on the MergeTree-based table that resides solely on the first server.

Single-Node ClickHouse Benchmark

The following were the fastest times I saw after running each query multiple times on the trips_mergetree table.

The following completed in 0.241 seconds.

SELECT cab_type, count(*)
FROM trips_mergetree
GROUP BY cab_type;

The following completed in 0.826 seconds.

SELECT passenger_count,
       avg(total_amount)
FROM trips_mergetree
GROUP BY passenger_count;

The following completed in 1.209 seconds.

SELECT passenger_count,
       toYear(pickup_date) AS year,
       count(*)
FROM trips_mergetree
GROUP BY passenger_count,
         year;

The following completed in 1.781 seconds.

SELECT passenger_count,
       toYear(pickup_date) AS year,
       round(trip_distance) AS distance,
       count(*)
FROM trips_mergetree
GROUP BY passenger_count,
         year,
         distance
ORDER BY year,
         count(*) DESC;

Reflections on the Results

This is the first time a free CPU-based database has been able to outperform a GPU-based database in my tests. That GPU-based database has gone through two revisions since then, but the performance ClickHouse delivered on a single node is nonetheless very impressive.

At the same time, when executing Query 1 on the Distributed engine, the overhead is an order of magnitude higher. I hope I missed something in my research for this post, because it would be nice to see query times go down as I add more nodes to the cluster. Still, it's great that performance on the other queries increased by about 2x.
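The ratios behind these remarks can be worked out directly from the timings reported in the two benchmark sections. The sketch below uses a small `ratio` helper made up for this purpose. Query 3 is left out because its cluster timing is not given to the same precision as the others.

```shell
# ratio A B -> A divided by B, to two decimal places
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Query 1: cluster time divided by single-node time (slowdown).
echo "Query 1: cluster is $(ratio 2.449 0.241)x slower than one node"

# Queries 2 and 4: single-node time divided by cluster time (speedup).
echo "Query 2: cluster is $(ratio 0.826 0.691)x faster than one node"
echo "Query 4: cluster is $(ratio 1.781 0.983)x faster than one node"
```

Query 1 comes out roughly ten times slower on the cluster, which matches the order-of-magnitude overhead described above, while Queries 2 and 4 run about 1.2 and 1.8 times faster.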

It would be nice to see ClickHouse evolve so that storage and compute can be decoupled and scaled independently. HDFS support, which was added last year, could be a step towards this. As for compute, if a single query can be accelerated by adding more nodes to the cluster, then the future of this software is very bright.

Thank you for taking the time to read this post. I offer consulting, architecture and hands-on development services to clients in North America and Europe. If you'd like to discuss how my offerings can help your business, please contact me via LinkedIn.

Source: www.habr.com
