αž€αžΆαžšαž’αŸ’αžœαžΎαžŠαŸ†αžŽαžΎαžšαžαžΆαž˜αžαžΆαž€αŸ‹αžŸαŸŠαžΈαž…αŸ†αž“αž½αž“ 1.1 αž–αžΆαž“αŸ‹αž›αžΆαž“αžŠαž„αŸ– αž…αž„αŸ’αž€αŸ„αž˜ ClickHouse 108-core

αž€αžΆαžšαž”αž€αž”αŸ’αžšαŸ‚αž’αžαŸ’αžαž”αž‘αžαŸ’αžšαžΌαžœαž”αžΆαž“αžšαŸ€αž”αž…αŸ†αž‡αžΆαž–αž·αžŸαŸαžŸαžŸαž˜αŸ’αžšαžΆαž”αŸ‹αžŸαž·αžŸαŸ’αžŸαž“αŸƒαžœαž‚αŸ’αž‚αžŸαž·αž€αŸ’αžŸαžΆ "αžœαž·αžŸαŸ’αžœαž€αžšαž‘αž·αž“αŸ’αž“αž“αŸαž™".

αž€αžΆαžšαž’αŸ’αžœαžΎαžŠαŸ†αžŽαžΎαžšαžαžΆαž˜αžαžΆαž€αŸ‹αžŸαŸŠαžΈαž…αŸ†αž“αž½αž“ 1.1 αž–αžΆαž“αŸ‹αž›αžΆαž“αžŠαž„αŸ– αž…αž„αŸ’αž€αŸ„αž˜ ClickHouse 108-core

αž…αž»αž…αž•αŸ’αž‘αŸ‡ αž‚αžΊαž‡αžΆαž˜αžΌαž›αžŠαŸ’αž‹αžΆαž“αž‘αž·αž“αŸ’αž“αž“αŸαž™ columnar αž”αŸ’αžšαž—αž–αž”αžΎαž€αž…αŸ†αž αŸ” αžœαžΆαž‡αžΆαž”αžšαž·αž™αžΆαž€αžΆαžŸαžŠαŸαž›αŸ’αž’αž˜αž½αž™ αžŠαŸ‚αž›αž’αŸ’αž“αž€αžœαž·αž—αžΆαž‚αžšαžΆαž”αŸ‹αžšαž™αž“αžΆαž€αŸ‹αž’αžΆαž…αžŸαžΆαž€αžŸαž½αžšαž‘αž·αž“αŸ’αž“αž“αŸαž™αž›αž˜αŸ’αž’αž·αžαž”αžΆαž“αž™αŸ‰αžΆαž„αž†αžΆαž”αŸ‹αžšαž αŸαžŸ αž‘αŸ„αŸ‡αž”αžΈαž‡αžΆαž€αŸ†αžŽαžαŸ‹αžαŸ’αžšαžΆαžαŸ’αž˜αžΈαžšαžΆαž”αŸ‹αžŸαž·αž”αž–αžΆαž“αŸ‹αž›αžΆαž“αžαŸ’αžšαžΌαžœαž”αžΆαž“αž”αž‰αŸ’αž…αžΌαž›αž€αŸ’αž“αž»αž„αž˜αž½αž™αžαŸ’αž„αŸƒαž€αŸαžŠαŸ„αž™αŸ” αž€αžΆαžšαž…αŸ†αžŽαžΆαž™αž›αžΎαž αŸαžŠαŸ’αž‹αžΆαžšαž…αž“αžΆαžŸαž˜αŸ’αž–αŸαž“αŸ’αž’αžŠαžΎαž˜αŸ’αž”αžΈαž‚αžΆαŸ†αž‘αŸ’αžšαž”αŸ’αžšαž–αŸαž“αŸ’αž’αž”αŸ‚αž”αž“αŸαŸ‡αž’αžΆαž…αžαŸ’αž–αžŸαŸ‹αžŠαž›αŸ‹ 100 αžŠαž»αž›αŸ’αž›αžΆαžšαž€αŸ’αž“αž»αž„αž˜αž½αž™αž†αŸ’αž“αžΆαŸ† αž αžΎαž™αž’αžΆαž…αž˜αžΆαž“αž–αžΆαž€αŸ‹αž€αžŽαŸ’αžαžΆαž›αž’αžΆαžŸαŸ’αžšαŸαž™αž›αžΎαž€αžΆαžšαž”αŸ’αžšαžΎαž”αŸ’αžšαžΆαžŸαŸ‹αŸ” αž“αŸ…αž…αŸ†αžŽαž»αž…αž˜αž½αž™ αž€αžΆαžšαžŠαŸ†αž‘αžΎαž„ ClickHouse αž–αžΈ Yandex Metrics αž˜αžΆαž“αž€αŸ†αžŽαžαŸ‹αžαŸ’αžšαžΆ 10 αž›αžΆαž“αž›αžΆαž“αŸ” αž”αž“αŸ’αžαŸ‚αž˜αž–αžΈαž›αžΎ Yandex ClickHouse αž€αŸαž”αžΆαž“αžšαž€αžƒαžΎαž‰αž—αžΆαž–αž‡αŸ„αž‚αž‡αŸαž™αž‡αžΆαž˜αž½αž™ Bloomberg αž“αž·αž„ Cloudflare αž•αž„αžŠαŸ‚αžšαŸ”

αž€αžΆαž›αž–αžΈαž–αžΈαžšαž†αŸ’αž“αžΆαŸ†αž˜αž»αž“αžαŸ’αž‰αž»αŸ†αž”αžΆαž“αž…αŸ†αžŽαžΆαž™ αž€αžΆαžšαžœαž·αž—αžΆαž‚αž”αŸ’αžšαŸ€αž”αž’αŸ€αž” αž˜αžΌαž›αžŠαŸ’αž‹αžΆαž“αž‘αž·αž“αŸ’αž“αž“αŸαž™αžŠαŸ„αž™αž”αŸ’αžšαžΎαž˜αŸ‰αžΆαžŸαŸŠαžΈαž“αžαŸ‚αž˜αž½αž™ αž αžΎαž™αžœαžΆαž”αžΆαž“αž€αŸ’αž›αžΆαž™αž‡αžΆ αž›αžΏαž“αž”αŸ†αž•αž»αžαŸ” αž€αž˜αŸ’αž˜αžœαž·αž’αžΈαž˜αžΌαž›αžŠαŸ’αž‹αžΆαž“αž‘αž·αž“αŸ’αž“αž“αŸαž™αž₯αžαž‚αž·αžαžαŸ’αž›αŸƒ αžŠαŸ‚αž›αžαŸ’αž‰αž»αŸ†αž’αŸ’αž›αžΆαž”αŸ‹αžƒαžΎαž‰αŸ” αž…αžΆαž”αŸ‹αžαžΆαŸ†αž„αž–αžΈαž–αŸαž›αž“αŸ„αŸ‡αž˜αž€ αž’αŸ’αž“αž€αž’αž—αž·αžœαžŒαŸ’αžαž“αŸαž”αžΆαž“αž”αž“αŸ’αžαž”αž“αŸ’αžαŸ‚αž˜αž›αž€αŸ’αžαžŽαŸˆαž–αž·αžŸαŸαžŸ αžšαž½αž˜αž‘αžΆαŸ†αž„αž€αžΆαžšαž‚αžΆαŸ†αž‘αŸ’αžšαžŸαž˜αŸ’αžšαžΆαž”αŸ‹αž€αžΆαžšαž”αž„αŸ’αž αžΆαž”αŸ‹ Kafka, HDFS αž“αž·αž„ ZStandard αŸ” αž€αžΆαž›αž–αžΈαž†αŸ’αž“αžΆαŸ†αž˜αž»αž“ αž–αž½αž€αž‚αŸαž”αžΆαž“αž”αž“αŸ’αžαŸ‚αž˜αž€αžΆαžšαž‚αžΆαŸ†αž‘αŸ’αžšαžŸαž˜αŸ’αžšαžΆαž”αŸ‹αžœαž·αž’αžΈαžŸαžΆαžŸαŸ’αžαŸ’αžšαž”αž„αŸ’αž αžΆαž”αŸ‹αž›αŸ’αž”αžΆαž€αŸ‹ αž“αž·αž„ αžŠαžΈαžŸαžŽαŸ’αž-αž–αžΈ-αžŠαžΈαžŸαžŽαŸ’αž αž€αžΆαžšαžŸαžšαžŸαŸαžšαž€αžΌαžŠαž”αžΆαž“αž€αŸ’αž›αžΆαž™αž‡αžΆαž’αžΆαž…αž’αŸ’αžœαžΎαž‘αŸ…αž”αžΆαž“αŸ” αž“αŸ…αž–αŸαž›αž”αž„αŸ’αž αžΆαž”αŸ‹αž‘αž·αž“αŸ’αž“αž“αŸαž™αžŸαŸŠαŸαžšαžΈαž–αŸαž›αžœαŸαž›αžΆ αžαž˜αŸ’αž›αŸƒαžšαž„αŸ’αžœαžΆαžŸαŸ‹αž’αžΆαž…αžαŸ’αžšαžΌαžœαž”αžΆαž“αž”αž„αŸ’αž αžΆαž”αŸ‹αž”αžΆαž“αž™αŸ‰αžΆαž„αž›αŸ’αž’αžŠαŸ„αž™αž”αŸ’αžšαžΎαž€αžΆαžšαž’αŸŠαž·αž“αž€αžΌαžŠ delta αž”αŸ‰αž»αž“αŸ’αžαŸ‚αžŸαž˜αŸ’αžšαžΆαž”αŸ‹ counters αžœαžΆαž‡αžΆαž€αžΆαžšαž”αŸ’αžšαžŸαžΎαžšαž€αŸ’αž“αž»αž„αž€αžΆαžšαž”αŸ’αžšαžΎαž€αžΆαžšαž’αŸŠαž·αž“αž€αžΌαžŠ delta-by-delta αŸ” αž€αžΆαžšαž”αž„αŸ’αž αžΆαž”αŸ‹αž›αŸ’αž’αž”αžΆαž“αž€αŸ’αž›αžΆαž™αž‡αžΆαž‚αž“αŸ’αž›αžΉαŸ‡αž“αŸƒαžŠαŸ†αžŽαžΎαžšαž€αžΆαžšαžšαž”αžŸαŸ‹ ClickHouse αŸ”

ClickHouse αž˜αžΆαž“ 170 αž–αžΆαž“αŸ‹αž”αž“αŸ’αž‘αžΆαžαŸ‹αž“αŸƒαž€αžΌαžŠ C++ αžŠαŸ„αž™αž˜αž·αž“αžšαžΆαž”αŸ‹αž”αž‰αŸ’αž…αžΌαž›αž”αžŽαŸ’αžŽαžΆαž›αŸαž™αž—αžΆαž‚αžΈαž‘αžΈαž”αžΈ αž“αž·αž„αž‡αžΆαž˜αžΌαž›αžŠαŸ’αž‹αžΆαž“αž‘αž·αž“αŸ’αž“αž“αŸαž™αž…αŸ‚αž€αž…αžΆαž™αžαžΌαž…αž”αŸ†αž•αž»αžαž˜αž½αž™αŸ” αž“αŸ…αž€αŸ’αž“αž»αž„αž€αžΆαžšαž”αŸ’αžšαŸ€αž”αž’αŸ€αž” SQLite αž˜αž·αž“αž‚αžΆαŸ†αž‘αŸ’αžšαž€αžΆαžšαž…αŸ‚αž€αž…αžΆαž™αž‘αŸ αž αžΎαž™αž˜αžΆαž“αž€αžΌαžŠ C αž…αŸ†αž“αž½αž“ 235 αž–αžΆαž“αŸ‹αž”αž“αŸ’αž‘αžΆαžαŸ‹αŸ” αžαžΆαž˜αž€αžΆαžšαžŸαžšαžŸαŸαžšαž“αŸαŸ‡ αžœαž·αžŸαŸ’αžœαž€αžšαž…αŸ†αž“αž½αž“ 207 αž“αžΆαž€αŸ‹αž”αžΆαž“αž…αžΌαž›αžšαž½αž˜αž…αŸ†αžŽαŸ‚αž€αžŠαž›αŸ‹ ClickHouse αž αžΎαž™αž’αžΆαŸ†αž„αžαž„αŸ‹αžŸαŸŠαžΈαžαŸαž“αŸƒαž€αžΆαžšαž”αŸ’αžšαž–αŸ’αžšαžΉαžαŸ’αžαž€αŸ†αž–αž»αž„αž€αžΎαž“αž‘αžΎαž„αž“αžΆαž–αŸαž›αžαŸ’αž˜αžΈαŸ—αž“αŸαŸ‡αŸ”

αž“αŸ…αžαŸ‚αž˜αžΈαž“αžΆαž†αŸ’αž“αžΆαŸ† 2017 ClickHouse αž”αžΆαž“αž…αžΆαž”αŸ‹αž•αŸ’αžαžΎαž˜αžŠαŸ†αžŽαžΎαžšαž€αžΆαžš αž€αŸ†αžŽαžαŸ‹αž αŸαžαž»αž•αŸ’αž›αžΆαžŸαŸ‹αž”αŸ’αžαžΌαžš αž‡αžΆαž˜αž’αŸ’αž™αŸ„αž”αžΆαž™αž„αžΆαž™αžŸαŸ’αžšαž½αž›αž€αŸ’αž“αž»αž„αž€αžΆαžšαžαžΆαž˜αžŠαžΆαž“αž€αžΆαžšαž’αž—αž·αžœαžŒαŸ’αžαž“αŸαŸ” αž–αž½αž€αž‚αŸαž€αŸαž”αžΆαž“αž”αŸ†αž”αŸ‚αž€αž―αž€αžŸαžΆαžšαž―αž€αžŸαžΆαžš monolithic αž‘αŸ…αž‡αžΆαž‹αžΆαž“αžΆαž“αž»αž€αŸ’αžšαž˜αž―αž€αžŸαžΆαžšαžŠαŸ‚αž›αž˜αžΆαž“αž˜αžΌαž›αžŠαŸ’αž‹αžΆαž“αž›αžΎ Markdown αŸ” αž”αž‰αŸ’αž αžΆ αž“αž·αž„αž˜αž»αžαž„αžΆαžšαž“αžΆαž“αžΆαžαŸ’αžšαžΌαžœαž”αžΆαž“αžαžΆαž˜αžŠαžΆαž“αžαžΆαž˜αžšαž™αŸˆ GitHub αž αžΎαž™αž‡αžΆαž‘αžΌαž‘αŸ…αž€αž˜αŸ’αž˜αžœαž·αž’αžΈαž”αžΆαž“αž€αŸ’αž›αžΆαž™αž‘αŸ…αž‡αžΆαž’αžΆαž…αž…αžΌαž›αž”αŸ’αžšαžΎαž”αžΆαž“αž€αžΆαž“αŸ‹αžαŸ‚αž…αŸ’αžšαžΎαž“αž€αŸ’αž“αž»αž„αžšαž™αŸˆαž–αŸαž›αž”αŸ‰αž»αž“αŸ’αž˜αžΆαž“αž†αŸ’αž“αžΆαŸ†αž…αž»αž„αž€αŸ’αžšαŸ„αž™αž“αŸαŸ‡αŸ”

αž“αŸ…αž€αŸ’αž“αž»αž„αž’αžαŸ’αžαž”αž‘αž“αŸαŸ‡ αžαŸ’αž‰αž»αŸ†αž“αžΉαž„αž–αž·αž“αž·αžαŸ’αž™αž˜αžΎαž›αžŠαŸ†αžŽαžΎαžšαž€αžΆαžšαžšαž”αžŸαŸ‹ ClickHouse cluster αž“αŸ…αž›αžΎ AWS EC2 αžŠαŸ„αž™αž”αŸ’αžšαžΎ processors 36-core αž“αž·αž„ NVMe storageαŸ”

αž’αžΆαž”αŸ‹αžŠαŸαžαŸ– αž˜αž½αž™αžŸαž”αŸ’αžαžΆαž αŸαž”αž“αŸ’αž‘αžΆαž”αŸ‹αž–αžΈαž€αžΆαžšαž”αŸ„αŸ‡αž•αŸ’αžŸαžΆαž™αž’αžαŸ’αžαž”αž‘αž“αŸαŸ‡αžŠαŸ†αž”αžΌαž„ αžαŸ’αž‰αž»αŸ†αž”αžΆαž“αž’αŸ’αžœαžΎαžαŸαžŸαŸ’αžαž˜αŸ’αžαž„αž‘αŸ€αžαž‡αžΆαž˜αž½αž™αž“αžΉαž„αž€αžΆαžšαž€αŸ†αžŽαžαŸ‹αžšαž…αž“αžΆαžŸαž˜αŸ’αž–αŸαž“αŸ’αž’αžŠαŸ‚αž›αž”αŸ’αžšαžŸαžΎαžšαž‘αžΎαž„ αž“αž·αž„αž‘αž‘αž½αž›αž”αžΆαž“αž›αž‘αŸ’αž’αž•αž›αž”αŸ’αžšαžŸαžΎαžšαž‡αžΆαž„αž˜αž»αž“αŸ” αž€αžΆαžšαž”αž„αŸ’αž αŸ„αŸ‡αž“αŸαŸ‡αžαŸ’αžšαžΌαžœαž”αžΆαž“αž’αŸ’αžœαžΎαž”αž…αŸ’αž…αž»αž”αŸ’αž”αž“αŸ’αž“αž—αžΆαž–αžŠαžΎαž˜αŸ’αž”αžΈαž†αŸ’αž›αž»αŸ‡αž”αž‰αŸ’αž…αžΆαŸ†αž„αž–αžΈαž€αžΆαžšαž•αŸ’αž›αžΆαžŸαŸ‹αž”αŸ’αžαžΌαžšαž‘αžΆαŸ†αž„αž“αŸαŸ‡αŸ”

αž”αžΎαž€αžŠαŸ†αžŽαžΎαžšαž€αžΆαžš AWS EC2 Cluster

αžαŸ’αž‰αž»αŸ†αž“αžΉαž„αž”αŸ’αžšαžΎαž€αžšαžŽαžΈ c5d.9xlarge EC2 αž…αŸ†αž“αž½αž“αž”αžΈαžŸαž˜αŸ’αžšαžΆαž”αŸ‹αž€αžΆαžšαž”αŸ’αžšαž€αžΆαžŸαž“αŸαŸ‡αŸ” αž–αž½αž€αžœαžΆαž“αžΈαž˜αž½αž™αŸ—αž˜αžΆαž“αžŸαŸŠαžΈαž—αžΈαž™αžΌαž“αž·αž˜αŸ’αž˜αž·αž 36 αž’αž„αŸ’αž‚αž…αž„αž…αžΆαŸ† 72 αž‡αžΈαž€αžΆαž”αŸƒ αž’αž„αŸ’αž‚αž•αŸ’αž‘αž»αž€αž‘αž·αž“αŸ’αž“αž“αŸαž™ NVMe SSD 900 αž‡αžΈαž€αžΆαž”αŸƒ αž“αž·αž„αž‚αžΆαŸ†αž‘αŸ’αžšαž”αžŽαŸ’αžαžΆαž‰ 10 αž‡αžΈαž αŸ’αž‚αžΆαž”αŸƒαŸ” αž–αž½αž€αž‚αŸαž˜αžΆαž“αžαž˜αŸ’αž›αŸƒ $1,962/αž˜αŸ‰αŸ„αž„ αž€αŸ’αž“αž»αž„αžαŸ†αž”αž“αŸ‹ eu-west-1 αž“αŸ…αž–αŸαž›αžŠαŸ†αžŽαžΎαžšαž€αžΆαžšαžαžΆαž˜αžαž˜αŸ’αžšαžΌαžœαž€αžΆαžšαŸ” αžαŸ’αž‰αž»αŸ†αž“αžΉαž„αž”αŸ’αžšαžΎ Ubuntu Server 16.04 LTS αž‡αžΆαž”αŸ’αžšαž–αŸαž“αŸ’αž’αž”αŸ’αžšαžαž·αž”αžαŸ’αžαž·αž€αžΆαžšαŸ”

αž‡αž‰αŸ’αž‡αžΆαŸ†αž„αž—αŸ’αž›αžΎαž„αžαŸ’αžšαžΌαžœαž”αžΆαž“αž€αŸ†αžŽαžαŸ‹αžšαž…αž“αžΆαžŸαž˜αŸ’αž–αŸαž“αŸ’αž’ αžŠαžΌαž…αŸ’αž“αŸαŸ‡αž˜αŸ‰αžΆαžŸαŸŠαžΈαž“αž“αžΈαž˜αž½αž™αŸ—αž’αžΆαž…αž‘αŸ†αž“αžΆαž€αŸ‹αž‘αŸ†αž“αž„αž‚αŸ’αž“αžΆαž‘αŸ…αžœαž·αž‰αž‘αŸ…αž˜αž€αžŠαŸ„αž™αž‚αŸ’αž˜αžΆαž“αž€αžΆαžšαžšαžΉαžαž”αž“αŸ’αžαžΉαž„ αž αžΎαž™αž˜αžΆαž“αžαŸ‚αž’αžΆαžŸαž™αžŠαŸ’αž‹αžΆαž“ IPv4 αžšαž”αžŸαŸ‹αžαŸ’αž‰αž»αŸ†αž”αŸ‰αž»αžŽαŸ’αžŽαŸ„αŸ‡αžŠαŸ‚αž›αžαŸ’αžšαžΌαžœαž”αžΆαž“αž”αž‰αŸ’αž…αžΌαž›αž€αŸ’αž“αž»αž„αž”αž‰αŸ’αž‡αžΈαžŸαžŠαŸ„αž™ SSH αž“αŸ…αž€αŸ’αž“αž»αž„αž…αž„αŸ’αž€αŸ„αž˜αŸ”

αžŠαŸ’αžšαžΆαž™ NVMe αž“αŸ…αž€αŸ’αž“αž»αž„αžŸαŸ’αžαžΆαž“αž—αžΆαž–αžαŸ’αžšαŸ€αž˜αžαŸ’αž›αž½αž“αž”αŸ’αžšαžαž·αž”αžαŸ’αžαž·αž€αžΆαžš

αžŠαžΎαž˜αŸ’αž”αžΈαž±αŸ’αž™ ClickHouse αžŠαŸ†αžŽαžΎαžšαž€αžΆαžš αžαŸ’αž‰αž»αŸ†αž“αžΉαž„αž”αž„αŸ’αž€αžΎαžαž”αŸ’αžšαž–αŸαž“αŸ’αž’αž―αž€αžŸαžΆαžšαž€αŸ’αž“αž»αž„αž‘αž˜αŸ’αžšαž„αŸ‹ EXT4 αž“αŸ…αž›αžΎαžŠαŸ’αžšαžΆαž™ NVMe αž“αŸ…αž›αžΎαž˜αŸ‰αžΆαžŸαŸŠαžΈαž“αž˜αŸαž“αžΈαž˜αž½αž™αŸ—αŸ”

$ sudo mkfs -t ext4 /dev/nvme1n1
$ sudo mkdir /ch
$ sudo mount /dev/nvme1n1 /ch

αž“αŸ…αž–αŸαž›αžŠαŸ‚αž›αž’αŸ’αžœαžΈαŸ—αžαŸ’αžšαžΌαžœαž”αžΆαž“αžαŸ†αž‘αžΎαž„ αž’αŸ’αž“αž€αž’αžΆαž…αž˜αžΎαž›αžƒαžΎαž‰αž…αŸ†αžŽαž»αž…αž˜αŸ‰αŸ„αž“ αž“αž·αž„αž‘αŸ†αž αŸ†αž•αŸ’αž‘αž»αž€ 783 GB αžŠαŸ‚αž›αž˜αžΆαž“αž“αŸ…αž›αžΎαž”αŸ’αžšαž–αŸαž“αŸ’αž’αž“αžΈαž˜αž½αž™αŸ—αŸ”

$ lsblk

NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
loop0         7:0    0  87.9M  1 loop /snap/core/5742
loop1         7:1    0  16.5M  1 loop /snap/amazon-ssm-agent/784
nvme0n1     259:1    0     8G  0 disk
└─nvme0n1p1 259:2    0     8G  0 part /
nvme1n1     259:0    0 838.2G  0 disk /ch

$ df -h

Filesystem      Size  Used Avail Use% Mounted on
udev             35G     0   35G   0% /dev
tmpfs           6.9G  8.8M  6.9G   1% /run
/dev/nvme0n1p1  7.7G  967M  6.8G  13% /
tmpfs            35G     0   35G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            35G     0   35G   0% /sys/fs/cgroup
/dev/loop0       88M   88M     0 100% /snap/core/5742
/dev/loop1       17M   17M     0 100% /snap/amazon-ssm-agent/784
tmpfs           6.9G     0  6.9G   0% /run/user/1000
/dev/nvme1n1    825G   73M  783G   1% /ch

αžŸαŸ†αžŽαž»αŸ†αž‘αž·αž“αŸ’αž“αž“αŸαž™αžŠαŸ‚αž›αžαŸ’αž‰αž»αŸ†αž“αžΉαž„αž”αŸ’αžšαžΎαž€αŸ’αž“αž»αž„αž€αžΆαžšαž’αŸ’αžœαžΎαžαŸαžŸαŸ’αžαž“αŸαŸ‡αž‚αžΊαž‡αžΆαž‘αž·αž“αŸ’αž“αž“αŸαž™αžŠαŸ‚αž›αžαŸ’αž‰αž»αŸ†αž”αž„αŸ’αž€αžΎαžαž…αŸαž‰αž–αžΈαž€αžΆαžšαž‡αž·αŸ‡αžαžΆαž€αŸ‹αžŸαŸŠαžΈαž…αŸ†αž“αž½αž“ 1.1 αž–αžΆαž“αŸ‹αž›αžΆαž“αžŠαž„αž“αŸ…αž‘αžΈαž€αŸ’αžšαž»αž„αž‰αžΌαžœαž™αŸ‰αž€αž€αŸ’αž“αž»αž„αžšαž™αŸˆαž–αŸαž›αž”αŸ’αžšαžΆαŸ†αž˜αž½αž™αž†αŸ’αž“αžΆαŸ†αŸ” αž“αŸ…αž›αžΎαž”αŸ’αž›αž€αŸ‹ αž€αžΆαžšαž’αŸ’αžœαžΎαžŠαŸ†αžŽαžΎαžšαžαžΆαž€αŸ‹αžŸαŸŠαžΈαž˜αž½αž™αž–αžΆαž“αŸ‹αž›αžΆαž“αž“αŸ… Redshift αž›αž˜αŸ’αž’αž·αžαž’αŸ†αž–αžΈαžšαž”αŸ€αž”αžŠαŸ‚αž›αžαŸ’αž‰αž»αŸ†αž”αŸ’αžšαž˜αžΌαž›αžŸαŸ†αžŽαž»αŸ†αž‘αž·αž“αŸ’αž“αž“αŸαž™αž“αŸαŸ‡αŸ” αž–αž½αž€αžœαžΆαžαŸ’αžšαžΌαžœαž”αžΆαž“αžšαž€αŸ’αžŸαžΆαž‘αž»αž€αž€αŸ’αž“αž»αž„ AWS S3 αžŠαžΌαž…αŸ’αž“αŸαŸ‡αžαŸ’αž‰αž»αŸ†αž“αžΉαž„αž€αŸ†αžŽαžαŸ‹αžšαž…αž“αžΆαžŸαž˜αŸ’αž–αŸαž“αŸ’αž’ AWS CLI αž‡αžΆαž˜αž½αž™αž“αžΉαž„αž€αžΆαžšαž…αžΌαž›αž”αŸ’αžšαžΎ αž“αž·αž„αžŸαŸ„αžŸαž˜αŸ’αž„αžΆαžαŸ‹αžšαž”αžŸαŸ‹αžαŸ’αž‰αž»αŸ†αŸ”

$ sudo apt update
$ sudo apt install awscli
$ aws configure

αžαŸ’αž‰αž»αŸ†αž“αžΉαž„αž€αŸ†αžŽαžαŸ‹αžŠαŸ‚αž“αž€αŸ†αžŽαžαŸ‹αžŸαŸ†αžŽαžΎαžŸαŸ’αžšαž”αž‚αŸ’αž“αžΆαžšαž”αžŸαŸ‹αž’αžαž·αžαž·αž‡αž“αžŠαž›αŸ‹ 100 αžŠαžΎαž˜αŸ’αž”αžΈαž±αŸ’αž™αž―αž€αžŸαžΆαžšαž‘αžΆαž‰αž™αž€αž›αžΏαž“αž‡αžΆαž„αž€αžΆαžšαž€αŸ†αžŽαžαŸ‹αž›αŸ†αž“αžΆαŸ†αžŠαžΎαž˜αŸ”

$ aws configure set 
    default.s3.max_concurrent_requests 
    100

αžαŸ’αž‰αž»αŸ†αž“αžΉαž„αž‘αžΆαž‰αž™αž€αžŸαŸ†αžŽαž»αŸ†αž‘αž·αž“αŸ’αž“αž“αŸαž™αž‡αž·αŸ‡αžαžΆαž€αŸ‹αžŸαŸŠαžΈαž–αžΈ AWS S3 αž αžΎαž™αžšαž€αŸ’αžŸαžΆαž‘αž»αž€αžœαžΆαž“αŸ…αž›αžΎαžŠαŸ’αžšαžΆαž™ NVMe αž“αŸ…αž›αžΎαž˜αŸ‰αžΆαžŸαŸŠαžΈαž“αž˜αŸαžŠαŸ†αž”αžΌαž„αŸ” αžŸαŸ†αžŽαž»αŸ†αž‘αž·αž“αŸ’αž“αž“αŸαž™αž“αŸαŸ‡αž‚αžΊ ~104GB αž‡αžΆαž‘αž˜αŸ’αžšαž„αŸ‹ CSV αžŠαŸ‚αž›αž”αžΆαž“αž”αž„αŸ’αž αžΆαž”αŸ‹ GZIP αŸ”

$ sudo mkdir -p /ch/csv
$ sudo chown -R ubuntu /ch/csv
$ aws s3 sync s3://<bucket>/csv /ch/csv

αž€αžΆαžšαžŠαŸ†αž‘αžΎαž„ ClickHouse

αžαŸ’αž‰αž»αŸ†αž“αžΉαž„αžŠαŸ†αž‘αžΎαž„αž€αžΆαžšαž…αŸ‚αž€αž…αžΆαž™ OpenJDK αžŸαž˜αŸ’αžšαžΆαž”αŸ‹ Java 8 αžŠαžΌαž…αžŠαŸ‚αž›αžœαžΆαžαŸ’αžšαžΌαžœαž”αžΆαž“αž‘αžΆαž˜αž‘αžΆαžšαžŠαžΎαž˜αŸ’αž”αžΈαžŠαŸ†αžŽαžΎαžšαž€αžΆαžšαž€αž˜αŸ’αž˜αžœαž·αž’αžΈ Apache ZooKeeper αžŠαŸ‚αž›αžαŸ’αžšαžΌαžœαž”αžΆαž“αž‘αžΆαž˜αž‘αžΆαžšαžŸαž˜αŸ’αžšαžΆαž”αŸ‹αž€αžΆαžšαžŠαŸ†αž‘αžΎαž„αž…αŸ‚αž€αž…αžΆαž™ ClickHouse αž“αŸ…αž›αžΎαž˜αŸ‰αžΆαžŸαŸŠαžΈαž“αž‘αžΆαŸ†αž„αž”αžΈαŸ”

$ sudo apt update
$ sudo apt install 
    openjdk-8-jre 
    openjdk-8-jdk-headless

αž”αž“αŸ’αž‘αžΆαž”αŸ‹αž˜αž€αžαŸ’αž‰αž»αŸ†αž€αŸ†αžŽαžαŸ‹αž’αžαŸαžšαž”αžšαž·αžŸαŸ’αžαžΆαž“ JAVA_HOME.

$ sudo vi /etc/profile
 
export JAVA_HOME=/usr
 
$ source /etc/profile

αž”αž“αŸ’αž‘αžΆαž”αŸ‹αž˜αž€αžαŸ’αž‰αž»αŸ†αž“αžΉαž„αž”αŸ’αžšαžΎαž”αŸ’αžšαž–αŸαž“αŸ’αž’αž‚αŸ’αžšαž”αŸ‹αž‚αŸ’αžšαž„αž€αž‰αŸ’αž…αž”αŸ‹αžšαž”αžŸαŸ‹ Ubuntu αžŠαžΎαž˜αŸ’αž”αžΈαžŠαŸ†αž‘αžΎαž„ ClickHouse 18.16.1, glances αž“αž·αž„ ZooKeeper αž“αŸ…αž›αžΎαž˜αŸ‰αžΆαžŸαŸŠαžΈαž“αž‘αžΆαŸ†αž„αž”αžΈαŸ”

$ sudo apt-key adv 
    --keyserver hkp://keyserver.ubuntu.com:80 
    --recv E0C56BD4
$ echo "deb http://repo.yandex.ru/clickhouse/deb/stable/ main/" | 
    sudo tee /etc/apt/sources.list.d/clickhouse.list
$ sudo apt-get update

$ sudo apt install 
    clickhouse-client 
    clickhouse-server 
    glances 
    zookeeperd

αžαŸ’αž‰αž»αŸ†β€‹αž“αžΉαž„β€‹αž”αž„αŸ’αž€αžΎαžβ€‹αžαžβ€‹αžŸαž˜αŸ’αžšαžΆαž”αŸ‹ ClickHouse αž αžΎαž™β€‹αž€αŸβ€‹αž’αŸ’αžœαžΎβ€‹αž€αžΆαžšβ€‹αž”αžŠαž·αžŸαŸαž’β€‹αž€αžΆαžšβ€‹αž€αŸ†αžŽαžαŸ‹β€‹αžšαž…αž“αžΆαžŸαž˜αŸ’αž–αŸαž“αŸ’αž’β€‹αž˜αž½αž™β€‹αž…αŸ†αž“αž½αž“β€‹αž“αŸ…β€‹αž›αžΎβ€‹ servers αž‘αžΆαŸ†αž„β€‹αž”αžΈαŸ”

$ sudo mkdir /ch/clickhouse
$ sudo chown -R clickhouse /ch/clickhouse

$ sudo mkdir -p /etc/clickhouse-server/conf.d
$ sudo vi /etc/clickhouse-server/conf.d/taxis.conf

αž‘αžΆαŸ†αž„αž“αŸαŸ‡αž‚αžΊαž‡αžΆαž€αžΆαžšαž€αŸ†αžŽαžαŸ‹αžšαž…αž“αžΆαžŸαž˜αŸ’αž–αŸαž“αŸ’αž’αž”αžŠαž·αžŸαŸαž’αžŠαŸ‚αž›αžαŸ’αž‰αž»αŸ†αž“αžΉαž„αž”αŸ’αžšαžΎαŸ”

<?xml version="1.0"?>
<yandex>
    <listen_host>0.0.0.0</listen_host>
    <path>/ch/clickhouse/</path>

 <remote_servers>
        <perftest_3shards>
            <shard>
                <replica>
                    <host>172.30.2.192</host>
                    <port>9000</port>
                 </replica>
            </shard>
            <shard>
                 <replica>
                    <host>172.30.2.162</host>
                    <port>9000</port>
                 </replica>
            </shard>
            <shard>
                 <replica>
                    <host>172.30.2.36</host>
                    <port>9000</port>
                 </replica>
            </shard>
        </perftest_3shards>
    </remote_servers>

  <zookeeper-servers>
        <node>
            <host>172.30.2.192</host>
            <port>2181</port>
        </node>
        <node>
            <host>172.30.2.162</host>
            <port>2181</port>
        </node>
        <node>
            <host>172.30.2.36</host>
            <port>2181</port>
        </node>
    </zookeeper-servers>

 <macros>
        <shard>03</shard>
        <replica>01</replica>
    </macros>
</yandex>

αž”αž“αŸ’αž‘αžΆαž”αŸ‹αž˜αž€αžαŸ’αž‰αž»αŸ†αž“αžΉαž„αžŠαŸ†αžŽαžΎαžšαž€αžΆαžš ZooKeeper αž“αž·αž„αž˜αŸ‰αžΆαžŸαŸŠαžΈαž“αž˜αŸ ClickHouse αž“αŸ…αž›αžΎαž˜αŸ‰αžΆαžŸαŸŠαžΈαž“αž‘αžΆαŸ†αž„αž”αžΈαŸ”

$ sudo /etc/init.d/zookeeper start
$ sudo service clickhouse-server start

αž€αŸ†αž–αž»αž„αž•αŸ’αž‘αž»αž€αž‘αž·αž“αŸ’αž“αž“αŸαž™αž‘αŸ… ClickHouse

αž“αŸ…αž›αžΎαž˜αŸ‰αžΆαžŸαŸŠαžΈαž“αž˜αŸαžŠαŸ†αž”αžΌαž„ αžαŸ’αž‰αž»αŸ†αž“αžΉαž„αž”αž„αŸ’αž€αžΎαžαžαžΆαžšαžΆαž„αž’αŸ’αžœαžΎαžŠαŸ†αžŽαžΎαžš (trips) αžŠαŸ‚αž›αž“αžΉαž„αžšαž€αŸ’αžŸαžΆαž‘αž»αž€αžŸαŸ†αžŽαž»αŸ†αž‘αž·αž“αŸ’αž“αž“αŸαž™αž“αŸƒαž€αžΆαžšαž’αŸ’αžœαžΎαžŠαŸ†αžŽαžΎαžšαžαžΆαž€αŸ‹αžŸαŸŠαžΈαžŠαŸ„αž™αž”αŸ’αžšαžΎαž˜αŸ‰αžΆαžŸαŸŠαžΈαž“ Log αŸ”

$ clickhouse-client --host=0.0.0.0
 
CREATE TABLE trips (
    trip_id                 UInt32,
    vendor_id               String,

    pickup_datetime         DateTime,
    dropoff_datetime        Nullable(DateTime),

    store_and_fwd_flag      Nullable(FixedString(1)),
    rate_code_id            Nullable(UInt8),
    pickup_longitude        Nullable(Float64),
    pickup_latitude         Nullable(Float64),
    dropoff_longitude       Nullable(Float64),
    dropoff_latitude        Nullable(Float64),
    passenger_count         Nullable(UInt8),
    trip_distance           Nullable(Float64),
    fare_amount             Nullable(Float32),
    extra                   Nullable(Float32),
    mta_tax                 Nullable(Float32),
    tip_amount              Nullable(Float32),
    tolls_amount            Nullable(Float32),
    ehail_fee               Nullable(Float32),
    improvement_surcharge   Nullable(Float32),
    total_amount            Nullable(Float32),
    payment_type            Nullable(String),
    trip_type               Nullable(UInt8),
    pickup                  Nullable(String),
    dropoff                 Nullable(String),

    cab_type                Nullable(String),

    precipitation           Nullable(Int8),
    snow_depth              Nullable(Int8),
    snowfall                Nullable(Int8),
    max_temperature         Nullable(Int8),
    min_temperature         Nullable(Int8),
    average_wind_speed      Nullable(Int8),

    pickup_nyct2010_gid     Nullable(Int8),
    pickup_ctlabel          Nullable(String),
    pickup_borocode         Nullable(Int8),
    pickup_boroname         Nullable(String),
    pickup_ct2010           Nullable(String),
    pickup_boroct2010       Nullable(String),
    pickup_cdeligibil       Nullable(FixedString(1)),
    pickup_ntacode          Nullable(String),
    pickup_ntaname          Nullable(String),
    pickup_puma             Nullable(String),

    dropoff_nyct2010_gid    Nullable(UInt8),
    dropoff_ctlabel         Nullable(String),
    dropoff_borocode        Nullable(UInt8),
    dropoff_boroname        Nullable(String),
    dropoff_ct2010          Nullable(String),
    dropoff_boroct2010      Nullable(String),
    dropoff_cdeligibil      Nullable(String),
    dropoff_ntacode         Nullable(String),
    dropoff_ntaname         Nullable(String),
    dropoff_puma            Nullable(String)
) ENGINE = Log;

αž”αž“αŸ’αž‘αžΆαž”αŸ‹αž˜αž€αžαŸ’αž‰αž»αŸ†αžŸαŸ’αžšαž„αŸ‹αž…αŸαž‰ αž“αž·αž„αž•αŸ’αž‘αž»αž€αž―αž€αžŸαžΆαžš CSV αž“αžΈαž˜αž½αž™αŸ—αž‘αŸ…αž€αŸ’αž“αž»αž„αžαžΆαžšαžΆαž„αž’αŸ’αžœαžΎαžŠαŸ†αžŽαžΎαžš (trips) αžαžΆαž„αž€αŸ’αžšαŸ„αž˜αž“αŸαŸ‡αžαŸ’αžšαžΌαžœαž”αžΆαž“αž”αž‰αŸ’αž…αž”αŸ‹αž€αŸ’αž“αž»αž„αžšαž™αŸˆαž–αŸαž› 55 αž“αžΆαž‘αžΈ αž“αž·αž„ 10 αžœαž·αž“αžΆαž‘αžΈαŸ” αž”αž“αŸ’αž‘αžΆαž”αŸ‹αž–αžΈαž”αŸ’αžšαžαž·αž”αžαŸ’αžαž·αž€αžΆαžšαž“αŸαŸ‡αž‘αŸ†αž αŸ†αž“αŸƒαžαžαž‘αž·αž“αŸ’αž“αž“αŸαž™αž‚αžΊ 134 GB αŸ”

$ time (for FILENAME in /ch/csv/trips_x*.csv.gz; do
            echo $FILENAME
            gunzip -c $FILENAME | 
                clickhouse-client 
                    --host=0.0.0.0 
                    --query="INSERT INTO trips FORMAT CSV"
        done)

αž›αŸ’αž”αžΏαž“αž“αžΆαŸ†αž…αžΌαž›αž‚αžΊ 155 MB αž“αŸƒαž˜αžΆαžαž·αž€αžΆ CSV αžŠαŸ‚αž›αž˜αž·αž“αž”αžΆαž“αž”αž„αŸ’αž αžΆαž”αŸ‹αž€αŸ’αž“αž»αž„αž˜αž½αž™αžœαž·αž“αžΆαž‘αžΈαŸ” αžαŸ’αž‰αž»αŸ†β€‹αžŸαž„αŸ’αžŸαŸαž™β€‹αžαžΆβ€‹αž“αŸαŸ‡β€‹αž˜αž€β€‹αž–αžΈβ€‹αž”αž‰αŸ’αž αžΆβ€‹αžŸαŸ’αž‘αŸ‡β€‹αž€αŸ’αž“αž»αž„β€‹αž€αžΆαžšβ€‹αž”αž„αŸ’αž αžΆαž”αŸ‹ GZIP αŸ” αžœαžΆαž”αŸ’αžšαž αŸ‚αž›αž‡αžΆαž›αžΏαž“αž‡αžΆαž„αž˜αž»αž“αž€αŸ’αž“αž»αž„αž€αžΆαžšαž–αž“αŸ’αž›αžΆαž―αž€αžŸαžΆαžš gzipped αž‘αžΆαŸ†αž„αž’αžŸαŸ‹αžŸαŸ’αžšαž”αž‚αŸ’αž“αžΆαžŠαŸ„αž™αž”αŸ’αžšαžΎ xargs αž αžΎαž™αž”αž“αŸ’αž‘αžΆαž”αŸ‹αž˜αž€αž•αŸ’αž‘αž»αž€αž‘αž·αž“αŸ’αž“αž“αŸαž™αžŠαŸ‚αž›αž–αž“αŸ’αž›αžΆαŸ” αžαžΆαž„αž€αŸ’αžšαŸ„αž˜αž“αŸαŸ‡αž‚αžΊαž‡αžΆαž€αžΆαžšαž–αž·αž–αžŽαŸŒαž“αžΆαž’αŸ†αž–αžΈαž’αŸ’αžœαžΈαžŠαŸ‚αž›αžαŸ’αžšαžΌαžœαž”αžΆαž“αžšαžΆαž™αž€αžΆαžšαžŽαŸαž€αŸ’αž“αž»αž„αž’αŸ†αž‘αž»αž„αž–αŸαž›αžŠαŸ†αžŽαžΎαžšαž€αžΆαžšαž“αžΆαŸ†αž…αžΌαž› CSV αŸ”

$ sudo glances

ip-172-30-2-200 (Ubuntu 16.04 64bit / Linux 4.4.0-1072-aws)                                                                                                 Uptime: 0:11:42
CPU       8.2%  nice:     0.0%                           LOAD    36-core                           MEM      9.8%  active:    5.20G                           SWAP      0.0%
user:     6.0%  irq:      0.0%                           1 min:    2.24                            total:  68.7G  inactive:  61.0G                           total:       0
system:   0.9%  iowait:   1.3%                           5 min:    1.83                            used:   6.71G  buffers:   66.4M                           used:        0
idle:    91.8%  steal:    0.0%                           15 min:   1.01                            free:   62.0G  cached:    61.6G                           free:        0

NETWORK     Rx/s   Tx/s   TASKS 370 (507 thr), 2 run, 368 slp, 0 oth sorted automatically by cpu_percent, flat view
ens5        136b    2Kb
lo         343Mb  343Mb     CPU%  MEM%  VIRT   RES   PID USER        NI S    TIME+ IOR/s IOW/s Command
                           100.4   1.5 1.65G 1.06G  9909 ubuntu       0 S  1:01.33     0     0 clickhouse-client --host=0.0.0.0 --query=INSERT INTO trips FORMAT CSV
DISK I/O     R/s    W/s     85.1   0.0 4.65M  708K  9908 ubuntu       0 R  0:50.60   32M     0 gzip -d -c /ch/csv/trips_xac.csv.gz
loop0          0      0     54.9   5.1 8.14G 3.49G  8091 clickhous    0 S  1:44.23     0   45M /usr/bin/clickhouse-server --config=/etc/clickhouse-server/config.xml
loop1          0      0      4.5   0.0     0     0   319 root         0 S  0:07.50    1K     0 kworker/u72:2
nvme0n1        0     3K      2.3   0.0 91.1M 28.9M  9912 root         0 R  0:01.56     0     0 /usr/bin/python3 /usr/bin/glances
nvme0n1p1      0     3K      0.3   0.0     0     0   960 root       -20 S  0:00.10     0     0 kworker/28:1H
nvme1n1    32.1M   495M      0.3   0.0     0     0  1058 root       -20 S  0:00.90     0     0 kworker/23:1H

αžαŸ’αž‰αž»αŸ†αž“αžΉαž„αž”αž„αŸ’αž€αžΎαž“αž‘αŸ†αž αŸ†αž•αŸ’αž‘αž»αž€αž“αŸ…αž›αžΎαžŠαŸ’αžšαžΆαž™ NVMe αžŠαŸ„αž™αž›αž»αž”αž―αž€αžŸαžΆαžš CSV αžŠαžΎαž˜αž˜αž»αž“αž–αŸαž›αž”αž“αŸ’αžαŸ”

$ sudo rm -fr /ch/csv

αž”αŸ†αž”αŸ’αž›αŸ‚αž„αž‘αŸ…αž‡αžΆαž‘αž˜αŸ’αžšαž„αŸ‹αž‡αž½αžšαžˆαžš

αž˜αŸ‰αžΆαžŸαŸŠαžΈαž“ Log ClickHouse αž“αžΉαž„αžšαž€αŸ’αžŸαžΆαž‘αž»αž€αž‘αž·αž“αŸ’αž“αž“αŸαž™αž€αŸ’αž“αž»αž„αž‘αž˜αŸ’αžšαž„αŸ‹αžαž˜αŸ’αžšαž„αŸ‹αž‡αž½αžšαŸ” αžŠαžΎαž˜αŸ’αž”αžΈαžŸαžΆαž€αžŸαž½αžšαž‘αž·αž“αŸ’αž“αž“αŸαž™αž€αžΆαž“αŸ‹αžαŸ‚αž›αžΏαž“ αžαŸ’αž‰αž»αŸ†αž”αžΆαž“αž”αŸ†αž”αŸ’αž›αŸ‚αž„αžœαžΆαž‘αŸ…αž‡αžΆαž‘αž˜αŸ’αžšαž„αŸ‹αž‡αž½αžšαžˆαžšαžŠαŸ„αž™αž”αŸ’αžšαžΎαž˜αŸ‰αžΆαžŸαŸŠαžΈαž“ MergeTreeαŸ”

$ clickhouse-client --host=0.0.0.0

αžαžΆαž„αž€αŸ’αžšαŸ„αž˜αž“αŸαŸ‡αžαŸ’αžšαžΌαžœαž”αžΆαž“αž”αž‰αŸ’αž…αž”αŸ‹αž€αŸ’αž“αž»αž„αžšαž™αŸˆαž–αŸαž› 34 αž“αžΆαž‘αžΈ αž“αž·αž„ 50 αžœαž·αž“αžΆαž‘αžΈαŸ” αž”αž“αŸ’αž‘αžΆαž”αŸ‹αž–αžΈαž”αŸ’αžšαžαž·αž”αžαŸ’αžαž·αž€αžΆαžšαž“αŸαŸ‡αž‘αŸ†αž αŸ†αž“αŸƒαžαžαž‘αž·αž“αŸ’αž“αž“αŸαž™αž‚αžΊ 237 GB αŸ”

CREATE TABLE trips_mergetree
    ENGINE = MergeTree(pickup_date, pickup_datetime, 8192)
    AS SELECT
        trip_id,
        CAST(vendor_id AS Enum8('1' = 1,
                                '2' = 2,
                                'CMT' = 3,
                                'VTS' = 4,
                                'DDS' = 5,
                                'B02512' = 10,
                                'B02598' = 11,
                                'B02617' = 12,
                                'B02682' = 13,
                                'B02764' = 14)) AS vendor_id,
        toDate(pickup_datetime)                 AS pickup_date,
        ifNull(pickup_datetime, toDateTime(0))  AS pickup_datetime,
        toDate(dropoff_datetime)                AS dropoff_date,
        ifNull(dropoff_datetime, toDateTime(0)) AS dropoff_datetime,
        assumeNotNull(store_and_fwd_flag)       AS store_and_fwd_flag,
        assumeNotNull(rate_code_id)             AS rate_code_id,

        assumeNotNull(pickup_longitude)         AS pickup_longitude,
        assumeNotNull(pickup_latitude)          AS pickup_latitude,
        assumeNotNull(dropoff_longitude)        AS dropoff_longitude,
        assumeNotNull(dropoff_latitude)         AS dropoff_latitude,
        assumeNotNull(passenger_count)          AS passenger_count,
        assumeNotNull(trip_distance)            AS trip_distance,
        assumeNotNull(fare_amount)              AS fare_amount,
        assumeNotNull(extra)                    AS extra,
        assumeNotNull(mta_tax)                  AS mta_tax,
        assumeNotNull(tip_amount)               AS tip_amount,
        assumeNotNull(tolls_amount)             AS tolls_amount,
        assumeNotNull(ehail_fee)                AS ehail_fee,
        assumeNotNull(improvement_surcharge)    AS improvement_surcharge,
        assumeNotNull(total_amount)             AS total_amount,
        assumeNotNull(payment_type)             AS payment_type_,
        assumeNotNull(trip_type)                AS trip_type,

        pickup AS pickup,
        pickup AS dropoff,

        CAST(assumeNotNull(cab_type)
            AS Enum8('yellow' = 1, 'green' = 2))
                                AS cab_type,

        precipitation           AS precipitation,
        snow_depth              AS snow_depth,
        snowfall                AS snowfall,
        max_temperature         AS max_temperature,
        min_temperature         AS min_temperature,
        average_wind_speed      AS average_wind_speed,

        pickup_nyct2010_gid     AS pickup_nyct2010_gid,
        pickup_ctlabel          AS pickup_ctlabel,
        pickup_borocode         AS pickup_borocode,
        pickup_boroname         AS pickup_boroname,
        pickup_ct2010           AS pickup_ct2010,
        pickup_boroct2010       AS pickup_boroct2010,
        pickup_cdeligibil       AS pickup_cdeligibil,
        pickup_ntacode          AS pickup_ntacode,
        pickup_ntaname          AS pickup_ntaname,
        pickup_puma             AS pickup_puma,

        dropoff_nyct2010_gid    AS dropoff_nyct2010_gid,
        dropoff_ctlabel         AS dropoff_ctlabel,
        dropoff_borocode        AS dropoff_borocode,
        dropoff_boroname        AS dropoff_boroname,
        dropoff_ct2010          AS dropoff_ct2010,
        dropoff_boroct2010      AS dropoff_boroct2010,
        dropoff_cdeligibil      AS dropoff_cdeligibil,
        dropoff_ntacode         AS dropoff_ntacode,
        dropoff_ntaname         AS dropoff_ntaname,
        dropoff_puma            AS dropoff_puma
    FROM trips;

αž“αŸαŸ‡β€‹αž‡αžΆβ€‹αž’αŸ’αžœαžΈβ€‹αžŠαŸ‚αž›β€‹αž›αž‘αŸ’αž’αž•αž›β€‹αž˜αžΎαž›β€‹αž‘αŸ…β€‹αžŠαžΌαž…β€‹αž‡αžΆβ€‹αž’αŸ†αž‘αž»αž„β€‹αž–αŸαž›β€‹αž”αŸ’αžšαžαž·αž”αžαŸ’αžαž·αž€αžΆαžšαŸ–

ip-172-30-2-200 (Ubuntu 16.04 64bit / Linux 4.4.0-1072-aws)                                                                                                 Uptime: 1:06:09
CPU      10.3%  nice:     0.0%                           LOAD    36-core                           MEM     16.1%  active:    13.3G                           SWAP      0.0%
user:     7.9%  irq:      0.0%                           1 min:    1.87                            total:  68.7G  inactive:  52.8G                           total:       0
system:   1.6%  iowait:   0.8%                           5 min:    1.76                            used:   11.1G  buffers:   71.8M                           used:        0
idle:    89.7%  steal:    0.0%                           15 min:   1.95                            free:   57.6G  cached:    57.2G                           free:        0

NETWORK     Rx/s   Tx/s   TASKS 367 (523 thr), 1 run, 366 slp, 0 oth sorted automatically by cpu_percent, flat view
ens5         1Kb    8Kb
lo           2Kb    2Kb     CPU%  MEM%  VIRT   RES   PID USER        NI S    TIME+ IOR/s IOW/s Command
                           241.9  12.8 20.7G 8.78G  8091 clickhous    0 S 30:36.73   34M  125M /usr/bin/clickhouse-server --config=/etc/clickhouse-server/config.xml
DISK I/O     R/s    W/s      2.6   0.0 90.4M 28.3M  9948 root         0 R  1:18.53     0     0 /usr/bin/python3 /usr/bin/glances
loop0          0      0      1.3   0.0     0     0   203 root         0 S  0:09.82     0     0 kswapd0
loop1          0      0      0.3   0.1  315M 61.3M 15701 ubuntu       0 S  0:00.40     0     0 clickhouse-client --host=0.0.0.0
nvme0n1        0     3K      0.3   0.0     0     0     7 root         0 S  0:00.83     0     0 rcu_sched
nvme0n1p1      0     3K      0.0   0.0     0     0   142 root         0 S  0:00.22     0     0 migration/27
nvme1n1    25.8M   330M      0.0   0.0 59.7M 1.79M  2764 ubuntu       0 S  0:00.00     0     0 (sd-pam)

αž“αŸ…αž€αŸ’αž“αž»αž„αž€αžΆαžšαž’αŸ’αžœαžΎαžαŸαžŸαŸ’αžαž…αž»αž„αž€αŸ’αžšαŸ„αž™ αž‡αž½αžšαžˆαžšαž‡αžΆαž…αŸ’αžšαžΎαž“αžαŸ’αžšαžΌαžœαž”αžΆαž“αž”αŸ†αž”αŸ’αž›αŸ‚αž„ αž“αž·αž„αž‚αžŽαž“αžΆαž‘αžΎαž„αžœαž·αž‰αŸ” αžαŸ’αž‰αž»αŸ†αž”αžΆαž“αžšαž€αžƒαžΎαž‰αžαžΆαž˜αž»αžαž„αžΆαžšαž‘αžΆαŸ†αž„αž“αŸαŸ‡αž˜αž½αž™αž…αŸ†αž“αž½αž“αž›αŸ‚αž„αžŠαŸ†αžŽαžΎαžšαž€αžΆαžšαžŠαžΌαž…αž€αžΆαžšαžšαŸ†αž–αžΉαž„αž‘αž»αž€αž“αŸ…αž›αžΎαžŸαŸ†αžŽαž»αŸ†αž‘αž·αž“αŸ’αž“αž“αŸαž™αž“αŸαŸ‡αž‘αŸ€αžαž αžΎαž™αŸ” αžŠαžΎαž˜αŸ’αž”αžΈαžŠαŸ„αŸ‡αžŸαŸ’αžšαžΆαž™αž”αž‰αŸ’αž αžΆαž“αŸαŸ‡ αžαŸ’αž‰αž»αŸ†αž”αžΆαž“αžŠαž€αž˜αž»αžαž„αžΆαžšαž˜αž·αž“αžŸαž˜αžšαž˜αŸ’αž™αž…αŸαž‰ αž αžΎαž™αž•αŸ’αž‘αž»αž€αž‘αž·αž“αŸ’αž“αž“αŸαž™αžŠαŸ„αž™αž˜αž·αž“αž”αŸ†αž”αŸ’αž›αŸ‚αž„αž‘αŸ…αž‡αžΆαž”αŸ’αžšαž—αŸαž‘αž€αŸ’αžšαž‘αžΆαž”αž“αŸ’αžαŸ‚αž˜αŸ”

αž€αžΆαžšαž…αŸ‚αž€αž…αžΆαž™αž‘αž·αž“αŸ’αž“αž“αŸαž™αž“αŸ…αž‘αžΌαž‘αžΆαŸ†αž„αž…αž„αŸ’αž€αŸ„αž˜

αžαŸ’αž‰αž»αŸ†β€‹αž“αžΉαž„β€‹αž…αŸ‚αž€αž…αžΆαž™β€‹αž‘αž·αž“αŸ’αž“αž“αŸαž™β€‹αž“αŸ…β€‹αž‘αžΌαž‘αžΆαŸ†αž„β€‹αžαŸ’αž“αžΆαŸ†αž„β€‹αž…αž„αŸ’αž€αŸ„αž˜β€‹αž‘αžΆαŸ†αž„β€‹αž”αžΈαŸ” αžŠαžΎαž˜αŸ’αž”αžΈαž…αžΆαž”αŸ‹αž•αŸ’αžαžΎαž˜ αžαžΆαž„αž€αŸ’αžšαŸ„αž˜αžαŸ’αž‰αž»αŸ†αž“αžΉαž„αž”αž„αŸ’αž€αžΎαžαžαžΆαžšαžΆαž„αž˜αž½αž™αž“αŸ…αž›αžΎαž˜αŸ‰αžΆαžŸαŸŠαžΈαž“αž‘αžΆαŸ†αž„αž”αžΈαŸ”

$ clickhouse-client --host=0.0.0.0

CREATE TABLE trips_mergetree_third (
    trip_id                 UInt32,
    vendor_id               String,
    pickup_date             Date,
    pickup_datetime         DateTime,
    dropoff_date            Date,
    dropoff_datetime        Nullable(DateTime),
    store_and_fwd_flag      Nullable(FixedString(1)),
    rate_code_id            Nullable(UInt8),
    pickup_longitude        Nullable(Float64),
    pickup_latitude         Nullable(Float64),
    dropoff_longitude       Nullable(Float64),
    dropoff_latitude        Nullable(Float64),
    passenger_count         Nullable(UInt8),
    trip_distance           Nullable(Float64),
    fare_amount             Nullable(Float32),
    extra                   Nullable(Float32),
    mta_tax                 Nullable(Float32),
    tip_amount              Nullable(Float32),
    tolls_amount            Nullable(Float32),
    ehail_fee               Nullable(Float32),
    improvement_surcharge   Nullable(Float32),
    total_amount            Nullable(Float32),
    payment_type            Nullable(String),
    trip_type               Nullable(UInt8),
    pickup                  Nullable(String),
    dropoff                 Nullable(String),

    cab_type                Nullable(String),

    precipitation           Nullable(Int8),
    snow_depth              Nullable(Int8),
    snowfall                Nullable(Int8),
    max_temperature         Nullable(Int8),
    min_temperature         Nullable(Int8),
    average_wind_speed      Nullable(Int8),

    pickup_nyct2010_gid     Nullable(Int8),
    pickup_ctlabel          Nullable(String),
    pickup_borocode         Nullable(Int8),
    pickup_boroname         Nullable(String),
    pickup_ct2010           Nullable(String),
    pickup_boroct2010       Nullable(String),
    pickup_cdeligibil       Nullable(FixedString(1)),
    pickup_ntacode          Nullable(String),
    pickup_ntaname          Nullable(String),
    pickup_puma             Nullable(String),

    dropoff_nyct2010_gid    Nullable(UInt8),
    dropoff_ctlabel         Nullable(String),
    dropoff_borocode        Nullable(UInt8),
    dropoff_boroname        Nullable(String),
    dropoff_ct2010          Nullable(String),
    dropoff_boroct2010      Nullable(String),
    dropoff_cdeligibil      Nullable(String),
    dropoff_ntacode         Nullable(String),
    dropoff_ntaname         Nullable(String),
    dropoff_puma            Nullable(String)
) ENGINE = MergeTree(pickup_date, pickup_datetime, 8192);

αž”αž“αŸ’αž‘αžΆαž”αŸ‹αž˜αž€αžαŸ’αž‰αž»αŸ†αž“αžΉαž„αž’αŸ’αžœαžΎαž±αŸ’αž™αž”αŸ’αžšαžΆαž€αžŠαžαžΆαž˜αŸ‰αžΆαžŸαŸŠαžΈαž“αž˜αŸαžŠαŸ†αž”αžΌαž„αž’αžΆαž…αž˜αžΎαž›αžƒαžΎαž‰αžαŸ’αž“αžΆαŸ†αž„αž‘αžΆαŸ†αž„αž”αžΈαž“αŸ…αž€αŸ’αž“αž»αž„αž…αž„αŸ’αž€αŸ„αž˜αŸ”

SELECT *
FROM system.clusters
WHERE cluster = 'perftest_3shards'
FORMAT Vertical;
Row 1:
──────
cluster:          perftest_3shards
shard_num:        1
shard_weight:     1
replica_num:      1
host_name:        172.30.2.192
host_address:     172.30.2.192
port:             9000
is_local:         1
user:             default
default_database:
Row 2:
──────
cluster:          perftest_3shards
shard_num:        2
shard_weight:     1
replica_num:      1
host_name:        172.30.2.162
host_address:     172.30.2.162
port:             9000
is_local:         0
user:             default
default_database:

Row 3:
──────
cluster:          perftest_3shards
shard_num:        3
shard_weight:     1
replica_num:      1
host_name:        172.30.2.36
host_address:     172.30.2.36
port:             9000
is_local:         0
user:             default
default_database:

αž”αž“αŸ’αž‘αžΆαž”αŸ‹αž˜αž€αžαŸ’αž‰αž»αŸ†αž“αžΉαž„αž€αŸ†αžŽαžαŸ‹αžαžΆαžšαžΆαž„αžαŸ’αž˜αžΈαž˜αž½αž™αž“αŸ…αž›αžΎαž˜αŸ‰αžΆαžŸαŸŠαžΈαž“αž˜αŸαžŠαŸ†αž”αžΌαž„αžŠαŸ‚αž›αž•αŸ’αž’αŸ‚αž€αž›αžΎαž‚αŸ’αžšαŸ„αž„αž€αžΆαžšαžŽαŸ trips_mergetree_third αž“αž·αž„αž”αŸ’αžšαžΎαž˜αŸ‰αžΆαžŸαŸŠαžΈαž“αž…αŸ‚αž€αž…αžΆαž™αŸ”

CREATE TABLE trips_mergetree_x3
    AS trips_mergetree_third
    ENGINE = Distributed(perftest_3shards,
                         default,
                         trips_mergetree_third,
                         rand());

αž”αž“αŸ’αž‘αžΆαž”αŸ‹αž˜αž€αžαŸ’αž‰αž»αŸ†αž“αžΉαž„αž…αž˜αŸ’αž›αž„αž‘αž·αž“αŸ’αž“αž“αŸαž™αž–αžΈαžαžΆαžšαžΆαž„αžŠαŸ‚αž›αž˜αžΆαž“αž˜αžΌαž›αžŠαŸ’αž‹αžΆαž“αž›αžΎ MergeTree αž‘αŸ…αž€αžΆαž“αŸ‹αž˜αŸ‰αžΆαžŸαŸŠαžΈαž“αž˜αŸαž‘αžΆαŸ†αž„αž”αžΈαŸ” αžαžΆαž„αž€αŸ’αžšαŸ„αž˜αž“αŸαŸ‡αžαŸ’αžšαžΌαžœαž”αžΆαž“αž”αž‰αŸ’αž…αž”αŸ‹αž€αŸ’αž“αž»αž„αžšαž™αŸˆαž–αŸαž› 34 αž“αžΆαž‘αžΈ αž“αž·αž„ 44 αžœαž·αž“αžΆαž‘αžΈαŸ”

INSERT INTO trips_mergetree_x3
    SELECT * FROM trips_mergetree;

αž”αž“αŸ’αž‘αžΆαž”αŸ‹αž–αžΈαž”αŸ’αžšαžαž·αž”αžαŸ’αžαž·αž€αžΆαžšαžαžΆαž„αž›αžΎ αžαŸ’αž‰αž»αŸ†αž”αžΆαž“αž•αŸ’αžαž›αŸ‹αž±αŸ’αž™ ClickHouse 15 αž“αžΆαž‘αžΈαžŠαžΎαž˜αŸ’αž”αžΈαž•αŸ’αž›αžΆαžŸαŸ‹αž‘αžΈαž†αŸ’αž„αžΆαž™αž–αžΈαžŸαž‰αŸ’αž‰αžΆαž€αž˜αŸ’αžšαž·αžαž•αŸ’αž‘αž»αž€αž’αžαž·αž”αžšαž˜αžΆαŸ” αžαžαž‘αž·αž“αŸ’αž“αž“αŸαž™αž”αžΆαž“αž”αž‰αŸ’αž…αž”αŸ‹αž‚αžΊ 264 GB, 34 GB αž“αž·αž„ 33 GB αžšαŸ€αž„αž‚αŸ’αž“αžΆαž“αŸ…αž›αžΎαž˜αŸ‰αžΆαžŸαŸŠαžΈαž“αž˜αŸαž‘αžΆαŸ†αž„αž”αžΈαŸ”

αž€αžΆαžšαžœαžΆαž™αžαž˜αŸ’αž›αŸƒαž€αžΆαžšαž’αž“αž»αžœαžαŸ’αžαž…αž„αŸ’αž€αŸ„αž˜ ClickHouse

αž’αŸ’αžœαžΈβ€‹αžŠαŸ‚αž›β€‹αžαŸ’αž‰αž»αŸ†β€‹αž”αžΆαž“β€‹αžƒαžΎαž‰β€‹αž”αž“αŸ’αž‘αžΆαž”αŸ‹β€‹αž‚αžΊβ€‹αž‡αžΆβ€‹αž–αŸαž›β€‹αžœαŸαž›αžΆβ€‹αž›αžΏαž“β€‹αž”αŸ†αž•αž»αžβ€‹αžŠαŸ‚αž›β€‹αžαŸ’αž‰αž»αŸ†β€‹αž”αžΆαž“β€‹αžƒαžΎαž‰β€‹αžšαžαŸ‹β€‹αžŸαŸ†αžŽαž½αžšβ€‹αž“αžΈαž˜αž½αž™αŸ—β€‹αž›αžΎβ€‹αžαžΆαžšαžΆαž„β€‹αž…αŸ’αžšαžΎαž“β€‹αžŠαž„ trips_mergetree_x3.

$ clickhouse-client --host=0.0.0.0

αžαžΆαž„αž€αŸ’αžšαŸ„αž˜αž“αŸαŸ‡αž”αžΆαž“αž”αž‰αŸ’αž…αž”αŸ‹αž€αŸ’αž“αž»αž„αžšαž™αŸˆαž–αŸαž› 2.449 αžœαž·αž“αžΆαž‘αžΈαŸ”

SELECT cab_type, count(*)
FROM trips_mergetree_x3
GROUP BY cab_type;

αžαžΆαž„αž€αŸ’αžšαŸ„αž˜αž“αŸαŸ‡αž”αžΆαž“αž”αž‰αŸ’αž…αž”αŸ‹αž€αŸ’αž“αž»αž„αžšαž™αŸˆαž–αŸαž› 0.691 αžœαž·αž“αžΆαž‘αžΈαŸ”

SELECT passenger_count,
       avg(total_amount)
FROM trips_mergetree_x3
GROUP BY passenger_count;

αžαžΆαž„αž€αŸ’αžšαŸ„αž˜αž“αŸαŸ‡αž”αžΆαž“αž”αž‰αŸ’αž…αž”αŸ‹αž€αŸ’αž“αž»αž„αžšαž™αŸˆαž–αŸαž› 0 αžœαž·αž“αžΆαž‘αžΈαŸ”

SELECT passenger_count,
       toYear(pickup_date) AS year,
       count(*)
FROM trips_mergetree_x3
GROUP BY passenger_count,
         year;

αžαžΆαž„αž€αŸ’αžšαŸ„αž˜αž“αŸαŸ‡αž”αžΆαž“αž”αž‰αŸ’αž…αž”αŸ‹αž€αŸ’αž“αž»αž„αžšαž™αŸˆαž–αŸαž› 0.983 αžœαž·αž“αžΆαž‘αžΈαŸ”

SELECT passenger_count,
       toYear(pickup_date) AS year,
       round(trip_distance) AS distance,
       count(*)
FROM trips_mergetree_x3
GROUP BY passenger_count,
         year,
         distance
ORDER BY year,
         count(*) DESC;

αžŸαž˜αŸ’αžšαžΆαž”αŸ‹αž€αžΆαžšαž”αŸ’αžšαŸ€αž”αž’αŸ€αž” αžαŸ’αž‰αž»αŸ†αž”αžΆαž“αžŠαŸ†αžŽαžΎαžšαž€αžΆαžšαžŸαŸ†αžŽαž½αžšαžŠαžΌαž…αž‚αŸ’αž“αžΆαž“αŸ…αž›αžΎαžαžΆαžšαžΆαž„αžŠαŸ‚αž›αž˜αžΆαž“αž˜αžΌαž›αžŠαŸ’αž‹αžΆαž“αž›αžΎ MergeTree αžŠαŸ‚αž›αžšαžŸαŸ‹αž“αŸ…αžαŸ‚αž›αžΎαž˜αŸ‰αžΆαžŸαŸŠαžΈαž“αž˜αŸαžŠαŸ†αž”αžΌαž„αž”αŸ‰αž»αžŽαŸ’αžŽαŸ„αŸ‡αŸ”

αž€αžΆαžšαžœαžΆαž™αžαž˜αŸ’αž›αŸƒαž€αžΆαžšαž’αž“αž»αžœαžαŸ’αžαž“αŸƒαžαŸ’αž“αžΆαŸ†αž„ ClickHouse αž˜αž½αž™αŸ”

αž’αŸ’αžœαžΈβ€‹αžŠαŸ‚αž›β€‹αžαŸ’αž‰αž»αŸ†β€‹αž”αžΆαž“β€‹αžƒαžΎαž‰β€‹αž”αž“αŸ’αž‘αžΆαž”αŸ‹β€‹αž‚αžΊβ€‹αž‡αžΆβ€‹αž–αŸαž›β€‹αžœαŸαž›αžΆβ€‹αž›αžΏαž“β€‹αž”αŸ†αž•αž»αžβ€‹αžŠαŸ‚αž›β€‹αžαŸ’αž‰αž»αŸ†β€‹αž”αžΆαž“β€‹αžƒαžΎαž‰β€‹αžšαžαŸ‹β€‹αžŸαŸ†αžŽαž½αžšβ€‹αž“αžΈαž˜αž½αž™αŸ—β€‹αž›αžΎβ€‹αžαžΆαžšαžΆαž„β€‹αž…αŸ’αžšαžΎαž“β€‹αžŠαž„ trips_mergetree_x3.

αžαžΆαž„αž€αŸ’αžšαŸ„αž˜αž“αŸαŸ‡αž”αžΆαž“αž”αž‰αŸ’αž…αž”αŸ‹αž€αŸ’αž“αž»αž„αžšαž™αŸˆαž–αŸαž› 0.241 αžœαž·αž“αžΆαž‘αžΈαŸ”

SELECT cab_type, count(*)
FROM trips_mergetree
GROUP BY cab_type;

αžαžΆαž„αž€αŸ’αžšαŸ„αž˜αž“αŸαŸ‡αž”αžΆαž“αž”αž‰αŸ’αž…αž”αŸ‹αž€αŸ’αž“αž»αž„αžšαž™αŸˆαž–αŸαž› 0.826 αžœαž·αž“αžΆαž‘αžΈαŸ”

SELECT passenger_count,
       avg(total_amount)
FROM trips_mergetree
GROUP BY passenger_count;

αžαžΆαž„αž€αŸ’αžšαŸ„αž˜αž“αŸαŸ‡αž”αžΆαž“αž”αž‰αŸ’αž…αž”αŸ‹αž€αŸ’αž“αž»αž„αžšαž™αŸˆαž–αŸαž› 1.209 αžœαž·αž“αžΆαž‘αžΈαŸ”

SELECT passenger_count,
       toYear(pickup_date) AS year,
       count(*)
FROM trips_mergetree
GROUP BY passenger_count,
         year;

αžαžΆαž„αž€αŸ’αžšαŸ„αž˜αž“αŸαŸ‡αž”αžΆαž“αž”αž‰αŸ’αž…αž”αŸ‹αž€αŸ’αž“αž»αž„αžšαž™αŸˆαž–αŸαž› 1.781 αžœαž·αž“αžΆαž‘αžΈαŸ”

SELECT passenger_count,
       toYear(pickup_date) AS year,
       round(trip_distance) AS distance,
       count(*)
FROM trips_mergetree
GROUP BY passenger_count,
         year,
         distance
ORDER BY year,
         count(*) DESC;

αž€αžΆαžšαž†αŸ’αž›αž»αŸ‡αž”αž‰αŸ’αž…αžΆαŸ†αž„αž›αžΎαž›αž‘αŸ’αž’αž•αž›

αž“αŸαŸ‡αž‡αžΆαž›αžΎαž€αžŠαŸ†αž”αžΌαž„αžŠαŸ‚αž›αž˜αžΌαž›αžŠαŸ’αž‹αžΆαž“αž‘αž·αž“αŸ’αž“αž“αŸαž™αž•αŸ’αž’αŸ‚αž€αž›αžΎαžŸαŸŠαžΈαž—αžΈαž™αžΌαž₯αžαž‚αž·αžαžαŸ’αž›αŸƒαž’αžΆαž…αžŠαŸ†αžŽαžΎαžšαž€αžΆαžšαž›αžΎαžŸαž–αžΈαž˜αžΌαž›αžŠαŸ’αž‹αžΆαž“αž‘αž·αž“αŸ’αž“αž“αŸαž™αžŠαŸ‚αž›αž˜αžΆαž“αž˜αžΌαž›αžŠαŸ’αž‹αžΆαž“αž›αžΎ GPU αž“αŸ…αž€αŸ’αž“αž»αž„αž€αžΆαžšαž’αŸ’αžœαžΎαžαŸαžŸαŸ’αžαžšαž”αžŸαŸ‹αžαŸ’αž‰αž»αŸ†αŸ” αž˜αžΌαž›αžŠαŸ’αž‹αžΆαž“αž‘αž·αž“αŸ’αž“αž“αŸαž™αžŠαŸ‚αž›αž˜αžΆαž“αž˜αžΌαž›αžŠαŸ’αž‹αžΆαž“αž›αžΎ GPU αž“αŸ„αŸ‡αž”αžΆαž“αž†αŸ’αž›αž„αž€αžΆαžαŸ‹αž€αžΆαžšαž€αŸ‚αž”αŸ’αžšαŸ‚αž…αŸ†αž“αž½αž“αž–αžΈαžšαž…αžΆαž”αŸ‹αžαžΆαŸ†αž„αž–αžΈαž–αŸαž›αž“αŸ„αŸ‡αž˜αž€ αž”αŸ‰αž»αž“αŸ’αžαŸ‚αžŠαŸ†αžŽαžΎαžšαž€αžΆαžšαžŠαŸ‚αž› ClickHouse αž•αŸ’αžαž›αŸ‹αž‡αžΌαž“αž“αŸ…αž›αžΎαžαŸ’αž“αžΆαŸ†αž„αžαŸ‚αž˜αž½αž™αž‚αžΊαž‚αž½αžšαž±αŸ’αž™αž…αžΆαž”αŸ‹αž’αžΆαžšαž˜αŸ’αž˜αžŽαŸαžαŸ’αž›αžΆαŸ†αž„αžŽαžΆαžŸαŸ‹αŸ”

αž“αŸ…αž–αŸαž›αžŠαŸ†αžŽαžΆαž›αž‚αŸ’αž“αžΆαž“αŸ„αŸ‡ αž“αŸ…αž–αŸαž›αž”αŸ’αžšαžαž·αž”αžαŸ’αžαž·αžŸαŸ†αžŽαž½αžšαž‘αžΈ 1 αž›αžΎαž˜αŸ‰αžΆαžŸαŸŠαžΈαž“αž…αŸ‚αž€αž…αžΆαž™ αž€αžΆαžšαž…αŸ†αžŽαžΆαž™αž›αžΎαžŸαž‚αžΊαž‡αžΆαž›αŸ†αžŠαžΆαž”αŸ‹αž“αŸƒαžšαŸ‰αž·αž…αž‘αŸαžšαžαŸ’αž–αžŸαŸ‹αž‡αžΆαž„αŸ” αžαŸ’αž‰αž»αŸ†αžŸαž„αŸ’αžƒαžΉαž˜αžαžΆαžαŸ’αž‰αž»αŸ†αžαž€αžαžΆαž“αž’αŸ’αžœαžΈαž˜αž½αž™αž“αŸ…αž€αŸ’αž“αž»αž„αž€αžΆαžšαžŸαŸ’αžšαžΆαžœαž‡αŸ’αžšαžΆαžœαžšαž”αžŸαŸ‹αžαŸ’αž‰αž»αŸ†αžŸαž˜αŸ’αžšαžΆαž”αŸ‹αž€αžΆαžšαž”αž„αŸ’αž αŸ„αŸ‡αž“αŸαŸ‡ αž–αŸ’αžšαŸ„αŸ‡αžœαžΆαž›αŸ’αž’αžŽαžΆαžŸαŸ‹αž€αŸ’αž“αž»αž„αž€αžΆαžšαžƒαžΎαž‰αž–αŸαž›αžœαŸαž›αžΆαžŸαŸ†αžŽαž½αžšαž’αŸ’αž›αžΆαž€αŸ‹αž…αž»αŸ‡ αž“αŸ…αž–αŸαž›αžŠαŸ‚αž›αžαŸ’αž‰αž»αŸ†αž”αž“αŸ’αžαŸ‚αž˜αžαŸ’αž“αžΆαŸ†αž„αž”αž“αŸ’αžαŸ‚αž˜αž‘αŸ€αžαž‘αŸ…αž€αŸ’αž“αž»αž„αž…αž„αŸ’αž€αŸ„αž˜αŸ” αž‘αŸ„αŸ‡αž™αŸ‰αžΆαž„αžŽαžΆαž€αŸαžŠαŸ„αž™ αžœαžΆαž‡αžΆαž€αžΆαžšαž”αŸ’αžšαžŸαžΎαžšαžŽαžΆαžŸαŸ‹αžŠαŸ‚αž›αž“αŸ…αž–αŸαž›αž’αž“αž»αžœαžαŸ’αžαžŸαŸ†αžŽαž½αžšαž•αŸ’αžŸαŸαž„αž‘αŸ€αž αž€αžΆαžšαž’αž“αž»αžœαžαŸ’αžαž€αžΎαž“αž‘αžΎαž„αž”αŸ’αžšαž αŸ‚αž› 2 αžŠαž„αŸ”

αžœαžΆαž–αž·αžαž‡αžΆαž›αŸ’αž’αžŽαžΆαžŸαŸ‹αžŠαŸ‚αž›αžƒαžΎαž‰ ClickHouse αžœαž·αžœαžαŸ’αžαž“αŸαž†αŸ’αž–αŸ„αŸ‡αž‘αŸ…αžšαž€αž€αžΆαžšαž”αŸ†αž”αŸ‚αž€αž€αžΆαžšαž•αŸ’αž‘αž»αž€ αž“αž·αž„αž€αžΆαžšαž‚αžŽαž“αžΆ αžŠαžΌαž…αŸ’αž“αŸαŸ‡αž–αž½αž€αž‚αŸαž’αžΆαž…αž’αŸ’αžœαžΎαž˜αžΆαžαŸ’αžšαžŠαŸ’αž‹αžΆαž“αžŠαŸ„αž™αž―αž€αžšαžΆαž‡αŸ’αž™αŸ” αž€αžΆαžšαž‚αžΆαŸ†αž‘αŸ’αžš HDFS αžŠαŸ‚αž›αžαŸ’αžšαžΌαžœαž”αžΆαž“αž”αž“αŸ’αžαŸ‚αž˜αž€αžΆαž›αž–αžΈαž†αŸ’αž“αžΆαŸ†αž˜αž»αž“αž’αžΆαž…αž‡αžΆαž‡αŸ†αž αžΆαž“αž†αŸ’αž–αŸ„αŸ‡αž‘αŸ…αžšαž€αž”αž‰αŸ’αž αžΆαž“αŸαŸ‡αŸ” αž“αŸ…αž€αŸ’αž“αž»αž„αž›αž€αŸ’αžαžαžŽαŸ’αžŒαž“αŸƒαž€αžΆαžšαž‚αžŽαž“αžΆ αž”αŸ’αžšαžŸαž·αž“αž”αžΎαžŸαŸ†αžŽαž½αžšαžαŸ‚αž˜αž½αž™αž’αžΆαž…αžαŸ’αžšαžΌαžœαž”αžΆαž“αž–αž“αŸ’αž›αžΏαž“αžŠαŸ„αž™αž€αžΆαžšαž”αž“αŸ’αžαŸ‚αž˜αžαŸ’αž“αžΆαŸ†αž„αž”αž“αŸ’αžαŸ‚αž˜αž‘αŸ€αžαž‘αŸ…αž€αŸ’αž“αž»αž„αž…αž„αŸ’αž€αŸ„αž˜αž“αŸ„αŸ‡ αž’αž“αžΆαž‚αžαž“αŸƒαž€αž˜αŸ’αž˜αžœαž·αž’αžΈαž“αŸαŸ‡αž‚αžΊαž—αŸ’αž›αžΊαžαŸ’αž›αžΆαŸ†αž„αŸ”

αž’αžšαž‚αž»αžŽαžŸαž˜αŸ’αžšαžΆαž”αŸ‹αž€αžΆαžšαž…αŸ†αžŽαžΆαž™αž–αŸαž›αž’αžΆαž“αž’αžαŸ’αžαž”αž‘αž“αŸαŸ‡αŸ” αžαŸ’αž‰αž»αŸ†αž•αŸ’αžαž›αŸ‹αž‡αžΌαž“αž€αžΆαžšαž”αŸ’αžšαžΉαž€αŸ’αžŸαžΆ αžŸαŸ’αžαžΆαž”αžαŸ’αž™αž€αž˜αŸ’αž˜ αž“αž·αž„αžŸαŸαžœαžΆαž€αž˜αŸ’αž˜αž’αž—αž·αžœαžŒαŸ’αžαž“αŸαž€αžΆαžšαž’αž“αž»αžœαžαŸ’αžαžŠαž›αŸ‹αž’αžαž·αžαž·αž‡αž“αž“αŸ…αž’αžΆαž˜αŸαžšαž·αž€αžαžΆαž„αž‡αžΎαž„ αž“αž·αž„αž’αžΊαžšαŸ‰αž»αž”αŸ” αž”αŸ’αžšαžŸαž·αž“αž”αžΎαž’αŸ’αž“αž€αž…αž„αŸ‹αž–αž·αž—αžΆαž€αŸ’αžŸαžΆαž–αžΈαžšαž”αŸ€αž”αžŠαŸ‚αž›αž€αžΆαžšαž•αŸ’αžαž›αŸ‹αž™αŸ„αž”αž›αŸ‹αžšαž”αžŸαŸ‹αžαŸ’αž‰αž»αŸ†αž’αžΆαž…αž‡αž½αž™αž’αžΆαž‡αžΈαžœαž€αž˜αŸ’αž˜αžšαž”αžŸαŸ‹αž’αŸ’αž“αž€ αžŸαžΌαž˜αž‘αžΆαž€αŸ‹αž‘αž„αž˜αž€αžαŸ’αž‰αž»αŸ†αžαžΆαž˜αžšαž™αŸˆ LinkedIn.

αž”αŸ’αžšαž—αž–: www.habr.com

αž”αž“αŸ’αžαŸ‚αž˜αž˜αžαž·αž™αŸ„αž”αž›αŸ‹