In this study, I wanted to see what performance improvements can be achieved by using a ClickHouse data source instead of PostgreSQL. I already know the performance benefits I get from using ClickHouse. Do those benefits persist if I access ClickHouse from within PostgreSQL through a Foreign Data Wrapper (FDW)?
The database environments under study are PostgreSQL v11, clickhousedb_fdw and a ClickHouse database. From within PostgreSQL v11 we will run various SQL queries routed through clickhousedb_fdw to the ClickHouse database. We will then see how the FDW's performance compares with the same queries run in native PostgreSQL and native ClickHouse.
ClickHouse Database
ClickHouse is an open source columnar database management system that can deliver performance 100 to 1000 times faster than traditional database approaches, and is capable of processing more than a billion rows in less than a second.
Clickhousedb_fdw
clickhousedb_fdw, a foreign data wrapper (FDW) for the ClickHouse database, is an open source project from Percona.
As you will see, it provides an FDW for ClickHouse that lets you SELECT from, and INSERT INTO, a ClickHouse database from within a PostgreSQL v11 server.
The FDW supports advanced features such as aggregate pushdown and join pushdown. These improve performance by using the remote server's resources for such resource-intensive operations.
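To make the pushdown idea concrete, here is a minimal Python sketch. It uses an in-memory SQLite database as a stand-in for the remote server. This is an assumption made purely so the example is self-contained: it is not ClickHouse and it does not use clickhousedb_fdw, and the helper names are hypothetical. It contrasts shipping every row to the client and aggregating locally with sending the GROUP BY to the remote side and fetching only the grouped result.

```python
import sqlite3
from collections import Counter

# Stand-in "remote" database (SQLite, NOT ClickHouse) with a tiny ontime table.
remote = sqlite3.connect(":memory:")
remote.execute("CREATE TABLE ontime (Year INTEGER, DepDelay INTEGER)")
rows = [(2000 + i % 3, (i * 7) % 30) for i in range(3000)]
remote.executemany("INSERT INTO ontime VALUES (?, ?)", rows)


def count_without_pushdown(conn):
    """Fetch every row, then aggregate on the local side."""
    fetched = conn.execute("SELECT Year FROM ontime").fetchall()
    counts = Counter(year for (year,) in fetched)
    return dict(counts), len(fetched)


def count_with_pushdown(conn):
    """Let the remote side aggregate; only one row per group is transferred."""
    fetched = conn.execute(
        "SELECT Year, count(*) FROM ontime GROUP BY Year"
    ).fetchall()
    return dict(fetched), len(fetched)


local_result, local_rows = count_without_pushdown(remote)
pushed_result, pushed_rows = count_with_pushdown(remote)

print("rows transferred without pushdown:", local_rows)
print("rows transferred with pushdown:", pushed_rows)
print("same answer:", local_result == pushed_result)
```

Both paths produce the same counts, but the pushed-down variant moves one row per group instead of one row per flight, which is the effect the FDW relies on.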
Benchmark environment
- Supermicro server:
- Intel® Xeon® CPU E5-2683 v3 @ 2.00GHz
- 2 sockets / 28 cores / 56 threads
- Memory: 256GB of RAM
- Storage: Samsung SM863 1.9TB Enterprise SSD
- Filesystem: ext4/xfs
- OS: Linux smblade01 4.15.0-42-generic #45~16.04.1-Ubuntu
- PostgreSQL: version 11
Benchmark tests
Instead of using a machine-generated dataset for this exercise, we used the "On-Time Reporting Carrier On-Time Performance" data from 1987 to 2018. The data is publicly available.
The database size is 85 GB, consisting of a single table with 109 columns.
Benchmark Queries
Here are the queries I used to compare ClickHouse, clickhousedb_fdw and PostgreSQL.
Queries containing aggregates and GROUP BY
Q1: SELECT DayOfWeek, count(*) AS c FROM ontime WHERE Year >= 2000 AND Year <= 2008 GROUP BY DayOfWeek ORDER BY c DESC;
Q2: SELECT DayOfWeek, count(*) AS c FROM ontime WHERE DepDelay > 10 AND Year >= 2000 AND Year <= 2008 GROUP BY DayOfWeek ORDER BY c DESC;
Q3: SELECT Origin, count(*) AS c FROM ontime WHERE DepDelay > 10 AND Year >= 2000 AND Year <= 2008 GROUP BY Origin ORDER BY c DESC LIMIT 10;
Q4: SELECT Carrier, count(*) FROM ontime WHERE DepDelay > 10 AND Year = 2007 GROUP BY Carrier ORDER BY count(*) DESC;
Q5: SELECT a.Carrier, c, c2, c*1000/c2 AS c3 FROM (SELECT Carrier, count(*) AS c FROM ontime WHERE DepDelay > 10 AND Year = 2007 GROUP BY Carrier) a INNER JOIN (SELECT Carrier, count(*) AS c2 FROM ontime WHERE Year = 2007 GROUP BY Carrier) b ON a.Carrier = b.Carrier ORDER BY c3 DESC;
Q6: SELECT a.Carrier, c, c2, c*1000/c2 AS c3 FROM (SELECT Carrier, count(*) AS c FROM ontime WHERE DepDelay > 10 AND Year >= 2000 AND Year <= 2008 GROUP BY Carrier) a INNER JOIN (SELECT Carrier, count(*) AS c2 FROM ontime WHERE Year >= 2000 AND Year <= 2008 GROUP BY Carrier) b ON a.Carrier = b.Carrier ORDER BY c3 DESC;
Q7: SELECT Carrier, avg(DepDelay) * 1000 AS c3 FROM ontime WHERE Year >= 2000 AND Year <= 2008 GROUP BY Carrier;
Q8: SELECT Year, avg(DepDelay) FROM ontime GROUP BY Year;
Q9: select Year, count(*) as c1 from ontime group by Year;
Q10: SELECT avg(cnt) FROM (SELECT Year, Month, count(*) AS cnt FROM ontime WHERE DepDel15 = 1 GROUP BY Year, Month) a;
Q11: select avg(c1) from (select Year, Month, count(*) as c1 from ontime group by Year, Month) a;
Q12: SELECT OriginCityName, DestCityName, count(*) AS c FROM ontime GROUP BY OriginCityName, DestCityName ORDER BY c DESC LIMIT 10;
Q13: SELECT OriginCityName, count(*) AS c FROM ontime GROUP BY OriginCityName ORDER BY c DESC LIMIT 10;
Queries containing joins
Q14: SELECT a.Year, c1/c2 FROM (select Year, count(*)*1000 as c1 from ontime WHERE DepDelay > 10 GROUP BY Year) a INNER JOIN (select Year, count(*) as c2 from ontime GROUP BY Year) b ON a.Year = b.Year ORDER BY a.Year;
Q15: SELECT a."Year", c1/c2 FROM (select "Year", count(*)*1000 as c1 FROM fontime WHERE "DepDelay" > 10 GROUP BY "Year") a INNER JOIN (select "Year", count(*) as c2 FROM fontime GROUP BY "Year") b ON a."Year" = b."Year";
Table-1: Queries used in the benchmark
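One way to collect per-query timings in milliseconds is a small harness that runs each statement several times and records wall-clock time. The sketch below is an assumption about methodology, not the author's actual tooling: `time_query_ms` is a hypothetical helper, and it runs against an in-memory SQLite table only so that it stays self-contained. With a PostgreSQL or ClickHouse driver, the same function could wrap that driver's cursor instead.

```python
import sqlite3
import statistics
import time


def time_query_ms(conn, sql, repeats=3):
    """Run `sql` `repeats` times and return the median wall-clock time in ms."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        conn.execute(sql).fetchall()  # fetch everything so the query fully runs
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Tiny stand-in dataset (SQLite, not the 85 GB ontime table).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ontime (Year INTEGER, DayOfWeek INTEGER)")
conn.executemany(
    "INSERT INTO ontime VALUES (?, ?)",
    [(1987 + i % 32, 1 + i % 7) for i in range(10000)],
)

q1 = (
    "SELECT DayOfWeek, count(*) AS c FROM ontime "
    "WHERE Year >= 2000 AND Year <= 2008 "
    "GROUP BY DayOfWeek ORDER BY c DESC"
)
elapsed = time_query_ms(conn, q1)
print(f"Q1 median: {elapsed:.3f} ms")
```

Taking the median of a few runs is one way to dampen cache warm-up effects; other summaries (minimum, mean) would be equally defensible.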
Query executions
Here are the results of each query when run in the different setups: PostgreSQL with and without indexes, native ClickHouse, and clickhousedb_fdw. Times are shown in milliseconds.
Q#    PostgreSQL   PostgreSQL (Indexed)   ClickHouse   clickhousedb_fdw
Q1    27920        19634                  23           57
Q2    35124        17301                  50           80
Q3    34046        15618                  67           115
Q4    31632        7667                   25           37
Q5    47220        8976                   27           60
Q6    58233        24368                  55           153
Q7    30566        13256                  52           91
Q8    38309        60511                  112          179
Q9    20674        37979                  31           81
Q10   34990        20102                  56           148
Q11   30489        51658                  37           155
Q12   39357        33742                  186          1333
Q13   29912        30709                  101          384
Q14   54126        39913                  124          1364212
Q15   97258        30211                  245          259
Table-2: Time taken (in milliseconds) to execute the queries used in the benchmark
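The overhead of the FDW relative to native ClickHouse can be read off the results table by dividing the two columns. The short Python sketch below does that arithmetic for a few rows copied from the table; the dictionary layout and the helper names `fdw_overhead` and `postgres_vs_fdw` are illustrative assumptions.

```python
# (PostgreSQL, PostgreSQL indexed, ClickHouse, clickhousedb_fdw) in ms,
# copied from the results table for a subset of queries.
timings_ms = {
    "Q1": (27920, 19634, 23, 57),
    "Q4": (31632, 7667, 25, 37),
    "Q12": (39357, 33742, 186, 1333),
    "Q14": (54126, 39913, 124, 1364212),
    "Q15": (97258, 30211, 245, 259),
}


def fdw_overhead(row):
    """Return how many times slower the FDW run is than native ClickHouse."""
    _, _, clickhouse, fdw = row
    return fdw / clickhouse


def postgres_vs_fdw(row):
    """Return how many times faster the FDW run is than plain PostgreSQL."""
    postgres, _, _, fdw = row
    return postgres / fdw


for name, row in timings_ms.items():
    print(f"{name}: FDW/ClickHouse = {fdw_overhead(row):.1f}x, "
          f"PostgreSQL/FDW = {postgres_vs_fdw(row):.1f}x")
```

With these numbers, Q15 stays within about 6% of native ClickHouse, while Q14 is roughly four orders of magnitude slower through the FDW.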
Reviewing the results
The graph shows query execution time in milliseconds: the X axis shows the query number from the tables above, and the Y axis shows the execution time in milliseconds. Results are shown for ClickHouse and for data retrieved from PostgreSQL through clickhousedb_fdw. From the table you can see that there is a huge difference between PostgreSQL and ClickHouse, but only a minimal difference between ClickHouse and clickhousedb_fdw.
This graph shows the difference between ClickHouse and clickhousedb_fdw. In most queries the FDW overhead is not high and is barely significant, except for Q12. This query combines GROUP BY with an ORDER BY clause. Because of the ORDER BY clause, the GROUP BY and ORDER BY are not pushed down to ClickHouse.
In Table-2 we can see a jump in time for queries Q12 and Q13. Again, this is caused by the ORDER BY clause. To confirm this, I ran queries Q14 and Q15, which are the same join with and without the ORDER BY clause. Without the ORDER BY clause the completion time is 259 ms, and with the ORDER BY clause it is 1364212 ms. To debug this, I ran EXPLAIN on both queries; the results follow.
Q15: Query Without ORDER BY Clause
bm=# EXPLAIN VERBOSE SELECT a."Year", c1/c2
FROM (SELECT "Year", count(*)*1000 AS c1 FROM fontime WHERE "DepDelay" > 10 GROUP BY "Year") a
INNER JOIN(SELECT "Year", count(*) AS c2 FROM fontime GROUP BY "Year") b ON a."Year"=b."Year";
Q15: Query Plan Without ORDER BY Clause
QUERY PLAN
Hash Join (cost=2250.00..128516.06 rows=50000000 width=12)
  Output: fontime."Year", (((count(*) * 1000)) / b.c2)
  Inner Unique: true
  Hash Cond: (fontime."Year" = b."Year")
  -> Foreign Scan (cost=1.00..-1.00 rows=100000 width=12)
       Output: fontime."Year", ((count(*) * 1000))
       Relations: Aggregate on (fontime)
       Remote SQL: SELECT "Year", (count(*) * 1000) FROM "default".ontime WHERE (("DepDelay" > 10)) GROUP BY "Year"
  -> Hash (cost=999.00..999.00 rows=100000 width=12)
       Output: b.c2, b."Year"
       -> Subquery Scan on b (cost=1.00..999.00 rows=100000 width=12)
            Output: b.c2, b."Year"
            -> Foreign Scan (cost=1.00..-1.00 rows=100000 width=12)
                 Output: fontime_1."Year", (count(*))
                 Relations: Aggregate on (fontime)
                 Remote SQL: SELECT "Year", count(*) FROM "default".ontime GROUP BY "Year"
(16 rows)
Q14: Query With ORDER BY Clause
bm=# EXPLAIN VERBOSE SELECT a."Year", c1/c2 FROM(SELECT "Year", count(*)*1000 AS c1 FROM fontime WHERE "DepDelay" > 10 GROUP BY "Year") a
INNER JOIN(SELECT "Year", count(*) as c2 FROM fontime GROUP BY "Year") b ON a."Year"= b."Year"
ORDER BY a."Year";
Q14: Query Plan With ORDER BY Clause
QUERY PLAN
Merge Join (cost=2.00..628498.02 rows=50000000 width=12)
  Output: fontime."Year", (((count(*) * 1000)) / (count(*)))
  Inner Unique: true
  Merge Cond: (fontime."Year" = fontime_1."Year")
  -> GroupAggregate (cost=1.00..499.01 rows=1 width=12)
       Output: fontime."Year", (count(*) * 1000)
       Group Key: fontime."Year"
       -> Foreign Scan on public.fontime (cost=1.00..-1.00 rows=100000 width=4)
            Remote SQL: SELECT "Year" FROM "default".ontime WHERE (("DepDelay" > 10)) ORDER BY "Year" ASC
  -> GroupAggregate (cost=1.00..499.01 rows=1 width=12)
       Output: fontime_1."Year", count(*)
       Group Key: fontime_1."Year"
       -> Foreign Scan on public.fontime fontime_1 (cost=1.00..-1.00 rows=100000 width=4)
            Remote SQL: SELECT "Year" FROM "default".ontime ORDER BY "Year" ASC
(16 rows)
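The difference between the two plans can be checked mechanically. The following Python sketch scans EXPLAIN VERBOSE text for the `Remote SQL:` lines and reports whether every remote statement contains a GROUP BY, which is how the plans above reveal whether aggregation ran on ClickHouse. The function names and the abbreviated plan strings are illustrative assumptions; only the Foreign Scan and Remote SQL lines are taken from the plans shown.

```python
import re

PLAN_WITHOUT_ORDER_BY = '''
Foreign Scan (cost=1.00..-1.00 rows=100000 width=12)
  Relations: Aggregate on (fontime)
  Remote SQL: SELECT "Year", (count(*) * 1000) FROM "default".ontime WHERE (("DepDelay" > 10)) GROUP BY "Year"
Foreign Scan (cost=1.00..-1.00 rows=100000 width=12)
  Relations: Aggregate on (fontime)
  Remote SQL: SELECT "Year", count(*) FROM "default".ontime GROUP BY "Year"
'''

PLAN_WITH_ORDER_BY = '''
Foreign Scan on public.fontime (cost=1.00..-1.00 rows=100000 width=4)
  Remote SQL: SELECT "Year" FROM "default".ontime WHERE (("DepDelay" > 10)) ORDER BY "Year" ASC
Foreign Scan on public.fontime fontime_1 (cost=1.00..-1.00 rows=100000 width=4)
  Remote SQL: SELECT "Year" FROM "default".ontime ORDER BY "Year" ASC
'''


def remote_sql_statements(plan_text):
    """Extract the SQL text from every 'Remote SQL:' line of a plan."""
    return re.findall(r"Remote SQL:\s*(.+)", plan_text)


def aggregates_pushed_down(plan_text):
    """True if every remote statement in the plan contains a GROUP BY."""
    statements = remote_sql_statements(plan_text)
    return bool(statements) and all(
        re.search(r"\bGROUP BY\b", sql, re.IGNORECASE) for sql in statements
    )


print("without ORDER BY:", aggregates_pushed_down(PLAN_WITHOUT_ORDER_BY))
print("with ORDER BY:   ", aggregates_pushed_down(PLAN_WITH_ORDER_BY))
```

In the ORDER BY case the remote statements return one row per flight, so PostgreSQL has to aggregate the whole table locally, which is consistent with the much longer run time reported for Q14.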
Conclusion
The results of these experiments show that ClickHouse delivers really good performance, and that clickhousedb_fdw brings the performance benefits of ClickHouse to PostgreSQL. While there is some overhead when using clickhousedb_fdw, it is negligible and comparable to the performance achieved by running natively in the ClickHouse database. This also confirms that FDW pushdown in PostgreSQL delivers excellent results.
Source: www.hab.com