Idanwo iṣẹ ṣiṣe ti awọn ibeere itupalẹ ni PostgreSQL, ClickHouse ati clickhousedb_fdw (PostgreSQL)

Ninu iwadi yii, Mo fẹ lati rii kini awọn ilọsiwaju iṣẹ le ṣee ṣe nipasẹ lilo orisun data ClickHouse dipo PostgreSQL. Mo mọ awọn anfani iṣelọpọ ti Mo gba lati lilo ClickHouse. Njẹ awọn anfani wọnyi yoo tẹsiwaju ti MO ba wọle si ClickHouse lati PostgreSQL nipa lilo Wrapper Data Ajeji (FDW)?

Awọn agbegbe data ti iwadi jẹ PostgreSQL v11, clickhousedb_fdw ati ClickHouse database. Nikẹhin, lati PostgreSQL v11 a yoo ṣiṣẹ ọpọlọpọ awọn ibeere SQL ti a da nipasẹ clickhousedb_fdw wa si aaye data ClickHouse. A yoo rii bi iṣẹ FDW ṣe ṣe afiwe si awọn ibeere kanna ti nṣiṣẹ ni abinibi PostgreSQL ati abinibi ClickHouse.

Clickhouse aaye data

ClickHouse jẹ eto iṣakoso aaye data columnar orisun ṣiṣi ti o le ṣaṣeyọri iṣẹ ṣiṣe ni awọn akoko 100-1000 yiyara ju awọn isunmọ ibi-ipamọ data ibile, ti o lagbara lati ṣiṣẹ lori awọn ori ila bilionu kan ni o kere ju iṣẹju-aaya kan.

Clickhousedb_fdw

clickhousedb_fdw - Ipilẹ data ita fun aaye data ClickHouse, tabi FDW, jẹ iṣẹ akanṣe orisun ṣiṣi lati Percona. Eyi ni ọna asopọ si ibi ipamọ GitHub ti iṣẹ akanṣe naa.

Ni Oṣu Kẹta Mo kọ bulọọgi kan ti o sọ fun ọ diẹ sii nipa FDW wa.

Bi iwọ yoo ti rii, eyi n pese FDW kan fun ClickHouse ti o fun laaye Yan lati, ati FI SINU, ibi data ClickHouse lati olupin PostgreSQL v11.

FDW ṣe atilẹyin awọn ẹya to ti ni ilọsiwaju gẹgẹbi apapọ ati darapọ. Eyi ṣe ilọsiwaju iṣẹ ni pataki nipa lilo awọn orisun ti olupin latọna jijin fun awọn iṣẹ ṣiṣe to lekoko wọnyi.

Ayika tunbo

  • olupin Supermicro:
    • Intel® Xeon® Sipiyu E5-2683 v3 @ 2.00GHz
    • 2 sockets / 28 ohun kohun / 56 o tẹle
    • Iranti: 256GB ti Ramu
    • Ibi ipamọ: Samsung SM863 1.9TB Idawọlẹ SSD
    • Eto faili: ext4/xfs
  • OS: Linux smblade01 4.15.0-42-jeneriki #45~16.04.1-Ubuntu
  • PostgreSQL: ẹya 11

Awọn idanwo ala

Dipo lilo diẹ ninu awọn eto data ti a ṣe ipilẹṣẹ ẹrọ fun idanwo yii, a lo data “Iṣẹṣẹ nipasẹ Akoko Oniṣiṣẹ Ijabọ Akoko” data lati 1987 si 2018. O le wọle si data naa lilo iwe afọwọkọ wa ti o wa nibi.

Iwọn data data jẹ 85 GB, pese tabili kan ti awọn ọwọn 109.

Awọn ibeere ibujoko

Eyi ni awọn ibeere ti Mo lo lati ṣe afiwe ClickHouse, clickhousedb_fdw ati PostgreSQL.

Q#
Ibeere Ni awọn akojọpọ ati Ẹgbẹ Nipasẹ

Q1
Yan DayOfWeek, ka(*) AS c LATI akoko nibo Odun>= 2000 ATI Odun <= 2008 Group BY DayOfWeek ORDER BY c DESC;

Q2
Yan DayOfWeek, ka(*) AS c LATI ontime Nibo DepDelay>10 ATI Odun>= 2000 ATI Odun <= 2008 Group BY DayOfWeek ASEJE BY c DESC;

Q3
Yan Oti, ka (*) AS c LATI ontime Nibo DepDelay>10 ATI Odun>= 2000 ATI Odun <= 2008 Group BY Oti Bere fun BY c DESC LIMIT 10;

Q4
Yan Olugbeja, ka(LATI akoko ni ibi DepDelay>10 AND Odun = 2007 Group BY Carrier ORDER BY count() DESC;

Q5
YAN a.Onigberu, c, c2, c1000/c2 bi c3 LATI (Yan Olutọju, kika() AS c LATI ontime WHERE DepDelay>10 AND Year=2007 Group BY Carrier ) a INTER JOIN (Yan Carrier,count(*) AS c2 FROM intime WHERE Year=2007 GROUP BY Carrier) b on a.Carrier=b.Aṣẹ Gbigbe BY c3 DESC;

Q6
YAN a.Onigberu, c, c2, c1000/c2 bi c3 LATI (Yan Olutọju, kika() AS c LATI akoko nibo DepDelay>10 AND Odun>= 2000 AND Odun <= 2008 Group BY Carrier) a INTER JOIN (Yan Carrier, count(*) AS c2 LATI akoko nibo Odun>= 2000 AND Odun <= 2008 Group BY Olugbeja ) b lori a.Oluwa = b.OLUGBE ASE NIPA c3 DESC;

Q7
Yan Carrier, avg(DepDelay) * 1000 AS c3 LATI akoko nibo Odun>= 2000 AND Odun <= 2008 Group BY Carrier;

Q8
Yan Odun, avg(DepDelay) LATI Ẹgbẹ akoko nipasẹ Ọdun;

Q9
yan Ọdun, ka (*) bi c1 lati ẹgbẹ akoko nipasẹ Ọdun;

Q10
Yan avg(cnt) LATI (Yiyan Odun,Osu,ka(*) AS cnt LATI akoko nibo DepDel15=1 GROUP BY Odun,Osu) a;

Q11
yan avg (c1) lati (yan Ọdun, Osu, kika (*) bi c1 lati ẹgbẹ akoko nipasẹ Ọdun, Oṣu) a;

Q12
Yan OriginCityName, DestCityName, count(*) AS c LATI Group time BY OriginCityName, DestCityName PERE NIPA c DESC LIMIT 10;

Q13
Yan OriginCityName, ka(*) AS c LATI Group time BY OriginCityName PERE NIPA c DESC LIMIT 10;

Ibeere Ni Awọn Ijọpọ

Q14
Yan Ọdun kan, c1/c2 LATI (yan Ọdun, kika()1000 bi c1 lati akoko NIBI DepDelay>10 GROUP BY Odun) INU INU (yan Odun, ka (*) bi c2 lati igba akoko GROUP BY Odun ) b on a.Year=b.Odun ASEJE BY a.Year;

Q15
Yan a.”Ọdun”, c1/c2 LATI (yan “Ọdun”, ka()1000 bi c1 LATI fontime NIBI “DepDelay”> Ẹgbẹ 10 NIPA “Ọdun”) Isopọ INU kan (yan “Ọdun”, ka (*) bi c2 FROM FROM GROUP BY “Ọdun”) b lori a.” Ọdun”=b. "Ọdun";

Tabili-1: Awọn ibeere ti a lo ninu ala

Awọn ipaniyan ibeere

Eyi ni awọn abajade ti ọkọọkan awọn ibeere nigba ṣiṣe ni oriṣiriṣi awọn eto data data: PostgreSQL pẹlu ati laisi atọka, ClickHouse abinibi ati clickhousedb_fdw. Akoko ti han ni milliseconds.

Q#
PostgreSQL
PostgreSQL (Itọkasi)
Tẹ Ile
clickhousedb_fdw

Q1
27920
19634
23
57

Q2
35124
17301
50
80

Q3
34046
15618
67
115

Q4
31632
7667
25
37

Q5
47220
8976
27
60

Q6
58233
24368
55
153

Q7
30566
13256
52
91

Q8
38309
60511
112
179

Q9
20674
37979
31
81

Q10
34990
20102
56
148

Q11
30489
51658
37
155

Q12
39357
33742
186
1333

Q13
29912
30709
101
384

Q14
54126
39913
124
1364212

Q15
97258
30211
245
259

Tabili-1: Akoko ti o gba lati ṣiṣẹ awọn ibeere ti a lo ninu ala

Wo awọn abajade

Aworan naa fihan akoko ipaniyan ibeere ni milliseconds, ipo X fihan nọmba ibeere lati awọn tabili loke, ati ipo Y ṣe afihan akoko ipaniyan ni milliseconds. Awọn abajade ClickHouse ati data ti a gba pada lati postgres ni lilo clickhousedb_fdw ti han. Lati tabili o le rii pe iyatọ nla wa laarin PostgreSQL ati ClickHouse, ṣugbọn iyatọ kekere laarin ClickHouse ati clickhousedb_fdw.

Idanwo iṣẹ ṣiṣe ti awọn ibeere itupalẹ ni PostgreSQL, ClickHouse ati clickhousedb_fdw (PostgreSQL)

Aworan yi fihan iyatọ laarin ClickhouseDB ati clickhousedb_fdw. Ninu ọpọlọpọ awọn ibeere, oke FDW ko ga bẹ ati pe ko ṣe pataki ayafi fun Q12. Ibeere yii pẹlu awọn idapọ ati BEERE NIPA gbolohun ọrọ. Nitori aṣẹ nipasẹ GROUP/nipasẹ gbolohun ọrọ, ORDER BY ko ju silẹ si ClickHouse.

Ni Tabili 2 a rii akoko fo ni awọn ibeere Q12 ati Q13. Lẹẹkansi, eyi ṣẹlẹ nipasẹ BEERE NIPA gbolohun ọrọ. Lati jẹrisi eyi, Mo ran awọn ibeere Q-14 ati Q-15 pẹlu ati laisi ORDER BY gbolohun ọrọ. Laisi ORDER BY gbolohun ọrọ ipari akoko jẹ 259ms ati pẹlu ORDER BY gbolohun ọrọ o jẹ 1364212. Lati ṣatunṣe ibeere yii Mo n ṣalaye awọn ibeere mejeeji ati eyi ni awọn abajade alaye.

Q15: Laisi aṣẹ nipasẹ Abala

bm=# EXPLAIN VERBOSE SELECT a."Year", c1/c2 
     FROM (SELECT "Year", count(*)*1000 AS c1 FROM fontime WHERE "DepDelay" > 10 GROUP BY "Year") a
     INNER JOIN(SELECT "Year", count(*) AS c2 FROM fontime GROUP BY "Year") b ON a."Year"=b."Year";

Q15: Ibeere Laisi Ibere ​​BY Abala

QUERY PLAN                                                      
Hash Join  (cost=2250.00..128516.06 rows=50000000 width=12)  
Output: fontime."Year", (((count(*) * 1000)) / b.c2)  
Inner Unique: true   Hash Cond: (fontime."Year" = b."Year")  
->  Foreign Scan  (cost=1.00..-1.00 rows=100000 width=12)        
Output: fontime."Year", ((count(*) * 1000))        
Relations: Aggregate on (fontime)        
Remote SQL: SELECT "Year", (count(*) * 1000) FROM "default".ontime WHERE (("DepDelay" > 10)) GROUP BY "Year"  
->  Hash  (cost=999.00..999.00 rows=100000 width=12)        
Output: b.c2, b."Year"        
->  Subquery Scan on b  (cost=1.00..999.00 rows=100000 width=12)              
Output: b.c2, b."Year"              
->  Foreign Scan  (cost=1.00..-1.00 rows=100000 width=12)                    
Output: fontime_1."Year", (count(*))                    
Relations: Aggregate on (fontime)                    
Remote SQL: SELECT "Year", count(*) FROM "default".ontime GROUP BY "Year"(16 rows)

Q14: Ibeere Pẹlu Ibere ​​BY Abala

bm=# EXPLAIN VERBOSE SELECT a."Year", c1/c2 FROM(SELECT "Year", count(*)*1000 AS c1 FROM fontime WHERE "DepDelay" > 10 GROUP BY "Year") a 
     INNER JOIN(SELECT "Year", count(*) as c2 FROM fontime GROUP BY "Year") b  ON a."Year"= b."Year" 
     ORDER BY a."Year";

Q14: Eto ibeere pẹlu BEERE NIPA Abala

QUERY PLAN 
Merge Join  (cost=2.00..628498.02 rows=50000000 width=12)   
Output: fontime."Year", (((count(*) * 1000)) / (count(*)))   
Inner Unique: true   Merge Cond: (fontime."Year" = fontime_1."Year")   
->  GroupAggregate  (cost=1.00..499.01 rows=1 width=12)        
Output: fontime."Year", (count(*) * 1000)         
Group Key: fontime."Year"         
->  Foreign Scan on public.fontime  (cost=1.00..-1.00 rows=100000 width=4)               
Remote SQL: SELECT "Year" FROM "default".ontime WHERE (("DepDelay" > 10)) 
            ORDER BY "Year" ASC   
->  GroupAggregate  (cost=1.00..499.01 rows=1 width=12)         
Output: fontime_1."Year", count(*)         Group Key: fontime_1."Year"         
->  Foreign Scan on public.fontime fontime_1  (cost=1.00..-1.00 rows=100000 width=4) 
              
Remote SQL: SELECT "Year" FROM "default".ontime ORDER BY "Year" ASC(16 rows)

ipari

Awọn abajade ti awọn adanwo wọnyi fihan pe ClickHouse nfunni ni iṣẹ ṣiṣe to dara gaan, ati clickhousedb_fdw nfunni ni awọn anfani iṣẹ ṣiṣe ti ClickHouse lati PostgreSQL. Lakoko ti o wa ni oke nigba lilo clickhousedb_fdw, o jẹ aifiyesi ati afiwera si iṣẹ ṣiṣe ti o waye nipasẹ ṣiṣe ni abinibi lori ibi ipamọ data ClickHouse. Eyi tun jẹrisi pe fdw ni PostgreSQL n pese awọn abajade to dara julọ.

Iwiregbe Telegram nipasẹ Clickhouse https://t.me/clickhouse_ru
Iwiregbe Telegram ni lilo PostgreSQL https://t.me/pgsql

orisun: www.habr.com

Fi ọrọìwòye kun