Performance testing of analytical queries in PostgreSQL, ClickHouse and clickhousedb_fdw (PostgreSQL)

In this study, I wanted to see what performance improvements can be achieved by using a ClickHouse data source rather than PostgreSQL. I know the productivity benefits I get from using ClickHouse. Will those benefits persist if I access ClickHouse from PostgreSQL through a Foreign Data Wrapper (FDW)?

The database environments studied are PostgreSQL v11, clickhousedb_fdw and a ClickHouse database. From PostgreSQL v11 we will run various SQL queries that are routed through our clickhousedb_fdw to the ClickHouse database. We will then see how the FDW's performance compares with the same queries running in native PostgreSQL and in native ClickHouse.

ClickHouse Database

ClickHouse is an open-source column-oriented database management system that can achieve performance 100-1000 times faster than traditional database approaches, and is capable of processing more than a billion rows in under a second.

Clickhousedb_fdw

clickhousedb_fdw, the foreign data wrapper (FDW) for the ClickHouse database, is an open-source project from Percona. The project is hosted on GitHub.

In March I wrote a blog post that says more about our FDW.

As you will see, it provides an FDW for ClickHouse that lets you SELECT from, and INSERT INTO, a ClickHouse database from a PostgreSQL v11 server.

The FDW supports advanced features such as aggregate and join pushdown. These significantly improve performance because the resource-intensive operations run on the remote server, using its resources.

Benchmark environment

  • Supermicro server:
    • Intel® Xeon® CPU E5-2683 v3 @ 2.00GHz
    • 2 sockets / 28 cores / 56 threads
    • Memory: 256GB of RAM
    • Storage: Samsung SM863 1.9TB Enterprise SSD
    • Filesystem: ext4/xfs
  • OS: Linux smblade01 4.15.0-42-generic #45~16.04.1-Ubuntu
  • PostgreSQL: version 11

Benchmark tests

Instead of using a machine-generated data set for this test, we used the "On Time Reporting Carrier On-Time Performance" data from 1987 to 2018. The data can be downloaded with the script we provide.

The size of the database is 85 GB, providing a single table of 109 columns.

Benchmark queries

Here are the queries I used to benchmark ClickHouse, clickhousedb_fdw and PostgreSQL.

Q# | Query
---|------
   | Queries containing aggregates and GROUP BY
Q1 | SELECT DayOfWeek, count(*) AS c FROM ontime WHERE Year >= 2000 AND Year <= 2008 GROUP BY DayOfWeek ORDER BY c DESC;
Q2 | SELECT DayOfWeek, count(*) AS c FROM ontime WHERE DepDelay>10 AND Year >= 2000 AND Year <= 2008 GROUP BY DayOfWeek ORDER BY c DESC;
Q3 | SELECT Origin, count(*) AS c FROM ontime WHERE DepDelay>10 AND Year >= 2000 AND Year <= 2008 GROUP BY Origin ORDER BY c DESC LIMIT 10;
Q4 | SELECT Carrier, count(*) FROM ontime WHERE DepDelay>10 AND Year = 2007 GROUP BY Carrier ORDER BY count(*) DESC;
Q5 | SELECT a.Carrier, c, c2, c*1000/c2 AS c3 FROM ( SELECT Carrier, count(*) AS c FROM ontime WHERE DepDelay>10 AND Year=2007 GROUP BY Carrier ) a INNER JOIN ( SELECT Carrier, count(*) AS c2 FROM ontime WHERE Year=2007 GROUP BY Carrier ) b ON a.Carrier=b.Carrier ORDER BY c3 DESC;
Q6 | SELECT a.Carrier, c, c2, c*1000/c2 AS c3 FROM ( SELECT Carrier, count(*) AS c FROM ontime WHERE DepDelay>10 AND Year >= 2000 AND Year <= 2008 GROUP BY Carrier ) a INNER JOIN ( SELECT Carrier, count(*) AS c2 FROM ontime WHERE Year >= 2000 AND Year <= 2008 GROUP BY Carrier ) b ON a.Carrier=b.Carrier ORDER BY c3 DESC;
Q7 | SELECT Carrier, avg(DepDelay) * 1000 AS c3 FROM ontime WHERE Year >= 2000 AND Year <= 2008 GROUP BY Carrier;
Q8 | SELECT Year, avg(DepDelay) FROM ontime GROUP BY Year;
Q9 | select Year, count(*) as c1 from ontime group by Year;
Q10 | SELECT avg(cnt) FROM (SELECT Year, Month, count(*) AS cnt FROM ontime WHERE DepDel15=1 GROUP BY Year, Month) a;
Q11 | select avg(c1) from (select Year, Month, count(*) as c1 from ontime group by Year, Month) a;
Q12 | SELECT OriginCityName, DestCityName, count(*) AS c FROM ontime GROUP BY OriginCityName, DestCityName ORDER BY c DESC LIMIT 10;
Q13 | SELECT OriginCityName, count(*) AS c FROM ontime GROUP BY OriginCityName ORDER BY c DESC LIMIT 10;
   | Queries containing joins
Q14 | SELECT a.Year, c1/c2 FROM ( select Year, count(*)*1000 as c1 from ontime WHERE DepDelay>10 GROUP BY Year ) a INNER JOIN ( select Year, count(*) as c2 from ontime GROUP BY Year ) b on a.Year=b.Year ORDER BY a.Year;
Q15 | SELECT a."Year", c1/c2 FROM ( select "Year", count(*)*1000 as c1 FROM fontime WHERE "DepDelay">10 GROUP BY "Year" ) a INNER JOIN ( select "Year", count(*) as c2 FROM fontime GROUP BY "Year" ) b on a."Year"=b."Year";

Table-1: Queries used in the benchmark

Query execution

Here are the results for each query when run against the different database setups: PostgreSQL with and without indexes, native ClickHouse, and clickhousedb_fdw. Times are shown in milliseconds.

Q#  | PostgreSQL | PostgreSQL (Indexed) | ClickHouse | clickhousedb_fdw
----|------------|----------------------|------------|-----------------
Q1  | 27920      | 19634                | 23         | 57
Q2  | 35124      | 17301                | 50         | 80
Q3  | 34046      | 15618                | 67         | 115
Q4  | 31632      | 7667                 | 25         | 37
Q5  | 47220      | 8976                 | 27         | 60
Q6  | 58233      | 24368                | 55         | 153
Q7  | 30566      | 13256                | 52         | 91
Q8  | 38309      | 60511                | 112        | 179
Q9  | 20674      | 37979                | 31         | 81
Q10 | 34990      | 20102                | 56         | 148
Q11 | 30489      | 51658                | 37         | 155
Q12 | 39357      | 33742                | 186        | 1333
Q13 | 29912      | 30709                | 101        | 384
Q14 | 54126      | 39913                | 124        | 1364212
Q15 | 97258      | 30211                | 245        | 259

Table-2: Time taken (ms) to execute the queries used in the benchmark
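
To make the gaps in the results table concrete, the timings can be turned into ratios. The short Python sketch below copies the four timing columns verbatim from the table and computes, per query, how many times slower the clickhousedb_fdw run is than native ClickHouse, and how it compares with plain PostgreSQL. The variable and function names are illustrative and are not part of the benchmark itself.

```python
# Timings in milliseconds, copied from the results table:
# query -> (PostgreSQL, PostgreSQL indexed, ClickHouse, clickhousedb_fdw)
TIMINGS_MS = {
    "Q1": (27920, 19634, 23, 57),
    "Q2": (35124, 17301, 50, 80),
    "Q3": (34046, 15618, 67, 115),
    "Q4": (31632, 7667, 25, 37),
    "Q5": (47220, 8976, 27, 60),
    "Q6": (58233, 24368, 55, 153),
    "Q7": (30566, 13256, 52, 91),
    "Q8": (38309, 60511, 112, 179),
    "Q9": (20674, 37979, 31, 81),
    "Q10": (34990, 20102, 56, 148),
    "Q11": (30489, 51658, 37, 155),
    "Q12": (39357, 33742, 186, 1333),
    "Q13": (29912, 30709, 101, 384),
    "Q14": (54126, 39913, 124, 1364212),
    "Q15": (97258, 30211, 245, 259),
}


def fdw_overhead(timings):
    """Return {query: fdw_ms / clickhouse_ms}: how many times slower the FDW run is."""
    return {q: fdw / ch for q, (_pg, _pg_idx, ch, fdw) in timings.items()}


def fdw_speedup_over_postgres(timings):
    """Return {query: postgres_ms / fdw_ms} for plain (non-indexed) PostgreSQL."""
    return {q: pg / fdw for q, (pg, _pg_idx, _ch, fdw) in timings.items()}


if __name__ == "__main__":
    overhead = fdw_overhead(TIMINGS_MS)
    speedup = fdw_speedup_over_postgres(TIMINGS_MS)
    for q in TIMINGS_MS:
        print(f"{q:>3}: FDW takes {overhead[q]:9.1f}x the native ClickHouse time, "
              f"PostgreSQL/FDW ratio {speedup[q]:8.2f}")
```

On these numbers every query except Q14 stays within a single-digit multiple of native ClickHouse, and Q14 is the only query where the FDW run is slower than plain PostgreSQL.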

Reviewing the results

The graph shows the query execution time in milliseconds: the X-axis shows the query number from the tables above, and the Y-axis shows the execution time in milliseconds. It plots the results for ClickHouse and for data retrieved into postgres through clickhousedb_fdw. From the table you can see that there is a huge difference between PostgreSQL and ClickHouse, but only a small difference between ClickHouse and clickhousedb_fdw.

[Figure: query execution time in milliseconds per query, ClickHouse vs. clickhousedb_fdw]

This graph shows the difference between ClickHouse and clickhousedb_fdw. For most queries the FDW overhead is not that high and is barely significant, except for Q12. That query combines aggregation with an ORDER BY clause. Because of the ORDER BY clause, the GROUP BY and ORDER BY are not pushed down to ClickHouse.

In Table-2 we see a jump in execution time for queries Q12 and Q13. Again, this is caused by the ORDER BY clause. To confirm this, I ran queries Q14 and Q15, which are the same join with and without the ORDER BY clause. Without the ORDER BY clause the completion time is 259 ms, and with the ORDER BY clause it is 1364212 ms. To debug this, I ran EXPLAIN on both queries; the results follow.

Q15: Query Without ORDER BY Clause

bm=# EXPLAIN VERBOSE SELECT a."Year", c1/c2 
     FROM (SELECT "Year", count(*)*1000 AS c1 FROM fontime WHERE "DepDelay" > 10 GROUP BY "Year") a
     INNER JOIN(SELECT "Year", count(*) AS c2 FROM fontime GROUP BY "Year") b ON a."Year"=b."Year";

Q15: Query Plan Without ORDER BY Clause

QUERY PLAN
 Hash Join  (cost=2250.00..128516.06 rows=50000000 width=12)
   Output: fontime."Year", (((count(*) * 1000)) / b.c2)
   Inner Unique: true
   Hash Cond: (fontime."Year" = b."Year")
   ->  Foreign Scan  (cost=1.00..-1.00 rows=100000 width=12)
         Output: fontime."Year", ((count(*) * 1000))
         Relations: Aggregate on (fontime)
         Remote SQL: SELECT "Year", (count(*) * 1000) FROM "default".ontime WHERE (("DepDelay" > 10)) GROUP BY "Year"
   ->  Hash  (cost=999.00..999.00 rows=100000 width=12)
         Output: b.c2, b."Year"
         ->  Subquery Scan on b  (cost=1.00..999.00 rows=100000 width=12)
               Output: b.c2, b."Year"
               ->  Foreign Scan  (cost=1.00..-1.00 rows=100000 width=12)
                     Output: fontime_1."Year", (count(*))
                     Relations: Aggregate on (fontime)
                     Remote SQL: SELECT "Year", count(*) FROM "default".ontime GROUP BY "Year"
(16 rows)

Q14: Query With ORDER BY Clause

bm=# EXPLAIN VERBOSE SELECT a."Year", c1/c2 FROM(SELECT "Year", count(*)*1000 AS c1 FROM fontime WHERE "DepDelay" > 10 GROUP BY "Year") a 
     INNER JOIN(SELECT "Year", count(*) as c2 FROM fontime GROUP BY "Year") b  ON a."Year"= b."Year" 
     ORDER BY a."Year";

Q14: Query Plan With ORDER BY Clause

QUERY PLAN
 Merge Join  (cost=2.00..628498.02 rows=50000000 width=12)
   Output: fontime."Year", (((count(*) * 1000)) / (count(*)))
   Inner Unique: true
   Merge Cond: (fontime."Year" = fontime_1."Year")
   ->  GroupAggregate  (cost=1.00..499.01 rows=1 width=12)
         Output: fontime."Year", (count(*) * 1000)
         Group Key: fontime."Year"
         ->  Foreign Scan on public.fontime  (cost=1.00..-1.00 rows=100000 width=4)
               Remote SQL: SELECT "Year" FROM "default".ontime WHERE (("DepDelay" > 10)) ORDER BY "Year" ASC
   ->  GroupAggregate  (cost=1.00..499.01 rows=1 width=12)
         Output: fontime_1."Year", count(*)
         Group Key: fontime_1."Year"
         ->  Foreign Scan on public.fontime fontime_1  (cost=1.00..-1.00 rows=100000 width=4)
               Remote SQL: SELECT "Year" FROM "default".ontime ORDER BY "Year" ASC
(16 rows)
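
The two plans differ in where the aggregation happens. In the plan without ORDER BY, the Remote SQL contains the GROUP BY, so ClickHouse returns one row per year. In the plan with ORDER BY, the Remote SQL is a plain SELECT "Year" ... ORDER BY "Year", so every matching row travels to PostgreSQL, which then runs GroupAggregate locally. The Python sketch below is a toy model of that difference, not the FDW's code: the in-memory list is a hypothetical stand-in for the remote table, and its row counts are invented for illustration.

```python
from collections import Counter

# Hypothetical stand-in for the remote table: one "Year" value per flight row.
# The sizes are made up for illustration; the real table is far larger.
REMOTE_ROWS = [year for year in range(1987, 2019) for _ in range(1000)]


def aggregate_pushed_down(remote_rows):
    """Model of the plan without ORDER BY: the remote side groups, only groups travel."""
    shipped = sorted(Counter(remote_rows).items())  # runs "remotely"
    return dict(shipped), len(shipped)


def aggregate_locally(remote_rows):
    """Model of the plan with ORDER BY: the remote side only sorts, every row travels."""
    shipped = sorted(remote_rows)  # runs "remotely"; mirrors ORDER BY "Year" ASC
    counts = {}
    for year in shipped:  # runs "locally"; mirrors GroupAggregate over sorted input
        counts[year] = counts.get(year, 0) + 1
    return counts, len(shipped)


if __name__ == "__main__":
    pushed, pushed_rows = aggregate_pushed_down(REMOTE_ROWS)
    local, local_rows = aggregate_locally(REMOTE_ROWS)
    assert pushed == local  # same answer either way
    print(f"rows shipped with pushdown:    {pushed_rows}")
    print(f"rows shipped without pushdown: {local_rows}")
```

Both functions return the same counts. What changes is the number of rows crossing the wire: in the second case it grows with the size of the table, in the first only with the number of distinct years.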

Conclusion

The results of this experiment show that ClickHouse offers really good performance, and that clickhousedb_fdw brings the performance benefits of ClickHouse to PostgreSQL. While there is some overhead when using clickhousedb_fdw, for most queries it is negligible, and the performance is comparable to running natively in the ClickHouse database. This also confirms that the FDW in PostgreSQL delivers excellent results.

Telegram chat about ClickHouse https://t.me/clickhouse_ru
Telegram chat about PostgreSQL https://t.me/pgsql

Source: www.habr.com
