Testing the performance of analytical queries in PostgreSQL, ClickHouse and clickhousedb_fdw (PostgreSQL)

In this study I wanted to see what performance improvements can be achieved by using a ClickHouse data source rather than PostgreSQL. I know the performance advantages I get from using ClickHouse. Will those advantages persist if I access ClickHouse from PostgreSQL through a Foreign Data Wrapper (FDW)?

The database environments under study are PostgreSQL v11, clickhousedb_fdw and a ClickHouse database. Ultimately, from PostgreSQL v11 we will run various SQL queries routed through our clickhousedb_fdw to the ClickHouse database. Then we will see how the FDW's performance compares with the same queries executed in native PostgreSQL and native ClickHouse.

ClickHouse Database

ClickHouse is an open source column-oriented database management system that can achieve performance 100 to 1000 times faster than traditional database approaches, and can process more than a billion rows in less than a second.

Clickhousedb_fdw

clickhousedb_fdw, a foreign data wrapper (FDW) for the ClickHouse database, is an open source project from Percona. The project's repository is hosted on GitHub.

In March I wrote a blog post that tells you more about our FDW.

As you will see, it provides an FDW for ClickHouse that lets you SELECT from, and INSERT INTO, a ClickHouse database from a PostgreSQL v11 server.

The FDW supports advanced features such as aggregate pushdown and join pushdown. These significantly improve performance by using the remote server's resources for these resource-intensive operations.

Benchmark environment

  • Supermicro server:
    • Intel® Xeon® CPU E5-2683 v3 @ 2.00GHz
    • 2 sockets / 28 cores / 56 threads
    • Memory: 256GB of RAM
    • Storage: Samsung SM863 1.9TB Enterprise SSD
    • Filesystem: ext4/xfs
  • OS: Linux smblade01 4.15.0-42-generic #45~16.04.1-Ubuntu
  • PostgreSQL: version 11

Benchmark tests

Instead of using a machine-generated data set for this test, we used the "On Time Reporting Carrier On-Time Performance" data from 1987 to 2018. You can fetch the data with the script we provide.

The database is 85 GB in size and consists of a single table with 109 columns.

Benchmark queries

Here are the queries I used to compare ClickHouse, clickhousedb_fdw and PostgreSQL.

Q#
Query contains aggregates and GROUP BY

Q1
SELECT DayOfWeek, count(*) AS c FROM ontime WHERE Year >= 2000 AND Year <= 2008 GROUP BY DayOfWeek ORDER BY c DESC;

Q2
SELECT DayOfWeek, count(*) AS c FROM ontime WHERE DepDelay>10 AND Year >= 2000 AND Year <= 2008 GROUP BY DayOfWeek ORDER BY c DESC;

Q3
SELECT Origin, count(*) AS c FROM ontime WHERE DepDelay>10 AND Year >= 2000 AND Year <= 2008 GROUP BY Origin ORDER BY c DESC LIMIT 10;

Q4
SELECT Carrier, count(*) FROM ontime WHERE DepDelay>10 AND Year = 2007 GROUP BY Carrier ORDER BY count(*) DESC;

Q5
SELECT a.Carrier, c, c2, c*1000/c2 AS c3 FROM ( SELECT Carrier, count(*) AS c FROM ontime WHERE DepDelay>10 AND Year=2007 GROUP BY Carrier ) a INNER JOIN ( SELECT Carrier, count(*) AS c2 FROM ontime WHERE Year=2007 GROUP BY Carrier ) b ON a.Carrier=b.Carrier ORDER BY c3 DESC;

Q6
SELECT a.Carrier, c, c2, c*1000/c2 AS c3 FROM ( SELECT Carrier, count(*) AS c FROM ontime WHERE DepDelay>10 AND Year >= 2000 AND Year <= 2008 GROUP BY Carrier ) a INNER JOIN ( SELECT Carrier, count(*) AS c2 FROM ontime WHERE Year >= 2000 AND Year <= 2008 GROUP BY Carrier ) b ON a.Carrier=b.Carrier ORDER BY c3 DESC;

Q7
SELECT Carrier, avg(DepDelay) * 1000 AS c3 FROM ontime WHERE Year >= 2000 AND Year <= 2008 GROUP BY Carrier;

Q8
SELECT Year, avg(DepDelay) FROM ontime GROUP BY Year;

Q9
select Year, count(*) as c1 from ontime group by Year;

Q10
SELECT avg(cnt) FROM (SELECT Year, Month, count(*) AS cnt FROM ontime WHERE DepDel15=1 GROUP BY Year, Month) a;

Q11
select avg(c1) from (select Year, Month, count(*) as c1 from ontime group by Year, Month) a;

Q12
SELECT OriginCityName, DestCityName, count(*) AS c FROM ontime GROUP BY OriginCityName, DestCityName ORDER BY c DESC LIMIT 10;

Q13
SELECT OriginCityName, count(*) AS c FROM ontime GROUP BY OriginCityName ORDER BY c DESC LIMIT 10;

Query contains joins

Q14
SELECT a.Year, c1/c2 FROM ( select Year, count(*)*1000 as c1 from ontime WHERE DepDelay>10 GROUP BY Year ) a INNER JOIN ( select Year, count(*) as c2 from ontime GROUP BY Year ) b on a.Year=b.Year ORDER BY a.Year;

Q15
SELECT a."Year", c1/c2 FROM ( select "Year", count(*)*1000 as c1 FROM fontime WHERE "DepDelay">10 GROUP BY "Year" ) a INNER JOIN ( select "Year", count(*) as c2 FROM fontime GROUP BY "Year" ) b on a."Year"=b."Year";

Table-1: Queries used in the benchmark

Query execution

Here are the results of each query when run in the different database setups: PostgreSQL with and without indexes, native ClickHouse, and clickhousedb_fdw. Times are shown in milliseconds.

Q#    PostgreSQL   PostgreSQL (Indexed)   ClickHouse   clickhousedb_fdw
Q1    27920        19634                  23           57
Q2    35124        17301                  50           80
Q3    34046        15618                  67           115
Q4    31632        7667                   25           37
Q5    47220        8976                   27           60
Q6    58233        24368                  55           153
Q7    30566        13256                  52           91
Q8    38309        60511                  112          179
Q9    20674        37979                  31           81
Q10   34990        20102                  56           148
Q11   30489        51658                  37           155
Q12   39357        33742                  186          1333
Q13   29912        30709                  101          384
Q14   54126        39913                  124          1364212
Q15   97258        30211                  245          259

Table-2: Time taken (ms) to execute the queries used in the benchmark

Reviewing the results

The graph shows query execution time in milliseconds: the X axis shows the query number from the tables above, and the Y axis shows the execution time in milliseconds. Results are shown for ClickHouse and for data accessed from postgres through clickhousedb_fdw. From the table you can see a huge difference between PostgreSQL and ClickHouse, but only a small difference between ClickHouse and clickhousedb_fdw.


This graph shows the difference between ClickHouse and clickhousedb_fdw. For most queries the FDW overhead is small and barely significant, except for Q12. That query combines GROUP BY with an ORDER BY clause, and because of the ORDER BY clause neither the GROUP BY nor the ORDER BY is pushed down to ClickHouse.
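To put a number on that overhead, the clickhousedb_fdw time can be divided by the native ClickHouse time for each query. The sketch below is only an illustration: the timings are copied from the ClickHouse and clickhousedb_fdw columns of the results table above, and the names `timings_ms` and `fdw_overhead` are made up for this example.

```python
# Execution times in milliseconds, copied from the results table:
# query -> (native ClickHouse, clickhousedb_fdw)
timings_ms = {
    "Q1": (23, 57),
    "Q2": (50, 80),
    "Q3": (67, 115),
    "Q4": (25, 37),
    "Q5": (27, 60),
    "Q6": (55, 153),
    "Q7": (52, 91),
    "Q8": (112, 179),
    "Q9": (31, 81),
    "Q10": (56, 148),
    "Q11": (37, 155),
    "Q12": (186, 1333),
    "Q13": (101, 384),
    "Q14": (124, 1364212),
    "Q15": (245, 259),
}


def fdw_overhead(timings):
    """Return {query: fdw_time / clickhouse_time} for every query."""
    return {query: fdw / native for query, (native, fdw) in timings.items()}


ratios = fdw_overhead(timings_ms)

# Query with the largest relative slowdown when going through the FDW.
worst = max(ratios, key=ratios.get)

for query, ratio in ratios.items():
    native, fdw = timings_ms[query]
    print(f"{query}: {ratio:.2f}x ({fdw - native} ms slower)")
print("largest overhead:", worst)
```

Ratios can look large for queries that finish in tens of milliseconds, so it helps to read them next to the absolute difference, which the loop prints as well.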

In Table 2 we see a jump in time for queries Q12 and Q13. Again, this is caused by the ORDER BY clause. To confirm this, I ran Q14 and Q15, which are the same join with and without the ORDER BY clause. Without the ORDER BY clause the completion time is 259 ms; with the ORDER BY clause it is 1364212 ms. To debug this, I ran EXPLAIN on both queries; the results follow.

Q15: Query without the ORDER BY clause

bm=# EXPLAIN VERBOSE SELECT a."Year", c1/c2 
     FROM (SELECT "Year", count(*)*1000 AS c1 FROM fontime WHERE "DepDelay" > 10 GROUP BY "Year") a
     INNER JOIN(SELECT "Year", count(*) AS c2 FROM fontime GROUP BY "Year") b ON a."Year"=b."Year";

Q15: Query plan without the ORDER BY clause

QUERY PLAN
Hash Join  (cost=2250.00..128516.06 rows=50000000 width=12)
  Output: fontime."Year", (((count(*) * 1000)) / b.c2)
  Inner Unique: true
  Hash Cond: (fontime."Year" = b."Year")
  ->  Foreign Scan  (cost=1.00..-1.00 rows=100000 width=12)
        Output: fontime."Year", ((count(*) * 1000))
        Relations: Aggregate on (fontime)
        Remote SQL: SELECT "Year", (count(*) * 1000) FROM "default".ontime WHERE (("DepDelay" > 10)) GROUP BY "Year"
  ->  Hash  (cost=999.00..999.00 rows=100000 width=12)
        Output: b.c2, b."Year"
        ->  Subquery Scan on b  (cost=1.00..999.00 rows=100000 width=12)
              Output: b.c2, b."Year"
              ->  Foreign Scan  (cost=1.00..-1.00 rows=100000 width=12)
                    Output: fontime_1."Year", (count(*))
                    Relations: Aggregate on (fontime)
                    Remote SQL: SELECT "Year", count(*) FROM "default".ontime GROUP BY "Year"
(16 rows)

Q14: Query with the ORDER BY clause

bm=# EXPLAIN VERBOSE SELECT a."Year", c1/c2 FROM(SELECT "Year", count(*)*1000 AS c1 FROM fontime WHERE "DepDelay" > 10 GROUP BY "Year") a 
     INNER JOIN(SELECT "Year", count(*) as c2 FROM fontime GROUP BY "Year") b  ON a."Year"= b."Year" 
     ORDER BY a."Year";

Q14: Query plan with the ORDER BY clause

QUERY PLAN
Merge Join  (cost=2.00..628498.02 rows=50000000 width=12)
  Output: fontime."Year", (((count(*) * 1000)) / (count(*)))
  Inner Unique: true
  Merge Cond: (fontime."Year" = fontime_1."Year")
  ->  GroupAggregate  (cost=1.00..499.01 rows=1 width=12)
        Output: fontime."Year", (count(*) * 1000)
        Group Key: fontime."Year"
        ->  Foreign Scan on public.fontime  (cost=1.00..-1.00 rows=100000 width=4)
              Remote SQL: SELECT "Year" FROM "default".ontime WHERE (("DepDelay" > 10)) ORDER BY "Year" ASC
  ->  GroupAggregate  (cost=1.00..499.01 rows=1 width=12)
        Output: fontime_1."Year", count(*)
        Group Key: fontime_1."Year"
        ->  Foreign Scan on public.fontime fontime_1  (cost=1.00..-1.00 rows=100000 width=4)
              Remote SQL: SELECT "Year" FROM "default".ontime ORDER BY "Year" ASC
(16 rows)
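The two plans differ in what the Remote SQL asks ClickHouse to do. Without ORDER BY, the remote query already contains GROUP BY "Year", so only one row per year comes back. With ORDER BY, the remote query returns every "Year" value, sorted, and the GroupAggregate node counts them inside PostgreSQL. The toy model below illustrates that difference in rows transferred. It is not FDW code: `remote_rows` is a made-up stand-in for the ontime table, and both functions are hypothetical.

```python
from collections import Counter

# Hypothetical stand-in for the remote table: one "Year" value per row.
remote_rows = [1987] * 5 + [1988] * 3 + [1989] * 4


def with_pushdown(rows):
    """Remote side runs GROUP BY; one (year, count) pair per group is transferred."""
    transferred = sorted(Counter(rows).items())
    return dict(transferred), len(transferred)


def without_pushdown(rows):
    """Remote side only sorts; every row is transferred and grouped locally."""
    transferred = sorted(rows)
    counts = {}
    for year in transferred:
        counts[year] = counts.get(year, 0) + 1
    return counts, len(transferred)


pushed, pushed_n = with_pushdown(remote_rows)
local, local_n = without_pushdown(remote_rows)

print("same result:", pushed == local)
print("rows transferred with pushdown:", pushed_n)
print("rows transferred without pushdown:", local_n)
```

Both paths produce the same counts, but the second one moves one row per source row across the connection, which is the pattern the Q14 plan shows against the full ontime table.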

Conclusion

The results of these experiments show that ClickHouse offers very good performance, and that clickhousedb_fdw brings the performance benefits of ClickHouse to PostgreSQL. Although there is some overhead when using clickhousedb_fdw, it is negligible and comparable to the performance achieved when running natively in the ClickHouse database. This also confirms that FDW pushdown in PostgreSQL delivers excellent results.

Telegram chat about ClickHouse: https://t.me/clickhouse_ru
Telegram chat about PostgreSQL: https://t.me/pgsql

Source: www.habr.com
