Modern CPUs have many cores. For many years, applications have been sending queries to databases in parallel. When a reporting query processes many rows of a table, it runs faster if it can use several CPUs, and PostgreSQL has been able to do this since version 9.6.
It took 3 years to implement the parallel query feature, because the code had to be rewritten at several stages of query execution. PostgreSQL 9.6 introduced the infrastructure for further improvements. In later versions, more query types are executed in parallel.
Limitations
Do not enable parallel execution if all cores are already busy; otherwise other queries will slow down.
Most importantly, parallel processing with high WORK_MEM values uses a lot of memory: every hash join or sort operation takes up to work_mem of memory.
Low-latency OLTP queries cannot be sped up by parallel execution. And if a query returns a single row, parallel processing will only slow it down.
Developers like to use the TPC-H benchmark. You may have similar queries that are ideal for parallel execution.
Only SELECT queries without predicate locks are executed in parallel.
Sometimes proper indexing is better than a parallel sequential table scan.
Suspending queries and cursors are not supported.
Window functions and ordered-set aggregate functions are not parallel.
You gain nothing on an I/O-bound workload.
There are no parallel sorting algorithms. However, queries with sorts can still run in parallel in some parts.
Replace a CTE (WITH ...) with a nested SELECT to enable parallel processing.
Foreign data wrappers do not yet support parallel processing (but they could!)
FULL OUTER JOIN is not supported.
Setting max_rows disables parallel processing.
If a query uses a function that is not marked PARALLEL SAFE, it will be single-threaded.
The SERIALIZABLE transaction isolation level disables parallel processing.
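Two of the limitations above are easy to illustrate. The following is a minimal sketch, assuming a hypothetical function add_tax (not part of the original article) and the lineitem table from the TPC-H schema used later in the text. It marks a function as parallel safe and rewrites a CTE as a nested SELECT; whether the planner then picks a parallel plan still depends on the server version and cost settings.

```sql
-- Hypothetical helper: the name and the 1.2 factor are illustrative only.
-- PARALLEL SAFE is a promise to the planner, so use it only for functions
-- that do not write to the database or rely on session-local state.
CREATE FUNCTION add_tax(price numeric) RETURNS numeric
    LANGUAGE sql IMMUTABLE PARALLEL SAFE
    AS $$ SELECT price * 1.2 $$;

-- Check how a function is marked:
-- 's' = safe, 'r' = restricted, 'u' = unsafe (the default).
SELECT proname, proparallel FROM pg_proc WHERE proname = 'add_tax';

-- CTE form of a query
WITH shipped AS (
    SELECT l_quantity FROM lineitem WHERE l_shipdate <= date '1998-09-01'
)
SELECT sum(l_quantity) FROM shipped;

-- The same query written with a nested SELECT
SELECT sum(l_quantity)
FROM (
    SELECT l_quantity FROM lineitem WHERE l_shipdate <= date '1998-09-01'
) AS shipped;
```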
Test environment
The PostgreSQL developers tried to reduce the response time of the TPC-H benchmark queries. Download the benchmark and adapt it to PostgreSQL. This is an unofficial use of the TPC-H benchmark; it is not meant for comparing databases or hardware.
Rename makefile.suite to Makefile and modify it as described here: https://github.com/tvondra/pg_tpch . Compile the code with the make command.
Generate the data: ./dbgen -s 10 creates a 23 GB database. This is enough to see the difference in performance between parallel and non-parallel queries.
Convert the tbl files to csv with for and sed.
Clone the pg_tpch repository and copy the csv files to pg_tpch/dss/data.
Generate the queries with the qgen command.
Load the data into the database with the ./tpch.sh command.
Parallel sequential scan
It may be faster not because of parallel reads, but because the data is spread across many CPU cores. In modern operating systems, PostgreSQL data files are cached well. With read-ahead, a larger block can be fetched from storage than the PG daemon requests. Therefore query performance is not limited by disk I/O. CPU cycles are spent on:
reading rows one at a time from the table pages;
comparing row values against the WHERE conditions.
Let's run a simple select query:
tpch=# explain analyze select l_quantity as sum_qty from lineitem where l_shipdate <= date '1998-12-01' - interval '105' day;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Seq Scan on lineitem (cost=0.00..1964772.00 rows=58856235 width=5) (actual time=0.014..16951.669 rows=58839715 loops=1)
Filter: (l_shipdate <= '1998-08-18 00:00:00'::timestamp without time zone)
Rows Removed by Filter: 1146337
Planning Time: 0.203 ms
Execution Time: 19035.100 ms
A sequential scan without aggregation returns too many rows, so the query is executed by a single CPU core.
If you add SUM(), you can see that two worker processes help speed up the query:
explain analyze select sum(l_quantity) as sum_qty from lineitem where l_shipdate <= date '1998-12-01' - interval '105' day;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=1589702.14..1589702.15 rows=1 width=32) (actual time=8553.365..8553.365 rows=1 loops=1)
-> Gather (cost=1589701.91..1589702.12 rows=2 width=32) (actual time=8553.241..8555.067 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=1588701.91..1588701.92 rows=1 width=32) (actual time=8547.546..8547.546 rows=1 loops=3)
-> Parallel Seq Scan on lineitem (cost=0.00..1527393.33 rows=24523431 width=5) (actual time=0.038..5998.417 rows=19613238 loops=3)
Filter: (l_shipdate <= '1998-08-18 00:00:00'::timestamp without time zone)
Rows Removed by Filter: 382112
Planning Time: 0.241 ms
Execution Time: 8555.131 ms
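To reproduce this kind of comparison on your own data, one option is to switch parallelism off and on for the current session. This is only a sketch, assuming the same tpch database and lineitem table; the timings you get will differ from the plans shown here.

```sql
-- Force a serial plan for the current session only.
SET max_parallel_workers_per_gather = 0;
EXPLAIN ANALYZE
SELECT sum(l_quantity) AS sum_qty
FROM lineitem
WHERE l_shipdate <= date '1998-12-01' - interval '105' day;

-- Allow up to two workers per Gather node again
-- (2 is the default in PostgreSQL 10 and later, 0 in 9.6).
SET max_parallel_workers_per_gather = 2;
EXPLAIN ANALYZE
SELECT sum(l_quantity) AS sum_qty
FROM lineitem
WHERE l_shipdate <= date '1998-12-01' - interval '105' day;
```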
Parallel aggregation
The "Parallel Seq Scan" node produces rows for partial aggregation. The "Partial Aggregate" node reduces these rows using SUM(). Finally, the SUM counter from each worker process is collected by the "Gather" node.
The final result is calculated by the "Finalize Aggregate" node. If you have your own aggregate functions, do not forget to mark them as "parallel safe".
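As a rough illustration of what "parallel safe" means for a custom aggregate, here is a minimal sketch. The aggregate name my_sum is hypothetical; it reuses the built-in int8pl function (bigint addition) both as the state transition function and as the combine function. The combine function is what lets the planner merge the partial states produced by each worker.

```sql
-- Hypothetical aggregate: a bigint sum that can be computed in parts.
CREATE AGGREGATE my_sum(bigint) (
    SFUNC       = int8pl,   -- adds one input value to the running state
    STYPE       = bigint,
    COMBINEFUNC = int8pl,   -- merges two partial states
    PARALLEL    = SAFE      -- without this the aggregate blocks parallel plans
);

-- Quick sanity check on a small generated set; returns 55.
SELECT my_sum(x) FROM generate_series(1, 10) AS t(x);
```

Without a COMBINEFUNC the aggregate cannot take part in partial aggregation, even if it is marked PARALLEL = SAFE.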
Number of worker processes
The number of worker processes can be increased without restarting the server:
explain analyze select sum(l_quantity) as sum_qty from lineitem where l_shipdate <= date '1998-12-01' - interval '105' day;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=1589702.14..1589702.15 rows=1 width=32) (actual time=8553.365..8553.365 rows=1 loops=1)
-> Gather (cost=1589701.91..1589702.12 rows=2 width=32) (actual time=8553.241..8555.067 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=1588701.91..1588701.92 rows=1 width=32) (actual time=8547.546..8547.546 rows=1 loops=3)
-> Parallel Seq Scan on lineitem (cost=0.00..1527393.33 rows=24523431 width=5) (actual time=0.038..5998.417 rows=19613238 loops=3)
Filter: (l_shipdate <= '1998-08-18 00:00:00'::timestamp without time zone)
Rows Removed by Filter: 382112
Planning Time: 0.241 ms
Execution Time: 8555.131 ms
What happened here? There are twice as many worker processes, yet the query became only 1.6599 times faster. The numbers are in fact as expected: we had two worker processes and one leader, and after the change it became 4+1.
The maximum speedup we can get from parallel processing is 5/3 = 1.66(6) times.
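The text does not show the commands used to raise the worker count. One way to do it, assuming the setting involved is max_parallel_workers_per_gather and that you have superuser rights, is sketched below. ALTER SYSTEM cannot be run inside a transaction block.

```sql
-- Persist the new limit and reload the configuration without a restart.
ALTER SYSTEM SET max_parallel_workers_per_gather = 4;
SELECT pg_reload_conf();

-- Alternatively, change it for the current session only.
SET max_parallel_workers_per_gather = 4;

-- Expected upper bound on the speedup when going from 2+1 to 4+1 processes.
SELECT round((4 + 1)::numeric / (2 + 1), 4) AS max_speedup;
```

The effective number of workers is still capped by max_worker_processes, which does require a restart to change.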
How does it work?
Processes
Query execution always starts in the leader process. The leader does everything that is not parallel, plus part of the parallel work. The other processes that execute the same query are called worker processes. Parallel processing uses the dynamic background worker infrastructure (available since version 9.4). Because the other parts of PostgreSQL use processes rather than threads, a query with 3 worker processes can be up to 4 times faster than traditional processing.
Interaction
Worker processes communicate with the leader through a message queue (based on shared memory). Each process has two queues: one for errors and one for tuples.
Each time the table is 3 times larger than min_parallel_(index|table)_scan_size, Postgres adds a worker process. The number of worker processes is not cost-based. A circular dependency would make a complex implementation difficult, so the planner uses simple rules instead.
In practice, these rules are not always suitable for production, so you can change the number of worker processes for a specific table: ALTER TABLE ... SET (parallel_workers = N).
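The "one more worker for every 3x" rule can be approximated directly in SQL. This is a rough sketch, not the planner's exact code: it assumes PostgreSQL 10 or later (where min_parallel_table_scan_size exists), a non-empty lineitem table that is at least as large as the threshold, and it ignores index scans. The value 4 in the ALTER TABLE statement is an arbitrary example.

```sql
-- Thresholds the planner uses when sizing the set of workers.
SHOW min_parallel_table_scan_size;
SHOW min_parallel_index_scan_size;

-- Rough estimate: one worker at the threshold, one more for each tripling,
-- capped by max_parallel_workers_per_gather.
SELECT pg_size_pretty(pg_relation_size('lineitem')) AS table_size,
       LEAST(
           1 + floor(log(3, pg_relation_size('lineitem')::numeric
                            / pg_size_bytes(current_setting('min_parallel_table_scan_size')))),
           current_setting('max_parallel_workers_per_gather')::int
       ) AS estimated_workers;

-- Override the rule for one table, then return to the default behaviour.
ALTER TABLE lineitem SET (parallel_workers = 4);
ALTER TABLE lineitem RESET (parallel_workers);
```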
Why is parallel processing not used?
Besides the long list of limitations, there are also cost checks:
parallel_setup_cost: avoids parallel processing of short queries. This parameter estimates the time needed to set up memory, start the processes, and exchange the initial data.
parallel_tuple_cost: communication between the leader and the workers can take a long time, in proportion to the number of tuples sent by the worker processes. This parameter estimates the cost of transferring data.
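To check whether these cost parameters are what keeps a query serial, one approach is to zero them for the session and compare the plans. This is a diagnostic sketch only, using the query from earlier in the text; it is not a recommended production setting.

```sql
-- Current values (the defaults are 1000 and 0.1).
SHOW parallel_setup_cost;
SHOW parallel_tuple_cost;

-- Make parallel plans look free for this session and inspect the plan.
SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;
EXPLAIN (COSTS OFF)
SELECT l_quantity AS sum_qty
FROM lineitem
WHERE l_shipdate <= date '1998-12-01' - interval '105' day;

-- Restore the configured values.
RESET parallel_setup_cost;
RESET parallel_tuple_cost;
```

If the plan becomes parallel only with both costs at zero, the query was rejected on cost rather than by one of the hard limitations.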
Nested loop joins
PostgreSQL 9.6+ can execute nested loops in parallel, because it is a simple operation.
explain (costs off) select c_custkey, count(o_orderkey)
from customer left outer join orders on
c_custkey = o_custkey and o_comment not like '%special%deposits%'
group by c_custkey;
QUERY PLAN
--------------------------------------------------------------------------------------
Finalize GroupAggregate
Group Key: customer.c_custkey
-> Gather Merge
Workers Planned: 4
-> Partial GroupAggregate
Group Key: customer.c_custkey
-> Nested Loop Left Join
-> Parallel Index Only Scan using customer_pkey on customer
-> Index Scan using idx_orders_custkey on orders
Index Cond: (customer.c_custkey = o_custkey)
Filter: ((o_comment)::text !~~ '%special%deposits%'::text)
Gather happens at the last stage, so Nested Loop Left Join is a parallel operation. Parallel Index Only Scan was introduced only in version 10. It works in a similar way to a parallel sequential scan. The condition c_custkey = o_custkey reads one order for each customer row, so it is not parallel.
Hash join
Until PostgreSQL 11, each worker process built its own hash table, and with more than four such processes performance did not improve. In the new version the hash table is shared. Each worker process can use WORK_MEM to build the hash table.
select
l_shipmode,
sum(case
when o_orderpriority = '1-URGENT'
or o_orderpriority = '2-HIGH'
then 1
else 0
end) as high_line_count,
sum(case
when o_orderpriority <> '1-URGENT'
and o_orderpriority <> '2-HIGH'
then 1
else 0
end) as low_line_count
from
orders,
lineitem
where
o_orderkey = l_orderkey
and l_shipmode in ('MAIL', 'AIR')
and l_commitdate < l_receiptdate
and l_shipdate < l_commitdate
and l_receiptdate >= date '1996-01-01'
and l_receiptdate < date '1996-01-01' + interval '1' year
group by
l_shipmode
order by
l_shipmode
LIMIT 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1964755.66..1964961.44 rows=1 width=27) (actual time=7579.592..7922.997 rows=1 loops=1)
-> Finalize GroupAggregate (cost=1964755.66..1966196.11 rows=7 width=27) (actual time=7579.590..7579.591 rows=1 loops=1)
Group Key: lineitem.l_shipmode
-> Gather Merge (cost=1964755.66..1966195.83 rows=28 width=27) (actual time=7559.593..7922.319 rows=6 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Partial GroupAggregate (cost=1963755.61..1965192.44 rows=7 width=27) (actual time=7548.103..7564.592 rows=2 loops=5)
Group Key: lineitem.l_shipmode
-> Sort (cost=1963755.61..1963935.20 rows=71838 width=27) (actual time=7530.280..7539.688 rows=62519 loops=5)
Sort Key: lineitem.l_shipmode
Sort Method: external merge Disk: 2304kB
Worker 0: Sort Method: external merge Disk: 2064kB
Worker 1: Sort Method: external merge Disk: 2384kB
Worker 2: Sort Method: external merge Disk: 2264kB
Worker 3: Sort Method: external merge Disk: 2336kB
-> Parallel Hash Join (cost=382571.01..1957960.99 rows=71838 width=27) (actual time=7036.917..7499.692 rows=62519 loops=5)
Hash Cond: (lineitem.l_orderkey = orders.o_orderkey)
-> Parallel Seq Scan on lineitem (cost=0.00..1552386.40 rows=71838 width=19) (actual time=0.583..4901.063 rows=62519 loops=5)
Filter: ((l_shipmode = ANY ('{MAIL,AIR}'::bpchar[])) AND (l_commitdate < l_receiptdate) AND (l_shipdate < l_commitdate) AND (l_receiptdate >= '1996-01-01'::date) AND (l_receiptdate < '1997-01-01 00:00:00'::timestamp without time zone))
Rows Removed by Filter: 11934691
-> Parallel Hash (cost=313722.45..313722.45 rows=3750045 width=20) (actual time=2011.518..2011.518 rows=3000000 loops=5)
Buckets: 65536 Batches: 256 Memory Usage: 3840kB
-> Parallel Seq Scan on orders (cost=0.00..313722.45 rows=3750045 width=20) (actual time=0.029..995.948 rows=3000000 loops=5)
Planning Time: 0.977 ms
Execution Time: 7923.770 ms
Query 12 from TPC-H clearly demonstrates a parallel hash join. Each worker process takes part in building a shared hash table.
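To see the difference between a shared and a per-worker hash table, the enable_parallel_hash setting (PostgreSQL 11 and later) can be toggled for the session. The sketch below assumes the TPC-H orders and lineitem tables and uses a simpler join than Query 12 to keep the plans short.

```sql
-- Shared hash table: expect "Parallel Hash Join" and "Parallel Hash" nodes.
SET enable_parallel_hash = on;
EXPLAIN (COSTS OFF)
SELECT count(*)
FROM orders JOIN lineitem ON o_orderkey = l_orderkey;

-- With the feature off, each worker builds its own copy of the hash table,
-- if the planner still chooses a hash join at all.
SET enable_parallel_hash = off;
EXPLAIN (COSTS OFF)
SELECT count(*)
FROM orders JOIN lineitem ON o_orderkey = l_orderkey;

RESET enable_parallel_hash;
```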
Merge join
A merge join is non-parallel by nature. Don't worry if it is the last step of the query: the rest of the query can still run in parallel.
-- Query 2 from TPC-H
explain (costs off) select s_acctbal, s_name, n_name, p_partkey, p_mfgr, s_address, s_phone, s_comment
from part, supplier, partsupp, nation, region
where
p_partkey = ps_partkey
and s_suppkey = ps_suppkey
and p_size = 36
and p_type like '%BRASS'
and s_nationkey = n_nationkey
and n_regionkey = r_regionkey
and r_name = 'AMERICA'
and ps_supplycost = (
select
min(ps_supplycost)
from partsupp, supplier, nation, region
where
p_partkey = ps_partkey
and s_suppkey = ps_suppkey
and s_nationkey = n_nationkey
and n_regionkey = r_regionkey
and r_name = 'AMERICA'
)
order by s_acctbal desc, n_name, s_name, p_partkey
LIMIT 100;
QUERY PLAN
----------------------------------------------------------------------------------------------------------
Limit
-> Sort
Sort Key: supplier.s_acctbal DESC, nation.n_name, supplier.s_name, part.p_partkey
-> Merge Join
Merge Cond: (part.p_partkey = partsupp.ps_partkey)
Join Filter: (partsupp.ps_supplycost = (SubPlan 1))
-> Gather Merge
Workers Planned: 4
-> Parallel Index Scan using part_pkey on part
Filter: (((p_type)::text ~~ '%BRASS'::text) AND (p_size = 36))
-> Materialize
-> Sort
Sort Key: partsupp.ps_partkey
-> Nested Loop
-> Nested Loop
Join Filter: (nation.n_regionkey = region.r_regionkey)
-> Seq Scan on region
Filter: (r_name = 'AMERICA'::bpchar)
-> Hash Join
Hash Cond: (supplier.s_nationkey = nation.n_nationkey)
-> Seq Scan on supplier
-> Hash
-> Seq Scan on nation
-> Index Scan using idx_partsupp_suppkey on partsupp
Index Cond: (ps_suppkey = supplier.s_suppkey)
SubPlan 1
-> Aggregate
-> Nested Loop
Join Filter: (nation_1.n_regionkey = region_1.r_regionkey)
-> Seq Scan on region region_1
Filter: (r_name = 'AMERICA'::bpchar)
-> Nested Loop
-> Nested Loop
-> Index Scan using idx_partsupp_partkey on partsupp partsupp_1
Index Cond: (part.p_partkey = ps_partkey)
-> Index Scan using supplier_pkey on supplier supplier_1
Index Cond: (s_suppkey = partsupp_1.ps_suppkey)
-> Index Scan using nation_pkey on nation nation_1
Index Cond: (n_nationkey = supplier_1.s_nationkey)
The "Merge Join" node is above "Gather Merge", so the merge itself does not use parallel processing. But the "Parallel Index Scan" node still helps with the part_pkey segment.
Partition-wise join
In PostgreSQL 11, partition-wise join is disabled by default because planning it is very expensive. Tables with matching partitioning can be joined partition by partition. This lets Postgres use smaller hash tables. Each partition join can run in parallel.
tpch=# set enable_partitionwise_join=t;
tpch=# explain (costs off) select * from prt1 t1, prt2 t2
where t1.a = t2.b and t1.b = 0 and t2.b between 0 and 10000;
QUERY PLAN
---------------------------------------------------
Append
-> Hash Join
Hash Cond: (t2.b = t1.a)
-> Seq Scan on prt2_p1 t2
Filter: ((b >= 0) AND (b <= 10000))
-> Hash
-> Seq Scan on prt1_p1 t1
Filter: (b = 0)
-> Hash Join
Hash Cond: (t2_1.b = t1_1.a)
-> Seq Scan on prt2_p2 t2_1
Filter: ((b >= 0) AND (b <= 10000))
-> Hash
-> Seq Scan on prt1_p2 t1_1
Filter: (b = 0)
tpch=# set parallel_setup_cost = 1;
tpch=# set parallel_tuple_cost = 0.01;
tpch=# explain (costs off) select * from prt1 t1, prt2 t2
where t1.a = t2.b and t1.b = 0 and t2.b between 0 and 10000;
QUERY PLAN
-----------------------------------------------------------
Gather
Workers Planned: 4
-> Parallel Append
-> Parallel Hash Join
Hash Cond: (t2_1.b = t1_1.a)
-> Parallel Seq Scan on prt2_p2 t2_1
Filter: ((b >= 0) AND (b <= 10000))
-> Parallel Hash
-> Parallel Seq Scan on prt1_p2 t1_1
Filter: (b = 0)
-> Parallel Hash Join
Hash Cond: (t2.b = t1.a)
-> Parallel Seq Scan on prt2_p1 t2
Filter: ((b >= 0) AND (b <= 10000))
-> Parallel Hash
-> Parallel Seq Scan on prt1_p1 t1
Filter: (b = 0)
Most importantly, a partition-wise join runs in parallel only if the partitions are large enough.
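The definitions of prt1 and prt2 are not shown in the text. A minimal sketch of a schema that would allow the join above to be planned partition by partition might look like this; the column list and the partition bounds are assumptions chosen for illustration. What matters is that both tables are partitioned on the join keys (prt1.a and prt2.b) with identical bounds.

```sql
-- Hypothetical schema: both tables use the same range bounds.
CREATE TABLE prt1 (a int, b int, c text) PARTITION BY RANGE (a);
CREATE TABLE prt1_p1 PARTITION OF prt1 FOR VALUES FROM (0) TO (5000);
CREATE TABLE prt1_p2 PARTITION OF prt1 FOR VALUES FROM (5000) TO (10001);

CREATE TABLE prt2 (a int, b int, c text) PARTITION BY RANGE (b);
CREATE TABLE prt2_p1 PARTITION OF prt2 FOR VALUES FROM (0) TO (5000);
CREATE TABLE prt2_p2 PARTITION OF prt2 FOR VALUES FROM (5000) TO (10001);

-- Refresh statistics after loading data so the planner sees partition sizes.
ANALYZE prt1;
ANALYZE prt2;
```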
Parallel Append
Parallel Append can be used instead of giving different blocks to different worker processes. This usually happens with UNION ALL queries. The drawback is less parallelism, because each worker process handles only one of the queries.
Here 2 worker processes are running, although 4 are enabled.
tpch=# explain (costs off) select sum(l_quantity) as sum_qty from lineitem where l_shipdate <= date '1998-12-01' - interval '105' day union all select sum(l_quantity) as sum_qty from lineitem where l_shipdate <= date '2000-12-01' - interval '105' day;
QUERY PLAN
------------------------------------------------------------------------------------------------
Gather
Workers Planned: 2
-> Parallel Append
-> Aggregate
-> Seq Scan on lineitem
Filter: (l_shipdate <= '2000-08-18 00:00:00'::timestamp without time zone)
-> Aggregate
-> Seq Scan on lineitem lineitem_1
Filter: (l_shipdate <= '1998-08-18 00:00:00'::timestamp without time zone)
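To compare this plan with one that does not use Parallel Append, the enable_parallel_append setting (PostgreSQL 11 and later) can be switched off for the session. This is a sketch using a shortened version of the same UNION ALL query; the plan you get depends on your data and cost settings.

```sql
-- Without Parallel Append the planner has to pick another strategy,
-- for example a plain Append over the two branches.
SET enable_parallel_append = off;
EXPLAIN (COSTS OFF)
SELECT sum(l_quantity) AS sum_qty FROM lineitem
WHERE l_shipdate <= date '1998-12-01' - interval '105' day
UNION ALL
SELECT sum(l_quantity) AS sum_qty FROM lineitem
WHERE l_shipdate <= date '2000-12-01' - interval '105' day;

RESET enable_parallel_append;
```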
Most important variables
WORK_MEM limits memory per process, not just per query: work_mem * processes * joins = a lot of memory.
max_parallel_workers_per_gather: how many worker processes the executor will use for parallel processing of a plan node.
max_worker_processes: adjusts the total number of worker processes to the number of CPU cores on the server.
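These settings can be inspected in one query. The sketch below also gives a back-of-the-envelope memory estimate; the figures of 4 workers plus a leader and 2 memory-hungry plan nodes are assumptions for illustration, not measurements.

```sql
-- Current values; max_parallel_workers appears only in PostgreSQL 10 and later.
SELECT name, setting, unit, context
FROM pg_settings
WHERE name IN ('work_mem',
               'max_parallel_workers_per_gather',
               'max_parallel_workers',
               'max_worker_processes');

-- Rough worst case for one query: work_mem * processes * sort/hash nodes.
SELECT pg_size_pretty(
           pg_size_bytes(current_setting('work_mem')) * (4 + 1) * 2
       ) AS rough_peak_memory;
```

The context column shows which settings need a restart (postmaster) and which can be changed per session (user).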
Since version 9.6, parallel processing can significantly improve the performance of complex queries that scan many rows or index entries. In PostgreSQL 10, parallel processing is enabled by default. Remember to disable it on servers with a heavy OLTP workload. Sequential scans and index scans consume a lot of resources. If you are not running a report over the whole dataset, you can improve query performance simply by adding missing indexes or by using proper partitioning.