Modern CPUs have many cores. For years, applications have been sending queries to databases in parallel. A reporting query that touches many table rows runs faster when it can use several CPUs, and PostgreSQL has been able to do this since version 9.6.
It took 3 years to implement the parallel query feature, because the code had to be rewritten at several stages of query execution. PostgreSQL 9.6 introduced the infrastructure for further improvements to that code. Later versions execute more types of queries in parallel.
Limitations
Do not enable parallel execution if all cores are already busy; otherwise other queries will slow down.
Most importantly, parallel processing with high WORK_MEM values uses a lot of memory: every hash join or sort can take up to work_mem of memory.
Low-latency OLTP queries cannot be sped up by parallel execution. And if a query returns a single row, parallel processing will only slow it down.
Developers like to use the TPC-H benchmark. Perhaps you have similar queries that are ideal for parallel execution.
Only SELECT queries without locking clauses (such as FOR UPDATE or FOR SHARE) are executed in parallel.
Sometimes proper indexing works better than a parallel sequential table scan.
Suspended queries and cursors are not supported.
Window functions and ordered-set aggregate functions are not parallel.
You gain nothing on an I/O-bound workload.
There are no parallel sort algorithms. However, queries with sorts can still run in parallel in some parts.
Replace a CTE (WITH ...) with a nested SELECT to enable parallel processing.
Foreign data wrappers do not support parallel processing yet (but they could!)
FULL OUTER JOIN is not supported.
Setting max_rows on the client disables parallel processing.
If a query uses a function that is not marked PARALLEL SAFE, it will run in a single process.
The PostgreSQL developers tried to reduce the response time of the TPC-H benchmark queries. Download the benchmark and adapt it to PostgreSQL. This is an unofficial use of the TPC-H benchmark; it is not meant for comparing databases or hardware.
Download TPC-H_Tools_v2.17.3.zip (or a newer version) from the official TPC site.
Rename makefile.suite to Makefile and modify it as described here: https://github.com/tvondra/pg_tpch . Compile the code with the make command.
Generate the data: ./dbgen -s 10 creates a 23 GB database. That is enough to see the performance difference between parallel and non-parallel queries.
Convert the tbl files to csv with for and sed.
Clone the pg_tpch repository and copy the csv files to pg_tpch/dss/data.
Generate the queries with the qgen command.
Load the data into the database with the ./tpch.sh command.
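The tbl-to-csv conversion step can also be done without for and sed. The sketch below is a rough Python equivalent. It assumes that dbgen ends every row of a .tbl file with an extra | delimiter, which is what the sed step is normally used to remove; the function names and the directory argument are illustrative and are not part of pg_tpch.

```python
from pathlib import Path


def strip_trailing_delimiter(line):
    """Drop the single trailing '|' that dbgen is assumed to append to each row."""
    body = line.rstrip("\n")
    if body.endswith("|"):
        body = body[:-1]
    return body + "\n"


def convert_tbl_to_csv(directory):
    """Write a .csv copy of every .tbl file in `directory`; return the new file names."""
    written = []
    for tbl in sorted(Path(directory).glob("*.tbl")):
        csv_path = tbl.with_suffix(".csv")
        with tbl.open() as src, csv_path.open("w") as dst:
            for line in src:
                dst.write(strip_trailing_delimiter(line))
        written.append(csv_path.name)
    return written
```

The fields stay separated by |, so the load step still has to be told about that delimiter.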
Parallel sequential scan
It may be faster not because of parallel reads, but because the data is spread across many CPU cores. Modern operating systems cache PostgreSQL data files well. With read-ahead, the storage layer can return a larger block than the PG daemon asked for. As a result, query performance is not limited by disk I/O. The query spends CPU cycles to:
read rows one by one from the table pages;
compare row values against the WHERE conditions.
Let's run a simple select query:
tpch=# explain analyze select l_quantity as sum_qty from lineitem where l_shipdate <= date '1998-12-01' - interval '105' day;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Seq Scan on lineitem (cost=0.00..1964772.00 rows=58856235 width=5) (actual time=0.014..16951.669 rows=58839715 loops=1)
Filter: (l_shipdate <= '1998-08-18 00:00:00'::timestamp without time zone)
Rows Removed by Filter: 1146337
Planning Time: 0.203 ms
Execution Time: 19035.100 ms
The sequential scan produces too many rows without aggregation, so the query is executed by a single CPU core.
If you add SUM(), you can see that two worker processes help speed up the query:
explain analyze select sum(l_quantity) as sum_qty from lineitem where l_shipdate <= date '1998-12-01' - interval '105' day;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=1589702.14..1589702.15 rows=1 width=32) (actual time=8553.365..8553.365 rows=1 loops=1)
-> Gather (cost=1589701.91..1589702.12 rows=2 width=32) (actual time=8553.241..8555.067 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=1588701.91..1588701.92 rows=1 width=32) (actual time=8547.546..8547.546 rows=1 loops=3)
-> Parallel Seq Scan on lineitem (cost=0.00..1527393.33 rows=24523431 width=5) (actual time=0.038..5998.417 rows=19613238 loops=3)
Filter: (l_shipdate <= '1998-08-18 00:00:00'::timestamp without time zone)
Rows Removed by Filter: 382112
Planning Time: 0.241 ms
Execution Time: 8555.131 ms
Parallel aggregation
The "Parallel Seq Scan" node produces rows for partial aggregation. The "Partial Aggregate" node reduces these rows with SUM(). At the end, the SUM counter from each worker process is collected by the "Gather" node.
The final result is computed by the "Finalize Aggregate" node. If you have your own aggregate functions, do not forget to mark them as "parallel safe".
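The data flow through Partial Aggregate, Gather and Finalize Aggregate can be modeled in a few lines. This is only an illustration: it runs sequentially in one Python process, the function names are made up for the example, and the round-robin split is an assumption standing in for the way a parallel scan hands out table blocks.

```python
def partial_aggregate(rows):
    """What each process does: reduce its own share of rows to one partial sum."""
    return sum(rows)


def finalize_aggregate(partial_sums):
    """What the leader does after Gather: combine the partial sums."""
    return sum(partial_sums)


def parallel_sum(rows, processes=3):
    """Split rows across `processes` (2 workers + leader by default) and sum them."""
    # Round-robin split stands in for the block-by-block hand-out of a parallel scan.
    shares = [rows[i::processes] for i in range(processes)]
    # The Gather step: one partial result per participating process.
    gathered = [partial_aggregate(share) for share in shares]
    return finalize_aggregate(gathered)
```

This only works because SUM can be combined from partial results, which is the property an aggregate has to provide before it can be used this way.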
Number of worker processes
The number of worker processes can be increased without restarting the server:
explain analyze select sum(l_quantity) as sum_qty from lineitem where l_shipdate <= date '1998-12-01' - interval '105' day;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=1589702.14..1589702.15 rows=1 width=32) (actual time=8553.365..8553.365 rows=1 loops=1)
-> Gather (cost=1589701.91..1589702.12 rows=2 width=32) (actual time=8553.241..8555.067 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=1588701.91..1588701.92 rows=1 width=32) (actual time=8547.546..8547.546 rows=1 loops=3)
-> Parallel Seq Scan on lineitem (cost=0.00..1527393.33 rows=24523431 width=5) (actual time=0.038..5998.417 rows=19613238 loops=3)
Filter: (l_shipdate <= '1998-08-18 00:00:00'::timestamp without time zone)
Rows Removed by Filter: 382112
Planning Time: 0.241 ms
Execution Time: 8555.131 ms
What is happening here? There are 2 times more worker processes, yet the query became only 1.6599 times faster. The arithmetic is interesting. We had 2 worker processes and 1 leader. After the change it became 4+1.
The maximum speedup we can get from parallel processing here is 5/3 = 1.66(6) times.
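The 5/3 figure is simply the ratio of participating processes. A minimal sketch of that arithmetic, assuming the leader scans its share of the table just like a worker does (the function name is hypothetical):

```python
def ideal_speedup(workers_before, workers_after, leaders=1):
    """Best-case speedup when the leader also scans: ratio of participating processes."""
    return (workers_after + leaders) / (workers_before + leaders)


# Going from 2 workers + leader to 4 workers + leader.
best_case = ideal_speedup(2, 4)
```

Measured against this ceiling, the observed factor of 1.6599 is almost perfect scaling.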
How does it work?
Processes
Query execution always starts in the leader process. The leader does all the non-parallel work plus its own share of the parallel processing. The other processes that execute the same query are called worker processes. Parallel processing uses the dynamic background worker infrastructure (available since version 9.4). Because the rest of PostgreSQL uses processes rather than threads, a query with 3 worker processes can be up to 4 times faster than traditional processing.
Communication
Worker processes communicate with the leader through a message queue (based on shared memory). Each process has 2 queues: one for errors and one for tuples.
Every time the table is 3 times larger than min_parallel_(index|table)_scan_size, Postgres adds a worker process. The number of worker processes is not cost-based. A circular dependency would make a more sophisticated implementation hard. Instead, the planner uses simple rules.
In practice these rules are not always suitable for production, so you can override the number of workers for a specific table: ALTER TABLE ... SET (parallel_workers = N).
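The "one more worker for every tripling" rule can be sketched as follows. This is an approximation of the planner's behavior, not its actual code: it assumes the default 8 kB page size, the default min_parallel_table_scan_size of 8 MB and the default max_parallel_workers_per_gather of 2, and the function name is made up for the example.

```python
BLOCK_SIZE = 8192  # default PostgreSQL page size in bytes


def planned_workers(table_bytes,
                    min_parallel_table_scan_size=8 * 1024 * 1024,
                    max_parallel_workers_per_gather=2,
                    parallel_workers=None):
    """Estimate the worker count the planner would pick for a parallel seq scan."""
    if parallel_workers is not None:
        # ALTER TABLE ... SET (parallel_workers = N) overrides the size rule,
        # but the per-Gather limit still applies.
        return min(parallel_workers, max_parallel_workers_per_gather)

    pages = table_bytes // BLOCK_SIZE
    threshold = max(min_parallel_table_scan_size // BLOCK_SIZE, 1)
    if pages < threshold:
        return 0  # too small: no parallel scan is considered

    workers = 1
    # One extra worker each time the table is 3x larger than the current threshold.
    while pages >= threshold * 3:
        workers += 1
        threshold *= 3
    return min(workers, max_parallel_workers_per_gather)
```

With these defaults a 1 GB table would qualify for 5 workers by size alone, but the per-Gather limit of 2 caps it, which is why raising that limit changes the plan.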
Why is parallel processing not used?
In addition to the long list of limitations, there are also cost checks:
parallel_setup_cost: avoids parallel processing of short queries. This parameter estimates the time needed to set up memory, start the processes and exchange the initial data.
parallel_tuple_cost: communication between the leader and the workers can take time in proportion to the number of tuples sent by the worker processes. This parameter models the cost of that data exchange.
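The effect of the two parameters can be checked against the SUM() plan shown earlier. The sketch below assumes the default values (parallel_setup_cost = 1000, parallel_tuple_cost = 0.1) and assumes that a Gather node adds the setup cost to its child's startup cost and charges the tuple cost once per row passed to the leader; the function name is illustrative.

```python
def gather_cost(child_startup, child_total, rows,
                parallel_setup_cost=1000.0, parallel_tuple_cost=0.1):
    """Return (startup, total) cost of a Gather node placed on top of a child plan."""
    startup = child_startup + parallel_setup_cost
    total = child_total + parallel_setup_cost + parallel_tuple_cost * rows
    return round(startup, 2), round(total, 2)


# Partial Aggregate in the plan: cost=1588701.91..1588701.92, Gather rows=2.
estimated = gather_cost(1588701.91, 1588701.92, rows=2)
```

The result matches the Gather line of that plan (cost=1589701.91..1589702.12). It also shows why a node that returns millions of rows to the leader is a poor candidate: the per-tuple charge quickly outweighs the savings.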
Nested loop joins
PostgreSQL 9.6+ can execute nested loops in parallel, because it is a simple operation.
explain (costs off) select c_custkey, count(o_orderkey)
from customer left outer join orders on
c_custkey = o_custkey and o_comment not like '%special%deposits%'
group by c_custkey;
QUERY PLAN
--------------------------------------------------------------------------------------
Finalize GroupAggregate
Group Key: customer.c_custkey
-> Gather Merge
Workers Planned: 4
-> Partial GroupAggregate
Group Key: customer.c_custkey
-> Nested Loop Left Join
-> Parallel Index Only Scan using customer_pkey on customer
-> Index Scan using idx_orders_custkey on orders
Index Cond: (customer.c_custkey = o_custkey)
Filter: ((o_comment)::text !~~ '%special%deposits%'::text)
The gather happens at the last stage, so Nested Loop Left Join is a parallel operation. Parallel Index Only Scan was introduced only in version 10. It works much like a parallel sequential scan. The condition c_custkey = o_custkey reads one order per customer row, so that part is not parallel.
Hash Join
Before PostgreSQL 11, each worker process builds its own hash table. With more than four such processes, performance does not improve. In the new version the hash table is shared. Each worker process can use WORK_MEM to build the hash table.
select
l_shipmode,
sum(case
when o_orderpriority = '1-URGENT'
or o_orderpriority = '2-HIGH'
then 1
else 0
end) as high_line_count,
sum(case
when o_orderpriority <> '1-URGENT'
and o_orderpriority <> '2-HIGH'
then 1
else 0
end) as low_line_count
from
orders,
lineitem
where
o_orderkey = l_orderkey
and l_shipmode in ('MAIL', 'AIR')
and l_commitdate < l_receiptdate
and l_shipdate < l_commitdate
and l_receiptdate >= date '1996-01-01'
and l_receiptdate < date '1996-01-01' + interval '1' year
group by
l_shipmode
order by
l_shipmode
LIMIT 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1964755.66..1964961.44 rows=1 width=27) (actual time=7579.592..7922.997 rows=1 loops=1)
-> Finalize GroupAggregate (cost=1964755.66..1966196.11 rows=7 width=27) (actual time=7579.590..7579.591 rows=1 loops=1)
Group Key: lineitem.l_shipmode
-> Gather Merge (cost=1964755.66..1966195.83 rows=28 width=27) (actual time=7559.593..7922.319 rows=6 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Partial GroupAggregate (cost=1963755.61..1965192.44 rows=7 width=27) (actual time=7548.103..7564.592 rows=2 loops=5)
Group Key: lineitem.l_shipmode
-> Sort (cost=1963755.61..1963935.20 rows=71838 width=27) (actual time=7530.280..7539.688 rows=62519 loops=5)
Sort Key: lineitem.l_shipmode
Sort Method: external merge Disk: 2304kB
Worker 0: Sort Method: external merge Disk: 2064kB
Worker 1: Sort Method: external merge Disk: 2384kB
Worker 2: Sort Method: external merge Disk: 2264kB
Worker 3: Sort Method: external merge Disk: 2336kB
-> Parallel Hash Join (cost=382571.01..1957960.99 rows=71838 width=27) (actual time=7036.917..7499.692 rows=62519 loops=5)
Hash Cond: (lineitem.l_orderkey = orders.o_orderkey)
-> Parallel Seq Scan on lineitem (cost=0.00..1552386.40 rows=71838 width=19) (actual time=0.583..4901.063 rows=62519 loops=5)
Filter: ((l_shipmode = ANY ('{MAIL,AIR}'::bpchar[])) AND (l_commitdate < l_receiptdate) AND (l_shipdate < l_commitdate) AND (l_receiptdate >= '1996-01-01'::date) AND (l_receiptdate < '1997-01-01 00:00:00'::timestamp without time zone))
Rows Removed by Filter: 11934691
-> Parallel Hash (cost=313722.45..313722.45 rows=3750045 width=20) (actual time=2011.518..2011.518 rows=3000000 loops=5)
Buckets: 65536 Batches: 256 Memory Usage: 3840kB
-> Parallel Seq Scan on orders (cost=0.00..313722.45 rows=3750045 width=20) (actual time=0.029..995.948 rows=3000000 loops=5)
Planning Time: 0.977 ms
Execution Time: 7923.770 ms
Query 12 from TPC-H clearly demonstrates a parallel hash join. Each worker process helps build the shared hash table.
Merge Join
A merge join is non-parallel by nature. Do not worry if it is the last step of the query: the rest of the query can still run in parallel.
-- Query 2 from TPC-H
explain (costs off) select s_acctbal, s_name, n_name, p_partkey, p_mfgr, s_address, s_phone, s_comment
from part, supplier, partsupp, nation, region
where
p_partkey = ps_partkey
and s_suppkey = ps_suppkey
and p_size = 36
and p_type like '%BRASS'
and s_nationkey = n_nationkey
and n_regionkey = r_regionkey
and r_name = 'AMERICA'
and ps_supplycost = (
select
min(ps_supplycost)
from partsupp, supplier, nation, region
where
p_partkey = ps_partkey
and s_suppkey = ps_suppkey
and s_nationkey = n_nationkey
and n_regionkey = r_regionkey
and r_name = 'AMERICA'
)
order by s_acctbal desc, n_name, s_name, p_partkey
LIMIT 100;
QUERY PLAN
----------------------------------------------------------------------------------------------------------
Limit
-> Sort
Sort Key: supplier.s_acctbal DESC, nation.n_name, supplier.s_name, part.p_partkey
-> Merge Join
Merge Cond: (part.p_partkey = partsupp.ps_partkey)
Join Filter: (partsupp.ps_supplycost = (SubPlan 1))
-> Gather Merge
Workers Planned: 4
-> Parallel Index Scan using part_pkey on part
Filter: (((p_type)::text ~~ '%BRASS'::text) AND (p_size = 36))
-> Materialize
-> Sort
Sort Key: partsupp.ps_partkey
-> Nested Loop
-> Nested Loop
Join Filter: (nation.n_regionkey = region.r_regionkey)
-> Seq Scan on region
Filter: (r_name = 'AMERICA'::bpchar)
-> Hash Join
Hash Cond: (supplier.s_nationkey = nation.n_nationkey)
-> Seq Scan on supplier
-> Hash
-> Seq Scan on nation
-> Index Scan using idx_partsupp_suppkey on partsupp
Index Cond: (ps_suppkey = supplier.s_suppkey)
SubPlan 1
-> Aggregate
-> Nested Loop
Join Filter: (nation_1.n_regionkey = region_1.r_regionkey)
-> Seq Scan on region region_1
Filter: (r_name = 'AMERICA'::bpchar)
-> Nested Loop
-> Nested Loop
-> Index Scan using idx_partsupp_partkey on partsupp partsupp_1
Index Cond: (part.p_partkey = ps_partkey)
-> Index Scan using supplier_pkey on supplier supplier_1
Index Cond: (s_suppkey = partsupp_1.ps_suppkey)
-> Index Scan using nation_pkey on nation nation_1
Index Cond: (n_nationkey = supplier_1.s_nationkey)
The "Merge Join" node sits above "Gather Merge", so the merge itself does not use parallel processing. But the "Parallel Index Scan" node still helps with the part_pkey segment.
Partition-wise join
In PostgreSQL 11, partition-wise join is disabled by default because its planning is very expensive. Tables with similar partitioning can be joined partition by partition. This lets Postgres use smaller hash tables. Each per-partition join can run in parallel.
tpch=# set enable_partitionwise_join=t;
tpch=# explain (costs off) select * from prt1 t1, prt2 t2
where t1.a = t2.b and t1.b = 0 and t2.b between 0 and 10000;
QUERY PLAN
---------------------------------------------------
Append
-> Hash Join
Hash Cond: (t2.b = t1.a)
-> Seq Scan on prt2_p1 t2
Filter: ((b >= 0) AND (b <= 10000))
-> Hash
-> Seq Scan on prt1_p1 t1
Filter: (b = 0)
-> Hash Join
Hash Cond: (t2_1.b = t1_1.a)
-> Seq Scan on prt2_p2 t2_1
Filter: ((b >= 0) AND (b <= 10000))
-> Hash
-> Seq Scan on prt1_p2 t1_1
Filter: (b = 0)
tpch=# set parallel_setup_cost = 1;
tpch=# set parallel_tuple_cost = 0.01;
tpch=# explain (costs off) select * from prt1 t1, prt2 t2
where t1.a = t2.b and t1.b = 0 and t2.b between 0 and 10000;
QUERY PLAN
-----------------------------------------------------------
Gather
Workers Planned: 4
-> Parallel Append
-> Parallel Hash Join
Hash Cond: (t2_1.b = t1_1.a)
-> Parallel Seq Scan on prt2_p2 t2_1
Filter: ((b >= 0) AND (b <= 10000))
-> Parallel Hash
-> Parallel Seq Scan on prt1_p2 t1_1
Filter: (b = 0)
-> Parallel Hash Join
Hash Cond: (t2.b = t1.a)
-> Parallel Seq Scan on prt2_p1 t2
Filter: ((b >= 0) AND (b <= 10000))
-> Parallel Hash
-> Parallel Seq Scan on prt1_p1 t1
Filter: (b = 0)
The main point is that a partition-wise join runs in parallel only if the partitions are large enough.
Parallel Append
Parallel Append can be used instead of handing different blocks to different worker processes. This usually happens with UNION ALL queries. The downside is less parallelism, because each worker process handles only 1 query.
Only 2 worker processes run here, even though 4 are enabled.
tpch=# explain (costs off) select sum(l_quantity) as sum_qty from lineitem where l_shipdate <= date '1998-12-01' - interval '105' day union all select sum(l_quantity) as sum_qty from lineitem where l_shipdate <= date '2000-12-01' - interval '105' day;
QUERY PLAN
------------------------------------------------------------------------------------------------
Gather
Workers Planned: 2
-> Parallel Append
-> Aggregate
-> Seq Scan on lineitem
Filter: (l_shipdate <= '2000-08-18 00:00:00'::timestamp without time zone)
-> Aggregate
-> Seq Scan on lineitem lineitem_1
Filter: (l_shipdate <= '1998-08-18 00:00:00'::timestamp without time zone)
The most important variables
WORK_MEM limits memory per process, not just per query: work_mem × processes × joins = a lot of memory.
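A back-of-the-envelope version of that multiplication, as a rough upper bound only. The function name and the sample values are made up for the example, and the sketch ignores settings such as hash_mem_multiplier in newer PostgreSQL versions, which can raise the limit for hash operations.

```python
def worst_case_memory_mb(work_mem_mb, processes, memory_nodes):
    """Upper bound: every process may use work_mem for every sort or hash node."""
    return work_mem_mb * processes * memory_nodes


# Hypothetical query: work_mem = 64 MB, 4 workers + leader, one hash join and one sort.
estimate_mb = worst_case_memory_mb(64, processes=5, memory_nodes=2)
```

A single query of this shape could therefore claim up to 640 MB, and several of them running at once multiply that again.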
Since version 9.6, parallel processing can significantly improve the performance of complex queries that scan many rows or index entries. In PostgreSQL 10, parallel processing is enabled by default. Remember to disable it on servers with a heavy OLTP workload. Sequential scans and index scans consume a lot of resources. If you are not running reports over the entire dataset, you can improve query performance simply by adding missing indexes or by using proper partitioning.