Modern CPUs have many cores. For years, applications have been sending queries to databases in parallel. When a query touches many rows of a table, it runs faster if it can use several CPUs, and PostgreSQL has been able to do this since version 9.6.
It took 3 years to implement parallel queries: the code had to be rewritten at several stages of query execution. PostgreSQL 9.6 introduced the infrastructure for further improvements to the code. In later versions, other kinds of queries are executed in parallel as well.
Limitations
Do not enable parallel execution if all cores are already busy; otherwise other queries will slow down.
Most importantly, parallel processing with high work_mem values uses a lot of memory: every hash join or sort can take up to work_mem of memory.
Low-latency OLTP queries cannot be made faster by parallel execution. And if a query returns a single row, parallel processing will only slow it down.
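To get a feel for the memory point above, here is a rough back-of-the-envelope sketch. The 64MB value and the count of 4 workers plus 1 leader are illustrative assumptions, not measurements from this article.

```sql
-- Session-level setting; 64MB is an arbitrary example value.
SET work_mem = '64MB';

-- Rough upper bound for ONE sort or hash node when the leader and
-- 4 workers each get their own work_mem allowance: 5 * 64MB = 320MB.
SELECT pg_size_pretty((4 + 1) * 64 * 1024 * 1024::bigint) AS per_node_budget;

-- Return to the configured value.
RESET work_mem;
```

A plan with several sort or hash nodes multiplies this again, which is why a work_mem value that is harmless for serial queries can become a problem once parallelism is enabled.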
A parallel sequential scan may be faster not because of parallel reads, but because the processing is spread across many CPU cores. Modern operating systems cache PostgreSQL data files well. With read-ahead, the OS can fetch more from storage than just the block the PG daemon requested. As a result, query performance is not limited by disk I/O. The work is CPU-bound instead: rows have to be read from the data pages one by one and compared against the WHERE condition. Consider a simple query:
tpch=# explain analyze select l_quantity as sum_qty from lineitem where l_shipdate <= date '1998-12-01' - interval '105' day;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Seq Scan on lineitem (cost=0.00..1964772.00 rows=58856235 width=5) (actual time=0.014..16951.669 rows=58839715 loops=1)
  Filter: (l_shipdate <= '1998-08-18 00:00:00'::timestamp without time zone)
  Rows Removed by Filter: 1146337
Planning Time: 0.203 ms
Execution Time: 19035.100 ms
The sequential scan produces a large number of rows with no aggregation, so the query is executed by a single CPU core.
If you add SUM(), you can see that two worker processes help speed up the query:
explain analyze select sum(l_quantity) as sum_qty from lineitem where l_shipdate <= date '1998-12-01' - interval '105' day;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=1589702.14..1589702.15 rows=1 width=32) (actual time=8553.365..8553.365 rows=1 loops=1)
  ->  Gather (cost=1589701.91..1589702.12 rows=2 width=32) (actual time=8553.241..8555.067 rows=3 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Partial Aggregate (cost=1588701.91..1588701.92 rows=1 width=32) (actual time=8547.546..8547.546 rows=1 loops=3)
              ->  Parallel Seq Scan on lineitem (cost=0.00..1527393.33 rows=24523431 width=5) (actual time=0.038..5998.417 rows=19613238 loops=3)
                    Filter: (l_shipdate <= '1998-08-18 00:00:00'::timestamp without time zone)
                    Rows Removed by Filter: 382112
Planning Time: 0.241 ms
Execution Time: 8555.131 ms
Parallel aggregation
The "Parallel Seq Scan" node produces rows for partial aggregation. The "Partial Aggregate" node reduces these rows using SUM(). Finally, the SUM counter from each worker process is collected by the "Gather" node.
The final result is computed by the "Finalize Aggregate" node. If you have your own aggregate functions, don't forget to mark them as "parallel safe".
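As a minimal sketch of what marking an aggregate "parallel safe" can look like: the aggregate name my_sum is hypothetical and used only for illustration, while numeric_add is the built-in addition function for numeric. Partial aggregation also needs a combine function so that the states produced by different workers can be merged.

```sql
-- my_sum is a hypothetical aggregate name used only for illustration.
CREATE AGGREGATE my_sum(numeric) (
    SFUNC       = numeric_add,  -- adds each input value to the running state
    STYPE       = numeric,      -- type of the running state
    COMBINEFUNC = numeric_add,  -- merges partial states coming from workers
    PARALLEL    = SAFE          -- the default for aggregates is UNSAFE
);

-- Check the marking: proparallel is 's' (safe), 'r' (restricted) or 'u' (unsafe).
SELECT proname, proparallel FROM pg_proc WHERE proname = 'my_sum';
```

Without the PARALLEL = SAFE marking, a query that uses the aggregate is planned without parallelism, however large the table is.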
Number of worker processes
The number of worker processes can be increased without restarting the server.
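One way to do this, as a sketch: it assumes the setting being raised is max_parallel_workers_per_gather and that 4 is the target value (the value is inferred from the "4 + 1" arithmetic discussed in this section). ALTER SYSTEM needs superuser rights; the session-level SET does not.

```sql
-- Persist the new value and reload the configuration; no restart required.
ALTER SYSTEM SET max_parallel_workers_per_gather = 4;
SELECT pg_reload_conf();

-- Alternative that affects only the current session:
-- SET max_parallel_workers_per_gather = 4;

-- Confirm the value that new queries in this session will see.
SHOW max_parallel_workers_per_gather;
```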
What happened here? There are twice as many worker processes, yet the query became only 1.6599 times faster. The arithmetic is interesting. We had 2 worker processes and 1 leader. After the change it became 4 + 1.
The maximum speedup we can get from the added parallelism is 5/3 = 1.66(6) times.
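The ratio is simply the number of participating processes after the change divided by the number before, with the leader counted as one of them. As a quick check of the arithmetic (a sketch that assumes the work splits perfectly evenly, which real queries only approximate):

```sql
-- (4 workers + 1 leader) / (2 workers + 1 leader) = 5/3
SELECT round((4 + 1)::numeric / (2 + 1), 4) AS ideal_speedup;
```

So a measured speedup of 1.6599 is very close to the ideal for this change, even though the worker count doubled.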
How does it work?
Processes
Query execution always starts in the leader process. The leader does everything that is not parallel, plus its own share of the parallel work. Other processes executing the same query are called worker processes. Parallel processing uses the dynamic background worker infrastructure (available since version 9.4). Because the rest of PostgreSQL uses processes rather than threads, a query with 3 worker processes can be up to 4 times faster than ordinary execution.
Communication
Worker processes communicate with the leader through a message queue (based on shared memory). Each process has 2 queues: one for errors and one for tuples.
How many worker processes are needed?
The tightest limit is set by the max_parallel_workers_per_gather parameter, which caps the workers for a single Gather node. The executor takes worker processes from a shared pool whose size is limited by max_parallel_workers. The outermost limit is max_worker_processes, that is, the total number of background processes.
If no worker process can be allocated, the query runs in a single process.
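The three limits can be inspected side by side. This is a small sketch using the standard pg_settings view; note that max_parallel_workers only exists from PostgreSQL 10 onward, so on 9.6 the query returns two rows.

```sql
-- Show the nested limits, from the server-wide pool down to a single Gather.
SELECT name, setting, context
FROM pg_settings
WHERE name IN ('max_worker_processes',
               'max_parallel_workers',
               'max_parallel_workers_per_gather')
ORDER BY name;
```

The context column shows which settings need a restart (postmaster) and which can be changed per session (user).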
Each time the table is 3 times larger than min_parallel_(index|table)_scan_size, Postgres adds a worker process. The number of workers is not cost-based. A circular dependency would make such an implementation complicated. Instead, the planner uses simple rules.
In practice these rules are not always suitable for production, so you can override the number of workers for a specific table: ALTER TABLE ... SET (parallel_workers = N).
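A sketch of the size rule and the per-table override, using the lineitem table from the examples. The 8MB threshold is the default value of min_parallel_table_scan_size; the worker counts in the comments follow from the tripling rule described above and are always capped by max_parallel_workers_per_gather. The value 4 in the override is an arbitrary example.

```sql
SET min_parallel_table_scan_size = '8MB';
-- With this threshold the tripling rule gives roughly:
--   table >=   8MB -> 1 worker
--   table >=  24MB -> 2 workers
--   table >=  72MB -> 3 workers
--   table >= 216MB -> 4 workers

-- See where a given table falls on that scale.
SELECT pg_size_pretty(pg_relation_size('lineitem')) AS lineitem_size;

-- Override the size-based rule for one table (4 is an example value)...
ALTER TABLE lineitem SET (parallel_workers = 4);

-- ...and go back to the planner's own rule.
ALTER TABLE lineitem RESET (parallel_workers);
```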
parallel_setup_cost: keeps short queries from being parallelized. This parameter estimates the time needed to set up memory, start the processes, and begin exchanging data.
parallel_tuple_cost: communication between the leader and the workers can take time in proportion to the number of tuples sent by the worker processes. This parameter models the cost of that data exchange.
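When a query unexpectedly stays serial, one way to find out whether these two costs are the reason is to zero them for the current session and look at the plan again. This is a diagnostic sketch, not a recommended production setting; the query is the SUM() example used earlier.

```sql
-- Current values (the defaults are 1000 and 0.1).
SHOW parallel_setup_cost;
SHOW parallel_tuple_cost;

-- Make parallelism "free" for this session only.
SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;

-- If a Gather node appears now, the costs were what kept the plan serial.
EXPLAIN (costs off)
SELECT sum(l_quantity) AS sum_qty
FROM lineitem
WHERE l_shipdate <= date '1998-12-01' - interval '105' day;

RESET parallel_setup_cost;
RESET parallel_tuple_cost;
```

If the plan is still serial with both costs at zero, the cause is one of the other restrictions, such as a function that is not parallel safe.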
Nested Loop Joins
PostgreSQL 9.6+ can execute nested loops in parallel, because it is a simple operation.
explain (costs off) select c_custkey, count(o_orderkey)
from customer left outer join orders on
c_custkey = o_custkey and o_comment not like '%special%deposits%'
group by c_custkey;
QUERY PLAN
--------------------------------------------------------------------------------------
Finalize GroupAggregate
  Group Key: customer.c_custkey
  ->  Gather Merge
        Workers Planned: 4
        ->  Partial GroupAggregate
              Group Key: customer.c_custkey
              ->  Nested Loop Left Join
                    ->  Parallel Index Only Scan using customer_pkey on customer
                    ->  Index Scan using idx_orders_custkey on orders
                          Index Cond: (customer.c_custkey = o_custkey)
                          Filter: ((o_comment)::text !~~ '%special%deposits%'::text)
The gather happens at the last stage, so Nested Loop Left Join is a parallel operation. Parallel Index Only Scan was introduced only in version 10. The c_custkey = o_custkey condition is evaluated once per customer row, looking up that customer's orders, so the inner index scan is not itself parallel.
Hash Join
Before PostgreSQL 11, each worker process built its own hash table, and with more than four such processes performance did not improve. In the new version the hash table is shared. Each worker process can use work_mem to build the hash table.
select
l_shipmode,
sum(case
when o_orderpriority = '1-URGENT'
or o_orderpriority = '2-HIGH'
then 1
else 0
end) as high_line_count,
sum(case
when o_orderpriority <> '1-URGENT'
and o_orderpriority <> '2-HIGH'
then 1
else 0
end) as low_line_count
from
orders,
lineitem
where
o_orderkey = l_orderkey
and l_shipmode in ('MAIL', 'AIR')
and l_commitdate < l_receiptdate
and l_shipdate < l_commitdate
and l_receiptdate >= date '1996-01-01'
and l_receiptdate < date '1996-01-01' + interval '1' year
group by
l_shipmode
order by
l_shipmode
LIMIT 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1964755.66..1964961.44 rows=1 width=27) (actual time=7579.592..7922.997 rows=1 loops=1)
  ->  Finalize GroupAggregate (cost=1964755.66..1966196.11 rows=7 width=27) (actual time=7579.590..7579.591 rows=1 loops=1)
        Group Key: lineitem.l_shipmode
        ->  Gather Merge (cost=1964755.66..1966195.83 rows=28 width=27) (actual time=7559.593..7922.319 rows=6 loops=1)
              Workers Planned: 4
              Workers Launched: 4
              ->  Partial GroupAggregate (cost=1963755.61..1965192.44 rows=7 width=27) (actual time=7548.103..7564.592 rows=2 loops=5)
                    Group Key: lineitem.l_shipmode
                    ->  Sort (cost=1963755.61..1963935.20 rows=71838 width=27) (actual time=7530.280..7539.688 rows=62519 loops=5)
                          Sort Key: lineitem.l_shipmode
                          Sort Method: external merge Disk: 2304kB
                          Worker 0: Sort Method: external merge Disk: 2064kB
                          Worker 1: Sort Method: external merge Disk: 2384kB
                          Worker 2: Sort Method: external merge Disk: 2264kB
                          Worker 3: Sort Method: external merge Disk: 2336kB
                          ->  Parallel Hash Join (cost=382571.01..1957960.99 rows=71838 width=27) (actual time=7036.917..7499.692 rows=62519 loops=5)
                                Hash Cond: (lineitem.l_orderkey = orders.o_orderkey)
                                ->  Parallel Seq Scan on lineitem (cost=0.00..1552386.40 rows=71838 width=19) (actual time=0.583..4901.063 rows=62519 loops=5)
                                      Filter: ((l_shipmode = ANY ('{MAIL,AIR}'::bpchar[])) AND (l_commitdate < l_receiptdate) AND (l_shipdate < l_commitdate) AND (l_receiptdate >= '1996-01-01'::date) AND (l_receiptdate < '1997-01-01 00:00:00'::timestamp without time zone))
                                      Rows Removed by Filter: 11934691
                                ->  Parallel Hash (cost=313722.45..313722.45 rows=3750045 width=20) (actual time=2011.518..2011.518 rows=3000000 loops=5)
                                      Buckets: 65536 Batches: 256 Memory Usage: 3840kB
                                      ->  Parallel Seq Scan on orders (cost=0.00..313722.45 rows=3750045 width=20) (actual time=0.029..995.948 rows=3000000 loops=5)
Planning Time: 0.977 ms
Execution Time: 7923.770 ms
Query 12 from TPC-H illustrates a parallel hash join well. Every worker process contributes to building the shared hash table.
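To see the difference between the shared hash table and the older per-worker behaviour on PostgreSQL 11 or later, the enable_parallel_hash planner setting can be toggled for one session. This is only a sketch: the simplified join below is an assumption standing in for Query 12, and the plan you actually get depends on table sizes and the other settings.

```sql
-- Without the shared hash table: expect a plain "Hash Join" under Gather,
-- with every worker building its own copy of the hash table.
SET enable_parallel_hash = off;
EXPLAIN (costs off)
SELECT count(*) FROM orders, lineitem WHERE o_orderkey = l_orderkey;

-- With it: expect "Parallel Hash Join" over a "Parallel Hash" node.
SET enable_parallel_hash = on;
EXPLAIN (costs off)
SELECT count(*) FROM orders, lineitem WHERE o_orderkey = l_orderkey;

RESET enable_parallel_hash;
```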
Merge Join
A merge join is non-parallel by nature. Don't worry if it is the last step of the query; the rest of the query can still run in parallel.
-- Query 2 from TPC-H
explain (costs off) select s_acctbal, s_name, n_name, p_partkey, p_mfgr, s_address, s_phone, s_comment
from part, supplier, partsupp, nation, region
where
p_partkey = ps_partkey
and s_suppkey = ps_suppkey
and p_size = 36
and p_type like '%BRASS'
and s_nationkey = n_nationkey
and n_regionkey = r_regionkey
and r_name = 'AMERICA'
and ps_supplycost = (
select
min(ps_supplycost)
from partsupp, supplier, nation, region
where
p_partkey = ps_partkey
and s_suppkey = ps_suppkey
and s_nationkey = n_nationkey
and n_regionkey = r_regionkey
and r_name = 'AMERICA'
)
order by s_acctbal desc, n_name, s_name, p_partkey
LIMIT 100;
QUERY PLAN
----------------------------------------------------------------------------------------------------------
Limit
  ->  Sort
        Sort Key: supplier.s_acctbal DESC, nation.n_name, supplier.s_name, part.p_partkey
        ->  Merge Join
              Merge Cond: (part.p_partkey = partsupp.ps_partkey)
              Join Filter: (partsupp.ps_supplycost = (SubPlan 1))
              ->  Gather Merge
                    Workers Planned: 4
                    ->  Parallel Index Scan using part_pkey on part
                          Filter: (((p_type)::text ~~ '%BRASS'::text) AND (p_size = 36))
              ->  Materialize
                    ->  Sort
                          Sort Key: partsupp.ps_partkey
                          ->  Nested Loop
                                ->  Nested Loop
                                      Join Filter: (nation.n_regionkey = region.r_regionkey)
                                      ->  Seq Scan on region
                                            Filter: (r_name = 'AMERICA'::bpchar)
                                      ->  Hash Join
                                            Hash Cond: (supplier.s_nationkey = nation.n_nationkey)
                                            ->  Seq Scan on supplier
                                            ->  Hash
                                                  ->  Seq Scan on nation
                                ->  Index Scan using idx_partsupp_suppkey on partsupp
                                      Index Cond: (ps_suppkey = supplier.s_suppkey)
              SubPlan 1
                ->  Aggregate
                      ->  Nested Loop
                            Join Filter: (nation_1.n_regionkey = region_1.r_regionkey)
                            ->  Seq Scan on region region_1
                                  Filter: (r_name = 'AMERICA'::bpchar)
                            ->  Nested Loop
                                  ->  Nested Loop
                                        ->  Index Scan using idx_partsupp_partkey on partsupp partsupp_1
                                              Index Cond: (part.p_partkey = ps_partkey)
                                        ->  Index Scan using supplier_pkey on supplier supplier_1
                                              Index Cond: (s_suppkey = partsupp_1.ps_suppkey)
                                  ->  Index Scan using nation_pkey on nation nation_1
                                        Index Cond: (n_nationkey = supplier_1.s_nationkey)
The "Merge Join" node sits above "Gather Merge", so the merge itself does not use parallel execution. But the "Parallel Index Scan" node still helps with the part_pkey segment.
Partition-wise join
In PostgreSQL 11, partition-wise join is disabled by default because it has a very high planning cost. Tables that are partitioned in the same way can be joined partition by partition. This lets Postgres use smaller hash tables. Each per-partition join can run in parallel.
tpch=# set enable_partitionwise_join=t;
tpch=# explain (costs off) select * from prt1 t1, prt2 t2
where t1.a = t2.b and t1.b = 0 and t2.b between 0 and 10000;
QUERY PLAN
---------------------------------------------------
Append
  ->  Hash Join
        Hash Cond: (t2.b = t1.a)
        ->  Seq Scan on prt2_p1 t2
              Filter: ((b >= 0) AND (b <= 10000))
        ->  Hash
              ->  Seq Scan on prt1_p1 t1
                    Filter: (b = 0)
  ->  Hash Join
        Hash Cond: (t2_1.b = t1_1.a)
        ->  Seq Scan on prt2_p2 t2_1
              Filter: ((b >= 0) AND (b <= 10000))
        ->  Hash
              ->  Seq Scan on prt1_p2 t1_1
                    Filter: (b = 0)
tpch=# set parallel_setup_cost = 1;
tpch=# set parallel_tuple_cost = 0.01;
tpch=# explain (costs off) select * from prt1 t1, prt2 t2
where t1.a = t2.b and t1.b = 0 and t2.b between 0 and 10000;
QUERY PLAN
-----------------------------------------------------------
Gather
  Workers Planned: 4
  ->  Parallel Append
        ->  Parallel Hash Join
              Hash Cond: (t2_1.b = t1_1.a)
              ->  Parallel Seq Scan on prt2_p2 t2_1
                    Filter: ((b >= 0) AND (b <= 10000))
              ->  Parallel Hash
                    ->  Parallel Seq Scan on prt1_p2 t1_1
                          Filter: (b = 0)
        ->  Parallel Hash Join
              Hash Cond: (t2.b = t1.a)
              ->  Parallel Seq Scan on prt2_p1 t2
                    Filter: ((b >= 0) AND (b <= 10000))
              ->  Parallel Hash
                    ->  Parallel Seq Scan on prt1_p1 t1
                          Filter: (b = 0)
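The article does not show how prt1 and prt2 are defined. For readers who want to reproduce plans of this shape, here is a hypothetical pair of definitions: the column types and partition bounds are assumptions made for illustration. What matters is that both tables are partitioned on their join column (a for prt1, b for prt2) with identical bounds, since that is what allows the join to be done partition by partition.

```sql
-- Hypothetical definitions; column types and bounds are assumptions.
CREATE TABLE prt1 (a int, b int) PARTITION BY RANGE (a);
CREATE TABLE prt1_p1 PARTITION OF prt1 FOR VALUES FROM (0) TO (5000);
CREATE TABLE prt1_p2 PARTITION OF prt1 FOR VALUES FROM (5000) TO (10001);

CREATE TABLE prt2 (a int, b int) PARTITION BY RANGE (b);
CREATE TABLE prt2_p1 PARTITION OF prt2 FOR VALUES FROM (0) TO (5000);
CREATE TABLE prt2_p2 PARTITION OF prt2 FOR VALUES FROM (5000) TO (10001);

-- Off by default in PostgreSQL 11; enable it for the current session.
SET enable_partitionwise_join = on;
```

With small test tables the planner will usually still prefer a serial plan, which is why the transcript above lowers parallel_setup_cost and parallel_tuple_cost before the second EXPLAIN.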
Parallel Append splits the work by subplan instead of handing different blocks to different workers. This usually shows up with UNION ALL queries. The drawback is less parallelism, because each worker process handles only one of the queries.
Only 2 worker processes are used here, even though 4 are enabled.
tpch=# explain (costs off) select sum(l_quantity) as sum_qty from lineitem where l_shipdate <= date '1998-12-01' - interval '105' day union all select sum(l_quantity) as sum_qty from lineitem where l_shipdate <= date '2000-12-01' - interval '105' day;
QUERY PLAN
------------------------------------------------------------------------------------------------
Gather
  Workers Planned: 2
  ->  Parallel Append
        ->  Aggregate
              ->  Seq Scan on lineitem
                    Filter: (l_shipdate <= '2000-08-18 00:00:00'::timestamp without time zone)
        ->  Aggregate
              ->  Seq Scan on lineitem lineitem_1
                    Filter: (l_shipdate <= '1998-08-18 00:00:00'::timestamp without time zone)
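To compare this with what the planner does without Parallel Append, the enable_parallel_append setting (available from PostgreSQL 11) can be switched off for one session. This is a sketch for experimentation; the resulting plan shape is not shown here because it depends on the data and on the other parallel settings.

```sql
-- Disable Parallel Append for this session and re-run the same UNION ALL.
SET enable_parallel_append = off;

EXPLAIN (costs off)
SELECT sum(l_quantity) AS sum_qty FROM lineitem
WHERE l_shipdate <= date '1998-12-01' - interval '105' day
UNION ALL
SELECT sum(l_quantity) AS sum_qty FROM lineitem
WHERE l_shipdate <= date '2000-12-01' - interval '105' day;

RESET enable_parallel_append;
```

If the alternative plan parallelizes each branch's scan separately, it may use more workers per branch, at the price of running the branches one after another.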