æè¿ã® CPU ã«ã¯å€ãã®ã³ã¢ãæèŒãããŠããŸãã é·å¹Žã«ããããã¢ããªã±ãŒã·ã§ã³ã¯ããŒã¿ããŒã¹ã«ã¯ãšãªã䞊è¡ããŠéä¿¡ããŠããŸããã ããŒãã«å
ã®è€æ°è¡ã«å¯Ÿããã¬ããŒã ã¯ãšãªã®å Žåãè€æ°ã® CPU ã䜿çšãããšããé«éã«å®è¡ãããŸããPostgreSQL ã¯ããŒãžã§ã³ 9.6 以éããããå®è¡ã§ããããã«ãªããŸããã
䞊åã¯ãšãªæ©èœãå®è£ ããã«ã¯ 3 幎ããããŸãããã¯ãšãªå®è¡ã®ããŸããŸãªæ®µéã§ã³ãŒããæžãçŽãå¿ èŠããããŸããã PostgreSQL 9.6 ã§ã¯ãã³ãŒããããã«æ¹åããããã®ã€ã³ãã©ã¹ãã©ã¯ãã£ãå°å ¥ãããŸããã åŸç¶ã®ããŒãžã§ã³ã§ã¯ãä»ã®ã¿ã€ãã®ã¯ãšãªã䞊è¡ããŠå®è¡ãããŸãã
å¶é
- ãã¹ãŠã®ã³ã¢ããã§ã«ããžãŒç¶æ ã®å Žåã¯ã䞊åå®è¡ãæå¹ã«ããªãã§ãã ãããæå¹ã«ããªããšãä»ã®ãªã¯ãšã¹ãã®é床ãäœäžããŸãã
- æãéèŠãªããšã¯ãé«ã WORK_MEM å€ã§ã®äžŠååŠçã¯å€§éã®ã¡ã¢ãªã䜿çšããããšã§ããåããã·ã¥çµåãŸãã¯ãœãŒã㯠work_mem ã¡ã¢ãªãæ¶è²»ããŸãã
- äœé 延㮠OLTP ã¯ãšãªã¯ã䞊åå®è¡ã«ãã£ãŠé«éåã§ããŸããã ãŸããã¯ãšãªã XNUMX è¡ãè¿ãå Žåã䞊ååŠçã§ã¯é床ãäœäžããã ãã§ãã
- éçºè 㯠TPC-H ãã³ãããŒã¯ã奜ãã§äœ¿çšããŸãã ãããããããå®å šãªäžŠåå®è¡ãæ±ããåæ§ã®ã¯ãšãªããããããããŸããã
- è¿°èªããã¯ã®ãªã SELECT ã¯ãšãªã®ã¿ã䞊åå®è¡ãããŸãã
- å Žåã«ãã£ãŠã¯ãé©åãªã€ã³ããã¯ã¹äœæãã䞊åã¢ãŒãã§ã®é 次ããŒãã« ã¹ãã£ã³ãããåªããŠããããšããããŸãã
- ã¯ãšãªãšã«ãŒãœã«ã®äžæåæ¢ã¯ãµããŒããããŠããŸããã
- ãŠã£ã³ããŠé¢æ°ãšé åºéåéåé¢æ°ã¯äžŠåã§ã¯ãããŸããã
- I/O ã¯ãŒã¯ããŒãã§ã¯äœãåŸãããŸããã
- 䞊å䞊ã¹æ¿ãã¢ã«ãŽãªãºã ã¯ãããŸããã ãã ãããœãŒãã䜿çšããã¯ãšãªã¯ãããã€ãã®ç¹ã§äžŠè¡ããŠå®è¡ã§ããŸãã
- CTE (WITH ...) ããã¹ãããã SELECT ã«çœ®ãæããŠã䞊ååŠçãæå¹ã«ããŸãã
- ãµãŒãããŒãã£ã®ããŒã¿ ã©ãããŒã¯ãŸã 䞊ååŠçããµããŒãããŠããŸãã (ãã ãã䞊ååŠçã¯å¯èœã§ã)ã
- FULL OUTER JOIN ã¯ãµããŒããããŠããŸããã
- max_rows ã¯äžŠååŠçãç¡å¹ã«ããŸãã
- ã¯ãšãªã« PARALLEL SAFE ãšããŒã¯ãããŠããªãé¢æ°ãããå Žåããã®é¢æ°ã¯ã·ã³ã°ã«ã¹ã¬ããã«ãªããŸãã
- SERIALIZABLE ãã©ã³ã¶ã¯ã·ã§ã³åé¢ã¬ãã«ã§ã¯ã䞊ååŠçãç¡å¹ã«ãªããŸãã
ãã¹ãç°å¢
PostgreSQL éçºè
ã¯ãTPC-H ãã³ãããŒã¯ ã¯ãšãªã®å¿çæéãççž®ããããšããŸããã ãã³ãããŒã¯ãããŠã³ããŒãããŠã
- TPC-H_Tools_v2.17.3.zip (ãŸãã¯ãã以éã®ããŒãžã§ã³) ãããŠã³ããŒãããŸãã
TPC ãªããµã€ããã . - makefile.suite ã®ååã Makefile ã«å€æŽããããã§èª¬æãããŠããããã«å€æŽããŸãã
https://github.com/tvondra/pg_tpch ã make ã³ãã³ãã䜿çšããŠã³ãŒããã³ã³ãã€ã«ããŸãã - ããŒã¿ãçæããŸãã
./dbgen -s 10
23 GB ã®ããŒã¿ããŒã¹ãäœæããŸãã ããã¯ã䞊åã¯ãšãªãšé䞊åã¯ãšãªã®ããã©ãŒãã³ã¹ã®éãã確èªããã«ã¯ååã§ãã - ãã¡ã€ã«ãå€æãã
tbl
вcsv Ñ for
Оsed
. - ãªããžããªã®ã¯ããŒã³ãäœæãã
pg_tpch
ãã¡ã€ã«ãã³ããŒããŸãcsv
вpg_tpch/dss/data
. - ã³ãã³ãã§ã¯ãšãªãäœæãã
qgen
. - ã³ãã³ãã䜿çšããŠããŒã¿ããŒã¹ã«ããŒã¿ãããŒãããŸã
./tpch.sh
.
䞊åé 次ã¹ãã£ã³
䞊åèªã¿åãã®ããã§ã¯ãªããããŒã¿ãå€ãã® CPU ã³ã¢ã«åæ£ãããŠãããããé«éã«ãªãå¯èœæ§ããããŸãã ææ°ã®ãªãã¬ãŒãã£ã³ã° ã·ã¹ãã ã§ã¯ãPostgreSQL ããŒã¿ ãã¡ã€ã«ã¯é©åã«ãã£ãã·ã¥ãããŸãã å èªã¿ã䜿çšãããšãPG ããŒã¢ã³ãèŠæ±ããããã倧ããªãããã¯ãã¹ãã¬ãŒãžããååŸã§ããŸãã ãããã£ãŠãã¯ãšãªã®ããã©ãŒãã³ã¹ã¯ãã£ã¹ã¯ I/O ã«ãã£ãŠå¶éãããŸããã 次ã®ç®ç㧠CPU ãµã€ã¯ã«ãæ¶è²»ããŸãã
- ããŒãã«ããŒãžããäžåºŠã« XNUMX è¡ãã€èªã¿åããŸãã
- æååã®å€ãšæ¡ä»¶ãæ¯èŒãã
WHERE
.
ç°¡åãªã¯ãšãªãå®è¡ããŠã¿ãŸããã select
:
tpch=# explain analyze select l_quantity as sum_qty from lineitem where l_shipdate <= date '1998-12-01' - interval '105' day;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Seq Scan on lineitem (cost=0.00..1964772.00 rows=58856235 width=5) (actual time=0.014..16951.669 rows=58839715 loops=1)
Filter: (l_shipdate <= '1998-08-18 00:00:00'::timestamp without time zone)
Rows Removed by Filter: 1146337
Planning Time: 0.203 ms
Execution Time: 19035.100 ms
ã·ãŒã±ã³ã·ã£ã« ã¹ãã£ã³ã§ã¯ãéèšãªãã§çæãããè¡ãå€ããããããã¯ãšãªã¯åäžã® CPU ã³ã¢ã«ãã£ãŠå®è¡ãããŸãã
è¿œå ãããš SUM()
, XNUMX ã€ã®ã¯ãŒã¯ãããŒãã¯ãšãªã®é«éåã«åœ¹ç«ã€ããšãããããŸãã
explain analyze select sum(l_quantity) as sum_qty from lineitem where l_shipdate <= date '1998-12-01' - interval '105' day;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=1589702.14..1589702.15 rows=1 width=32) (actual time=8553.365..8553.365 rows=1 loops=1)
-> Gather (cost=1589701.91..1589702.12 rows=2 width=32) (actual time=8553.241..8555.067 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=1588701.91..1588701.92 rows=1 width=32) (actual time=8547.546..8547.546 rows=1 loops=3)
-> Parallel Seq Scan on lineitem (cost=0.00..1527393.33 rows=24523431 width=5) (actual time=0.038..5998.417 rows=19613238 loops=3)
Filter: (l_shipdate <= '1998-08-18 00:00:00'::timestamp without time zone)
Rows Removed by Filter: 382112
Planning Time: 0.241 ms
Execution Time: 8555.131 ms
䞊åéèš
Parallel Seq Scan ããŒãã¯ãéšåéèšçšã®è¡ãçæããŸãã ãéšåéçŽãããŒãã¯ã次ã䜿çšããŠãããã®è¡ãããªãã³ã°ããŸãã SUM()
ã æåŸã«ãåã¯ãŒã«ãŒ ããã»ã¹ããã® SUM ã«ãŠã³ã¿ãŒããGatherãããŒãã«ãã£ãŠåéãããŸãã
æçµçµæã¯ããFinalize AggregateãããŒãã«ãã£ãŠèšç®ãããŸãã ç¬èªã®éèšé¢æ°ãããå Žåã¯ãããããã䞊åå®å šããšããŠããŒã¯ããããšãå¿ããªãã§ãã ããã
ã¯ãŒã«ãŒããã»ã¹ã®æ°
ã¯ãŒã«ãŒ ããã»ã¹ã®æ°ã¯ããµãŒããŒãåèµ·åããã«å¢ããããšãã§ããŸãã
explain analyze select sum(l_quantity) as sum_qty from lineitem where l_shipdate <= date '1998-12-01' - interval '105' day;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=1589702.14..1589702.15 rows=1 width=32) (actual time=8553.365..8553.365 rows=1 loops=1)
-> Gather (cost=1589701.91..1589702.12 rows=2 width=32) (actual time=8553.241..8555.067 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=1588701.91..1588701.92 rows=1 width=32) (actual time=8547.546..8547.546 rows=1 loops=3)
-> Parallel Seq Scan on lineitem (cost=0.00..1527393.33 rows=24523431 width=5) (actual time=0.038..5998.417 rows=19613238 loops=3)
Filter: (l_shipdate <= '1998-08-18 00:00:00'::timestamp without time zone)
Rows Removed by Filter: 382112
Planning Time: 0.241 ms
Execution Time: 8555.131 ms
äœãèµ·ããŠãïŒ äœæ¥ããã»ã¹ã¯ 2 åã«ãªããŸãããããªã¯ãšã¹ãã®é床㯠1,6599 åã«ãããŸããã§ããã èšç®ã¯é¢çœãã§ããã 2 ã€ã®ã¯ãŒã«ãŒ ããã»ã¹ãš 1 ã€ã®ãªãŒããŒããããŸããã å€æŽåŸã¯4+1ã«ãªããŸããã
䞊ååŠçã«ããæ倧é床åäž: 5/3 = 1,66(6) åã
ããã¯ã©ã®ããã«åäœããŸããïŒ
ÐÑПÑеÑÑÑ
ãªã¯ãšã¹ãã®å®è¡ã¯åžžã«å
é ã®ããã»ã¹ããéå§ãããŸãã ãªãŒããŒã¯ãã¹ãŠã®é䞊ååŠçãšäžéšã®äžŠååŠçãå®è¡ããŸãã åããªã¯ãšã¹ããå®è¡ããä»ã®ããã»ã¹ã¯ã¯ãŒã«ãŒ ããã»ã¹ãšåŒã°ããŸãã ã€ã³ãã©ã¹ãã©ã¯ãã£ã䜿çšãã䞊ååŠç
çžäºäœçš
ã¯ãŒã«ãŒ ããã»ã¹ã¯ãã¡ãã»ãŒãž ãã¥ãŒ (å ±æã¡ã¢ãªã«åºã¥ã) ãéããŠãªãŒããŒãšéä¿¡ããŸãã åããã»ã¹ã«ã¯ããšã©ãŒçšãšã¿ãã«çšã® 2 ã€ã®ãã¥ãŒããããŸãã
ã¯ãŒã¯ãããŒã¯ããã€å¿ èŠã§ãã?
æå°å¶éã¯ãã©ã¡ãŒã¿ã§æå®ãããŸã max_parallel_workers_per_gather
max_parallel_workers size
max_worker_processes
ã¯ãŒã«ãŒããã»ã¹ãå²ãåœãŠãããªãã£ãå Žåã¯ãã·ã³ã°ã«ããã»ã¹ã§ã®åŠçãšãªããŸãã
ã¯ãšãª ãã©ã³ããŒã¯ãããŒãã«ãŸãã¯ã€ã³ããã¯ã¹ã®ãµã€ãºã«å¿ããŠã¯ãŒã¯ãããŒãåæžã§ããŸãã ããã«ã¯ãã©ã¡ãŒã¿ããããŸã min_parallel_table_scan_size
min_parallel_index_scan_size
set min_parallel_table_scan_size='8MB'
8MB table => 1 worker
24MB table => 2 workers
72MB table => 3 workers
x => log(x / min_parallel_table_scan_size) / log(3) + 1 worker
ããŒãã«ã 3 åã«ãªããã³ã«ã min_parallel_(index|table)_scan_size
, Postgresã¯ã¯ãŒã«ãŒããã»ã¹ãè¿œå ããŸãã ã¯ãŒã¯ãããŒã®æ°ã¯ã³ã¹ãã«åºã¥ããŠããŸããã 埪ç°äŸåé¢ä¿ã«ãããè€éãªå®è£
ãå°é£ã«ãªããŸãã 代ããã«ããã©ã³ããŒã¯åçŽãªã«ãŒã«ã䜿çšããŸãã
å®éã«ã¯ããããã®ã«ãŒã«ãåžžã«éçšç°å¢ã«é©ããŠããããã§ã¯ãªããããç¹å®ã®ããŒãã«ã®ã¯ãŒã«ãŒ ããã»ã¹ã®æ°ãå€æŽã§ããŸã: ALTER TABLE ... SET (parallel_workers = N
).
ãªã䞊ååŠçã䜿ãããªãã®ã§ãããã?
å¶éã®é·ããªã¹ãã«å ããŠãã³ã¹ãã®ãã§ãã¯ããããŸãã
parallel_setup_cost
parallel_tuple_cost
ãã¹ããããã«ãŒãçµå
PostgreSQL 9.6+ ÐŒÐŸÐ¶ÐµÑ Ð²ÑпПлМÑÑÑ Ð²Ð»ÐŸÐ¶ÐµÐœÐœÑе ÑÐžÐºÐ»Ñ Ð¿Ð°ÑаллелÑМП â ÑÑП пÑПÑÑÐ°Ñ ÐŸÐ¿ÐµÑаÑОÑ.
explain (costs off) select c_custkey, count(o_orderkey)
from customer left outer join orders on
c_custkey = o_custkey and o_comment not like '%special%deposits%'
group by c_custkey;
QUERY PLAN
--------------------------------------------------------------------------------------
Finalize GroupAggregate
Group Key: customer.c_custkey
-> Gather Merge
Workers Planned: 4
-> Partial GroupAggregate
Group Key: customer.c_custkey
-> Nested Loop Left Join
-> Parallel Index Only Scan using customer_pkey on customer
-> Index Scan using idx_orders_custkey on orders
Index Cond: (customer.c_custkey = o_custkey)
Filter: ((o_comment)::text !~~ '%special%deposits%'::text)
ã³ã¬ã¯ã·ã§ã³ã¯æåŸã®æ®µéã§çºçããããããã¹ããããã«ãŒãå·Šçµåã¯äžŠåæäœã§ãã 䞊åã€ã³ããã¯ã¹ã®ã¿ã®ã¹ãã£ã³ã¯ããŒãžã§ã³ 10 ã§ã®ã¿å°å
¥ãããŸããã䞊åã·ãªã¢ã« ã¹ãã£ã³ãšåæ§ã«æ©èœããŸãã ç¶æ
c_custkey = o_custkey
ã¯ã©ã€ã¢ã³ãæååããšã« XNUMX ã€ã®æ³šæãèªã¿åããŸãã ãããã£ãŠãå¹³è¡ã§ã¯ãããŸããã
ããã·ã¥çµå
PostgreSQL 11 ãŸã§ã¯ãåã¯ãŒã«ãŒ ããã»ã¹ãç¬èªã®ããã·ã¥ ããŒãã«ãäœæããŸãããŸãããããã®ããã»ã¹ã XNUMX ã€ä»¥äžããå Žåãããã©ãŒãã³ã¹ã¯åäžããŸããã æ°ããŒãžã§ã³ã§ã¯ããã·ã¥ããŒãã«ãå ±æãããããã«ãªããŸããã åã¯ãŒã«ãŒ ããã»ã¹ã¯ WORK_MEM ã䜿çšããŠããã·ã¥ ããŒãã«ãäœæã§ããŸãã
select
l_shipmode,
sum(case
when o_orderpriority = '1-URGENT'
or o_orderpriority = '2-HIGH'
then 1
else 0
end) as high_line_count,
sum(case
when o_orderpriority <> '1-URGENT'
and o_orderpriority <> '2-HIGH'
then 1
else 0
end) as low_line_count
from
orders,
lineitem
where
o_orderkey = l_orderkey
and l_shipmode in ('MAIL', 'AIR')
and l_commitdate < l_receiptdate
and l_shipdate < l_commitdate
and l_receiptdate >= date '1996-01-01'
and l_receiptdate < date '1996-01-01' + interval '1' year
group by
l_shipmode
order by
l_shipmode
LIMIT 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1964755.66..1964961.44 rows=1 width=27) (actual time=7579.592..7922.997 rows=1 loops=1)
-> Finalize GroupAggregate (cost=1964755.66..1966196.11 rows=7 width=27) (actual time=7579.590..7579.591 rows=1 loops=1)
Group Key: lineitem.l_shipmode
-> Gather Merge (cost=1964755.66..1966195.83 rows=28 width=27) (actual time=7559.593..7922.319 rows=6 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Partial GroupAggregate (cost=1963755.61..1965192.44 rows=7 width=27) (actual time=7548.103..7564.592 rows=2 loops=5)
Group Key: lineitem.l_shipmode
-> Sort (cost=1963755.61..1963935.20 rows=71838 width=27) (actual time=7530.280..7539.688 rows=62519 loops=5)
Sort Key: lineitem.l_shipmode
Sort Method: external merge Disk: 2304kB
Worker 0: Sort Method: external merge Disk: 2064kB
Worker 1: Sort Method: external merge Disk: 2384kB
Worker 2: Sort Method: external merge Disk: 2264kB
Worker 3: Sort Method: external merge Disk: 2336kB
-> Parallel Hash Join (cost=382571.01..1957960.99 rows=71838 width=27) (actual time=7036.917..7499.692 rows=62519 loops=5)
Hash Cond: (lineitem.l_orderkey = orders.o_orderkey)
-> Parallel Seq Scan on lineitem (cost=0.00..1552386.40 rows=71838 width=19) (actual time=0.583..4901.063 rows=62519 loops=5)
Filter: ((l_shipmode = ANY ('{MAIL,AIR}'::bpchar[])) AND (l_commitdate < l_receiptdate) AND (l_shipdate < l_commitdate) AND (l_receiptdate >= '1996-01-01'::date) AND (l_receiptdate < '1997-01-01 00:00:00'::timestamp without time zone))
Rows Removed by Filter: 11934691
-> Parallel Hash (cost=313722.45..313722.45 rows=3750045 width=20) (actual time=2011.518..2011.518 rows=3000000 loops=5)
Buckets: 65536 Batches: 256 Memory Usage: 3840kB
-> Parallel Seq Scan on orders (cost=0.00..313722.45 rows=3750045 width=20) (actual time=0.029..995.948 rows=3000000 loops=5)
Planning Time: 0.977 ms
Execution Time: 7923.770 ms
TPC-H ã®ã¯ãšãª 12 ã¯ã䞊åããã·ã¥æ¥ç¶ãæ確ã«ç€ºããŠããŸãã åã¯ãŒã«ãŒ ããã»ã¹ã¯ãå ±éã®ããã·ã¥ ããŒãã«ã®äœæã«è²¢ç®ããŸãã
ããŒãžçµå
ããŒãžçµåã¯æ¬è³ªçã«é䞊åã§ãã ãããã¯ãšãªã®æåŸã®ã¹ãããã§ãã£ãŠãå¿é ããå¿ èŠã¯ãããŸãããããã§ã䞊è¡ããŠå®è¡ã§ããŸãã
-- Query 2 from TPC-H
explain (costs off) select s_acctbal, s_name, n_name, p_partkey, p_mfgr, s_address, s_phone, s_comment
from part, supplier, partsupp, nation, region
where
p_partkey = ps_partkey
and s_suppkey = ps_suppkey
and p_size = 36
and p_type like '%BRASS'
and s_nationkey = n_nationkey
and n_regionkey = r_regionkey
and r_name = 'AMERICA'
and ps_supplycost = (
select
min(ps_supplycost)
from partsupp, supplier, nation, region
where
p_partkey = ps_partkey
and s_suppkey = ps_suppkey
and s_nationkey = n_nationkey
and n_regionkey = r_regionkey
and r_name = 'AMERICA'
)
order by s_acctbal desc, n_name, s_name, p_partkey
LIMIT 100;
QUERY PLAN
----------------------------------------------------------------------------------------------------------
Limit
-> Sort
Sort Key: supplier.s_acctbal DESC, nation.n_name, supplier.s_name, part.p_partkey
-> Merge Join
Merge Cond: (part.p_partkey = partsupp.ps_partkey)
Join Filter: (partsupp.ps_supplycost = (SubPlan 1))
-> Gather Merge
Workers Planned: 4
-> Parallel Index Scan using <strong>part_pkey</strong> on part
Filter: (((p_type)::text ~~ '%BRASS'::text) AND (p_size = 36))
-> Materialize
-> Sort
Sort Key: partsupp.ps_partkey
-> Nested Loop
-> Nested Loop
Join Filter: (nation.n_regionkey = region.r_regionkey)
-> Seq Scan on region
Filter: (r_name = 'AMERICA'::bpchar)
-> Hash Join
Hash Cond: (supplier.s_nationkey = nation.n_nationkey)
-> Seq Scan on supplier
-> Hash
-> Seq Scan on nation
-> Index Scan using idx_partsupp_suppkey on partsupp
Index Cond: (ps_suppkey = supplier.s_suppkey)
SubPlan 1
-> Aggregate
-> Nested Loop
Join Filter: (nation_1.n_regionkey = region_1.r_regionkey)
-> Seq Scan on region region_1
Filter: (r_name = 'AMERICA'::bpchar)
-> Nested Loop
-> Nested Loop
-> Index Scan using idx_partsupp_partkey on partsupp partsupp_1
Index Cond: (part.p_partkey = ps_partkey)
-> Index Scan using supplier_pkey on supplier supplier_1
Index Cond: (s_suppkey = partsupp_1.ps_suppkey)
-> Index Scan using nation_pkey on nation nation_1
Index Cond: (n_nationkey = supplier_1.s_nationkey)
ãMerge JoinãããŒãã¯ãGather Mergeãã®äžã«ãããŸãã ãããã£ãŠãããŒãžã§ã¯äžŠååŠçã¯äœ¿çšãããŸããã ãã ãããParallel Index ScanãããŒãã¯äŸç¶ãšããŠã»ã°ã¡ã³ãã«åœ¹ç«ã¡ãŸãã part_pkey
.
ã»ã¯ã·ã§ã³ããšã®æ¥ç¶
PostgreSQL 11 ã§ã¯
tpch=# set enable_partitionwise_join=t;
tpch=# explain (costs off) select * from prt1 t1, prt2 t2
where t1.a = t2.b and t1.b = 0 and t2.b between 0 and 10000;
QUERY PLAN
---------------------------------------------------
Append
-> Hash Join
Hash Cond: (t2.b = t1.a)
-> Seq Scan on prt2_p1 t2
Filter: ((b >= 0) AND (b <= 10000))
-> Hash
-> Seq Scan on prt1_p1 t1
Filter: (b = 0)
-> Hash Join
Hash Cond: (t2_1.b = t1_1.a)
-> Seq Scan on prt2_p2 t2_1
Filter: ((b >= 0) AND (b <= 10000))
-> Hash
-> Seq Scan on prt1_p2 t1_1
Filter: (b = 0)
tpch=# set parallel_setup_cost = 1;
tpch=# set parallel_tuple_cost = 0.01;
tpch=# explain (costs off) select * from prt1 t1, prt2 t2
where t1.a = t2.b and t1.b = 0 and t2.b between 0 and 10000;
QUERY PLAN
-----------------------------------------------------------
Gather
Workers Planned: 4
-> Parallel Append
-> Parallel Hash Join
Hash Cond: (t2_1.b = t1_1.a)
-> Parallel Seq Scan on prt2_p2 t2_1
Filter: ((b >= 0) AND (b <= 10000))
-> Parallel Hash
-> Parallel Seq Scan on prt1_p2 t1_1
Filter: (b = 0)
-> Parallel Hash Join
Hash Cond: (t2.b = t1.a)
-> Parallel Seq Scan on prt2_p1 t2
Filter: ((b >= 0) AND (b <= 10000))
-> Parallel Hash
-> Parallel Seq Scan on prt1_p1 t1
Filter: (b = 0)
éèŠãªããšã¯ãã»ã¯ã·ã§ã³å ã®æ¥ç¶ã䞊åã«ãªãã®ã¯ããããã®ã»ã¯ã·ã§ã³ãååã«å€§ããå Žåã®ã¿ã§ãããšããããšã§ãã
䞊åè¿œå
ããã§ã¯ 2 ã€ã®ã¯ãŒã«ãŒ ããã»ã¹ãå®è¡ãããŠããŸããã4 ã€ãæå¹ã«ãªã£ãŠããŸãã
tpch=# explain (costs off) select sum(l_quantity) as sum_qty from lineitem where l_shipdate <= date '1998-12-01' - interval '105' day union all select sum(l_quantity) as sum_qty from lineitem where l_shipdate <= date '2000-12-01' - interval '105' day;
QUERY PLAN
------------------------------------------------------------------------------------------------
Gather
Workers Planned: 2
-> Parallel Append
-> Aggregate
-> Seq Scan on lineitem
Filter: (l_shipdate <= '2000-08-18 00:00:00'::timestamp without time zone)
-> Aggregate
-> Seq Scan on lineitem lineitem_1
Filter: (l_shipdate <= '1998-08-18 00:00:00'::timestamp without time zone)
æãéèŠãªå€æ°
- WORK_MEM ã¯ã¯ãšãªã ãã§ãªãããã»ã¹ããšã®ã¡ã¢ãªãå¶éããŸã: work_mem ããã»ã¹ æ¥ç¶ = 倧éã®ã¡ã¢ãªã
â å®è¡äžã®ããã°ã©ã ãèšç»ããã®äžŠååŠçã«äœ¿çšããã¯ãŒã«ãŒ ããã»ã¹ã®æ°ãmax_parallel_workers_per_gather
â ã¯ãŒã«ãŒ ããã»ã¹ã®åèšæ°ããµãŒããŒäžã® CPU ã³ã¢ã®æ°ã«èª¿æŽããŸããmax_worker_processes
- åãã§ããã䞊åäœæ¥ããã»ã¹ã®å Žåã§ããmax_parallel_workers
çµæ
ããŒãžã§ã³ 9.6 ã§ã¯ã䞊ååŠçã«ãããå€æ°ã®è¡ãŸãã¯ã€ã³ããã¯ã¹ãã¹ãã£ã³ããè€éãªã¯ãšãªã®ããã©ãŒãã³ã¹ãå€§å¹ ã«åäžããŸãã PostgreSQL 10 ã§ã¯ã䞊ååŠçãããã©ã«ãã§æå¹ã«ãªã£ãŠããŸãã OLTP ã¯ãŒã¯ããŒãã倧ãããµãŒããŒã§ã¯ãå¿ ãç¡å¹ã«ããŠãã ããã é 次ã¹ãã£ã³ãŸãã¯ã€ã³ããã¯ã¹ ã¹ãã£ã³ã¯å€§éã®ãªãœãŒã¹ãæ¶è²»ããŸãã ããŒã¿ã»ããå šäœã«å¯ŸããŠã¬ããŒããå®è¡ããŠããªãå Žåã¯ãäžè¶³ããŠããã€ã³ããã¯ã¹ãè¿œå ããããé©åãªããŒãã£ã·ã§ãã³ã°ã䜿çšããã ãã§ãã¯ãšãªã®ããã©ãŒãã³ã¹ãåäžãããããšãã§ããŸãã
ãªãã¡ã¬ã³ã¹
https://www.postgresql.org/docs/11/how-parallel-query-works.html https://www.postgresql.org/docs/11/parallel-plans.html http://ashutoshpg.blogspot.com/2017/12/partition-wise-joins-divide-and-conquer.html http://rhaas.blogspot.com/2016/04/postgresql-96-with-parallel-query-vs.html http://amitkapila16.blogspot.com/2015/11/parallel-sequential-scans-in-play.html https://write-skew.blogspot.com/2018/01/parallel-hash-for-postgresql.html http://rhaas.blogspot.com/2017/03/parallel-query-v2.html https://blog.2ndquadrant.com/parallel-monster-benchmark/ https://blog.2ndquadrant.com/parallel-aggregate/ https://www.depesz.com/2018/02/12/waiting-for-postgresql-11-support-parallel-btree-index-builds/ PostgreSQL 11 ã®äžŠååŠç
åºæïŒ habr.com