Parallel queries in PostgreSQL

Modern CPUs have many cores. For years, applications have been sending queries to databases in parallel. When a reporting query processes many rows of a table, it runs faster if it can use several CPUs, and PostgreSQL has been able to do this since version 9.6.

It took 3 years to implement the parallel query feature, because code had to be rewritten at various stages of query execution. PostgreSQL 9.6 introduced the infrastructure for further improvements. In subsequent versions, other types of queries can also be executed in parallel.

Limitations

  • Do not enable parallel execution if all cores are already busy; otherwise other queries will slow down.
  • Most importantly, parallel processing with high WORK_MEM values uses a lot of memory: each hash join or sort can take up to work_mem of memory.
  • Low-latency OLTP queries cannot be sped up by parallel execution. If a query returns a single row, parallel processing will only slow it down.
  • Developers like to use the TPC-H benchmark. You may have similar queries that are well suited to parallel execution.
  • Only SELECT queries without locking clauses (such as FOR UPDATE) are executed in parallel.
  • Sometimes proper indexing is better than a parallel sequential table scan.
  • Suspended queries and cursors are not supported.
  • Window functions and ordered-set aggregate functions are not parallel.
  • You gain nothing on an I/O-bound workload.
  • There are no parallel sort algorithms. However, queries with sorts can still run in parallel in some respects.
  • Replace a CTE (WITH ...) with a nested SELECT to enable parallel processing.
  • Foreign data wrappers do not yet support parallel processing (but they could!).
  • FULL OUTER JOIN is not supported.
  • Setting max_rows disables parallel processing.
  • If a query calls a function that is not marked PARALLEL SAFE, it will run single-threaded.
  • The SERIALIZABLE transaction isolation level disables parallel processing.
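
As an illustration of the PARALLEL SAFE item in the list above, the following sketch defines a small SQL function and checks its parallel label. The function name discounted_price and its arguments are made up for this example; user-defined functions are labeled PARALLEL UNSAFE unless declared otherwise.

```sql
-- Hypothetical helper, not part of TPC-H or of the original article.
CREATE FUNCTION discounted_price(price numeric, discount numeric)
RETURNS numeric
LANGUAGE sql
IMMUTABLE
PARALLEL SAFE  -- without this clause the function defaults to PARALLEL UNSAFE
AS $$ SELECT price * (1 - discount) $$;

-- proparallel is 's' (safe), 'r' (restricted) or 'u' (unsafe).
SELECT proname, proparallel
FROM pg_proc
WHERE proname = 'discounted_price';

-- An existing function can be relabeled, provided it really is safe.
ALTER FUNCTION discounted_price(numeric, numeric) PARALLEL SAFE;
```

A function should only carry this label if it does not modify database state or rely on session-local state; labeling an unsafe function as safe can produce wrong results.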

Test environment

PostgreSQL developers have tried to reduce the response time of the TPC-H benchmark queries. You can download the benchmark and adapt it to PostgreSQL. This is an unofficial use of the TPC-H benchmark, not a comparison of databases or hardware.

  1. Download TPC-H_Tools_v2.17.3.zip (or a newer version) from the TPC site.
  2. Rename makefile.suite to Makefile and modify it as described here: https://github.com/tvondra/pg_tpch . Compile the code with the make command.
  3. Generate the data: ./dbgen -s 10 creates a 23 GB database. This is enough to see the difference in performance between parallel and non-parallel queries.
  4. Convert the tbl files to csv with for and sed.
  5. Clone the pg_tpch repository and copy the csv files to pg_tpch/dss/data.
  6. Generate the queries with the qgen command.
  7. Load the data into the database with the ./tpch.sh command.

Parallel sequential scan

It may be faster not because of parallel reads, but because the data is spread across many CPU cores. Modern operating systems cache PostgreSQL data files well. With read-ahead, the storage can return a larger block than the one the PG daemon requested. Query performance is therefore not limited by disk I/O. The query consumes CPU cycles to:

  • read rows one by one from the table's data pages;
  • compare row values against the WHERE conditions.

Let's run a simple SELECT query:

tpch=# explain analyze select l_quantity as sum_qty from lineitem where l_shipdate <= date '1998-12-01' - interval '105' day;
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Seq Scan on lineitem  (cost=0.00..1964772.00 rows=58856235 width=5) (actual time=0.014..16951.669 rows=58839715 loops=1)
   Filter: (l_shipdate <= '1998-08-18 00:00:00'::timestamp without time zone)
   Rows Removed by Filter: 1146337
 Planning Time: 0.203 ms
 Execution Time: 19035.100 ms

The sequential scan produces too many rows without aggregation, so the query is executed by a single CPU core.

If you add SUM(), you can see that two worker processes help speed up the query:

explain analyze select sum(l_quantity) as sum_qty from lineitem where l_shipdate <= date '1998-12-01' - interval '105' day;
                                                                     QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=1589702.14..1589702.15 rows=1 width=32) (actual time=8553.365..8553.365 rows=1 loops=1)
   ->  Gather  (cost=1589701.91..1589702.12 rows=2 width=32) (actual time=8553.241..8555.067 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=1588701.91..1588701.92 rows=1 width=32) (actual time=8547.546..8547.546 rows=1 loops=3)
               ->  Parallel Seq Scan on lineitem  (cost=0.00..1527393.33 rows=24523431 width=5) (actual time=0.038..5998.417 rows=19613238 loops=3)
                     Filter: (l_shipdate <= '1998-08-18 00:00:00'::timestamp without time zone)
                     Rows Removed by Filter: 382112
 Planning Time: 0.241 ms
 Execution Time: 8555.131 ms

Parallel aggregation

The "Parallel Seq Scan" node produces rows for partial aggregation. The "Partial Aggregate" node reduces these rows using SUM(). At the end, the SUM counter from each worker process is collected by the "Gather" node.

The final result is computed by the "Finalize Aggregate" node. If you have your own aggregate functions, do not forget to mark them as "parallel safe".
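
To make the remark about custom aggregates concrete, here is a minimal sketch of an aggregate that can take part in partial aggregation. The name my_sum is hypothetical. The sketch reuses the built-in numeric_add function for both the transition and the combine step, on the assumption that a combine function together with the PARALLEL = SAFE label is what the planner needs for this simple case.

```sql
-- Hypothetical aggregate for illustration only.
CREATE AGGREGATE my_sum(numeric) (
    SFUNC = numeric_add,        -- adds each input value to the running state
    STYPE = numeric,
    COMBINEFUNC = numeric_add,  -- merges the partial states produced by workers
    PARALLEL = SAFE
);

-- Check whether the planner now chooses Partial/Finalize Aggregate nodes.
EXPLAIN (costs off)
SELECT my_sum(l_quantity) FROM lineitem;
```

Whether a parallel plan is actually chosen still depends on the table size and on the cost settings.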

Number of worker processes

The number of worker processes can be increased without restarting the server.
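
One way to make such a change, sketched here as an assumption about the commands involved, is to raise max_parallel_workers_per_gather and reload the configuration. The value 4 matches the worker count discussed in this section.

```sql
-- Persist the new value in postgresql.auto.conf (requires superuser rights).
ALTER SYSTEM SET max_parallel_workers_per_gather = 4;

-- Ask the server to re-read its configuration; no restart is needed.
SELECT pg_reload_conf();

-- Alternatively, change the value for the current session only.
SET max_parallel_workers_per_gather = 4;

SHOW max_parallel_workers_per_gather;
```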

What happened here? The number of worker processes doubled, but the query became only 1.6599 times faster. The arithmetic is interesting. We had 2 worker processes and 1 leader. After the change there were 4 + 1.

The maximum speedup we can get from parallel processing here is 5/3 = 1.66(6) times.

How does it work?

Processes

Query execution always starts in the leader process. The leader performs all non-parallel work and its own share of the parallel processing. The other processes that execute the same query are called worker processes. Parallel processing uses the dynamic background workers infrastructure (since version 9.4). Because other parts of PostgreSQL use processes rather than threads, a query with 3 worker processes can be up to 4 times faster than traditional execution.

Communication

Worker processes communicate with the leader through a message queue (based on shared memory). Each process has 2 queues: one for errors and one for tuples.

How many worker processes are needed?

The first limit is set by the max_parallel_workers_per_gather parameter. The query executor then takes worker processes from a pool limited by max_parallel_workers. The final limit is max_worker_processes, that is, the total number of background processes.

If a worker process cannot be allocated, the query is executed by a single process.
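
The three limits described above can be inspected together. The following query is a small sketch that reads them from the pg_settings view; a context of postmaster means that changing the parameter requires a server restart.

```sql
SELECT name, setting, context
FROM pg_settings
WHERE name IN ('max_parallel_workers_per_gather',
               'max_parallel_workers',
               'max_worker_processes')
ORDER BY name;
```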

The query planner may reduce the number of worker processes depending on the size of the table or index. The parameters min_parallel_table_scan_size and min_parallel_index_scan_size control this.

set min_parallel_table_scan_size='8MB'
8MB table => 1 worker
24MB table => 2 workers
72MB table => 3 workers
x => log(x / min_parallel_table_scan_size) / log(3) + 1 worker

Each time the table becomes 3 times larger than min_parallel_(index|table)_scan_size, Postgres adds a worker process. The number of worker processes is not cost-based. A circular dependency would make a more complex implementation difficult, so the planner uses simple rules instead.
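
The rule of adding a worker each time the table size triples can be tabulated. The sketch below assumes the 8MB threshold from the example above and lists the smallest table size at which each worker count would be considered; the real number is still capped by max_parallel_workers_per_gather.

```sql
-- 8 MB is the min_parallel_table_scan_size value used in the example above.
SELECT n AS workers,
       8 * power(3, n - 1) AS min_table_size_mb
FROM generate_series(1, 5) AS n;

-- Size of an actual table, for comparison with the thresholds.
SELECT pg_size_pretty(pg_relation_size('lineitem'));
```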

In practice, these rules are not always suitable for production, so you can change the number of worker processes for a specific table: ALTER TABLE ... SET (parallel_workers = N).
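
For example, the override could be applied to the lineitem table from the test database. The value 4 is an arbitrary choice for illustration.

```sql
-- Ask the planner to consider 4 workers for scans of this table.
ALTER TABLE lineitem SET (parallel_workers = 4);

-- Remove the override and return to the size-based rule.
ALTER TABLE lineitem RESET (parallel_workers);
```

The per-table value does not lift the limit set by max_parallel_workers_per_gather.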

Why is parallel processing not used?

Besides the long list of limitations, there are also cost checks:

parallel_setup_cost - to avoid parallel processing of short queries. This parameter estimates the time needed to set up memory, start the processes, and exchange initial data.

parallel_tuple_cost - communication between the leader and the workers can take time in proportion to the number of tuples sent by the worker processes. This parameter estimates the cost of the data exchange.
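
To check whether these two costs are what keeps a query from running in parallel, they can be lowered for the current session and the plan compared. This is only a diagnostic sketch; setting both to zero is an assumption made for the experiment, not a recommended production value.

```sql
-- The defaults are 1000 and 0.1 respectively.
SHOW parallel_setup_cost;
SHOW parallel_tuple_cost;

SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;

EXPLAIN (costs off)
SELECT sum(l_quantity) FROM lineitem
WHERE l_shipdate <= date '1998-12-01' - interval '105' day;

-- Restore the configured values for the rest of the session.
RESET parallel_setup_cost;
RESET parallel_tuple_cost;
```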

Nested Loop Joins

PostgreSQL 9.6+ can execute nested loops in parallel, because it is a simple operation.

explain (costs off) select c_custkey, count(o_orderkey)
                from    customer left outer join orders on
                                c_custkey = o_custkey and o_comment not like '%special%deposits%'
                group by c_custkey;
                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Finalize GroupAggregate
   Group Key: customer.c_custkey
   ->  Gather Merge
         Workers Planned: 4
         ->  Partial GroupAggregate
               Group Key: customer.c_custkey
               ->  Nested Loop Left Join
                     ->  Parallel Index Only Scan using customer_pkey on customer
                     ->  Index Scan using idx_orders_custkey on orders
                           Index Cond: (customer.c_custkey = o_custkey)
                           Filter: ((o_comment)::text !~~ '%special%deposits%'::text)

The gather happens at the last stage, so Nested Loop Left Join is a parallel operation. Parallel Index Only Scan was introduced only in version 10. It works similarly to a parallel sequential scan. The condition c_custkey = o_custkey reads one order per customer row, so it is not parallel.

Hash Join

Before PostgreSQL 11, each worker process built its own hash table, so more than four such processes did not improve performance. In the new version, the hash table is shared. Each worker process can use WORK_MEM to build the hash table.

select
        l_shipmode,
        sum(case
                when o_orderpriority = '1-URGENT'
                        or o_orderpriority = '2-HIGH'
                        then 1
                else 0
        end) as high_line_count,
        sum(case
                when o_orderpriority <> '1-URGENT'
                        and o_orderpriority <> '2-HIGH'
                        then 1
                else 0
        end) as low_line_count
from
        orders,
        lineitem
where
        o_orderkey = l_orderkey
        and l_shipmode in ('MAIL', 'AIR')
        and l_commitdate < l_receiptdate
        and l_shipdate < l_commitdate
        and l_receiptdate >= date '1996-01-01'
        and l_receiptdate < date '1996-01-01' + interval '1' year
group by
        l_shipmode
order by
        l_shipmode
LIMIT 1;
                                                                                                                                    QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1964755.66..1964961.44 rows=1 width=27) (actual time=7579.592..7922.997 rows=1 loops=1)
   ->  Finalize GroupAggregate  (cost=1964755.66..1966196.11 rows=7 width=27) (actual time=7579.590..7579.591 rows=1 loops=1)
         Group Key: lineitem.l_shipmode
         ->  Gather Merge  (cost=1964755.66..1966195.83 rows=28 width=27) (actual time=7559.593..7922.319 rows=6 loops=1)
               Workers Planned: 4
               Workers Launched: 4
               ->  Partial GroupAggregate  (cost=1963755.61..1965192.44 rows=7 width=27) (actual time=7548.103..7564.592 rows=2 loops=5)
                     Group Key: lineitem.l_shipmode
                     ->  Sort  (cost=1963755.61..1963935.20 rows=71838 width=27) (actual time=7530.280..7539.688 rows=62519 loops=5)
                           Sort Key: lineitem.l_shipmode
                           Sort Method: external merge  Disk: 2304kB
                           Worker 0:  Sort Method: external merge  Disk: 2064kB
                           Worker 1:  Sort Method: external merge  Disk: 2384kB
                           Worker 2:  Sort Method: external merge  Disk: 2264kB
                           Worker 3:  Sort Method: external merge  Disk: 2336kB
                           ->  Parallel Hash Join  (cost=382571.01..1957960.99 rows=71838 width=27) (actual time=7036.917..7499.692 rows=62519 loops=5)
                                 Hash Cond: (lineitem.l_orderkey = orders.o_orderkey)
                                 ->  Parallel Seq Scan on lineitem  (cost=0.00..1552386.40 rows=71838 width=19) (actual time=0.583..4901.063 rows=62519 loops=5)
                                       Filter: ((l_shipmode = ANY ('{MAIL,AIR}'::bpchar[])) AND (l_commitdate < l_receiptdate) AND (l_shipdate < l_commitdate) AND (l_receiptdate >= '1996-01-01'::date) AND (l_receiptdate < '1997-01-01 00:00:00'::timestamp without time zone))
                                       Rows Removed by Filter: 11934691
                                 ->  Parallel Hash  (cost=313722.45..313722.45 rows=3750045 width=20) (actual time=2011.518..2011.518 rows=3000000 loops=5)
                                       Buckets: 65536  Batches: 256  Memory Usage: 3840kB
                                       ->  Parallel Seq Scan on orders  (cost=0.00..313722.45 rows=3750045 width=20) (actual time=0.029..995.948 rows=3000000 loops=5)
 Planning Time: 0.977 ms
 Execution Time: 7923.770 ms

Query 12 from TPC-H clearly illustrates a parallel hash join. Each worker process contributes to building the shared hash table.
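
To see how much the shared hash table contributes, the parallel hash join can be switched off for a session and the plans compared. The sketch relies on the enable_parallel_hash setting, available from PostgreSQL 11; the shortened query is a simplified stand-in for Query 12, not the benchmark query itself.

```sql
SET enable_parallel_hash = off;

-- With the setting off, the planner can no longer choose Parallel Hash Join.
EXPLAIN (costs off)
SELECT l_shipmode, count(*)
FROM orders, lineitem
WHERE o_orderkey = l_orderkey
  AND l_shipmode IN ('MAIL', 'AIR')
GROUP BY l_shipmode;

RESET enable_parallel_hash;
```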

Merge Join

A merge join is non-parallel by nature. Do not worry if it is the last stage of the query, because the query can still run in parallel.

-- Query 2 from TPC-H
explain (costs off) select s_acctbal, s_name, n_name, p_partkey, p_mfgr, s_address, s_phone, s_comment
from    part, supplier, partsupp, nation, region
where
        p_partkey = ps_partkey
        and s_suppkey = ps_suppkey
        and p_size = 36
        and p_type like '%BRASS'
        and s_nationkey = n_nationkey
        and n_regionkey = r_regionkey
        and r_name = 'AMERICA'
        and ps_supplycost = (
                select
                        min(ps_supplycost)
                from    partsupp, supplier, nation, region
                where
                        p_partkey = ps_partkey
                        and s_suppkey = ps_suppkey
                        and s_nationkey = n_nationkey
                        and n_regionkey = r_regionkey
                        and r_name = 'AMERICA'
        )
order by s_acctbal desc, n_name, s_name, p_partkey
LIMIT 100;
                                                QUERY PLAN
----------------------------------------------------------------------------------------------------------
 Limit
   ->  Sort
         Sort Key: supplier.s_acctbal DESC, nation.n_name, supplier.s_name, part.p_partkey
         ->  Merge Join
               Merge Cond: (part.p_partkey = partsupp.ps_partkey)
               Join Filter: (partsupp.ps_supplycost = (SubPlan 1))
               ->  Gather Merge
                     Workers Planned: 4
                     ->  Parallel Index Scan using part_pkey on part
                           Filter: (((p_type)::text ~~ '%BRASS'::text) AND (p_size = 36))
               ->  Materialize
                     ->  Sort
                           Sort Key: partsupp.ps_partkey
                           ->  Nested Loop
                                 ->  Nested Loop
                                       Join Filter: (nation.n_regionkey = region.r_regionkey)
                                       ->  Seq Scan on region
                                             Filter: (r_name = 'AMERICA'::bpchar)
                                       ->  Hash Join
                                             Hash Cond: (supplier.s_nationkey = nation.n_nationkey)
                                             ->  Seq Scan on supplier
                                             ->  Hash
                                                   ->  Seq Scan on nation
                                 ->  Index Scan using idx_partsupp_suppkey on partsupp
                                       Index Cond: (ps_suppkey = supplier.s_suppkey)
               SubPlan 1
                 ->  Aggregate
                       ->  Nested Loop
                             Join Filter: (nation_1.n_regionkey = region_1.r_regionkey)
                             ->  Seq Scan on region region_1
                                   Filter: (r_name = 'AMERICA'::bpchar)
                             ->  Nested Loop
                                   ->  Nested Loop
                                         ->  Index Scan using idx_partsupp_partkey on partsupp partsupp_1
                                               Index Cond: (part.p_partkey = ps_partkey)
                                         ->  Index Scan using supplier_pkey on supplier supplier_1
                                               Index Cond: (s_suppkey = partsupp_1.ps_suppkey)
                                   ->  Index Scan using nation_pkey on nation nation_1
                                         Index Cond: (n_nationkey = supplier_1.s_nationkey)

The "Merge Join" node is located above "Gather Merge", so the merge itself does not use parallel processing. But the "Parallel Index Scan" node still helps with the part_pkey segment.

Partition-wise join

In PostgreSQL 11, partition-wise join is disabled by default because its planning is very expensive. Tables with similar partitioning can be joined partition by partition. This way Postgres uses smaller hash tables. Each partition join can run in parallel.

tpch=# set enable_partitionwise_join=t;
tpch=# explain (costs off) select * from prt1 t1, prt2 t2
where t1.a = t2.b and t1.b = 0 and t2.b between 0 and 10000;
                    QUERY PLAN
---------------------------------------------------
 Append
   ->  Hash Join
         Hash Cond: (t2.b = t1.a)
         ->  Seq Scan on prt2_p1 t2
               Filter: ((b >= 0) AND (b <= 10000))
         ->  Hash
               ->  Seq Scan on prt1_p1 t1
                     Filter: (b = 0)
   ->  Hash Join
         Hash Cond: (t2_1.b = t1_1.a)
         ->  Seq Scan on prt2_p2 t2_1
               Filter: ((b >= 0) AND (b <= 10000))
         ->  Hash
               ->  Seq Scan on prt1_p2 t1_1
                     Filter: (b = 0)
tpch=# set parallel_setup_cost = 1;
tpch=# set parallel_tuple_cost = 0.01;
tpch=# explain (costs off) select * from prt1 t1, prt2 t2
where t1.a = t2.b and t1.b = 0 and t2.b between 0 and 10000;
                        QUERY PLAN
-----------------------------------------------------------
 Gather
   Workers Planned: 4
   ->  Parallel Append
         ->  Parallel Hash Join
               Hash Cond: (t2_1.b = t1_1.a)
               ->  Parallel Seq Scan on prt2_p2 t2_1
                     Filter: ((b >= 0) AND (b <= 10000))
               ->  Parallel Hash
                     ->  Parallel Seq Scan on prt1_p2 t1_1
                           Filter: (b = 0)
         ->  Parallel Hash Join
               Hash Cond: (t2.b = t1.a)
               ->  Parallel Seq Scan on prt2_p1 t2
                     Filter: ((b >= 0) AND (b <= 10000))
               ->  Parallel Hash
                     ->  Parallel Seq Scan on prt1_p1 t1
                           Filter: (b = 0)

The key point is that a partition-wise join runs in parallel only if the partitions are large enough.
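
The definitions of prt1 and prt2 are not part of the listing. A hypothetical layout that would allow a partition-wise join on t1.a = t2.b is sketched below: both tables are range-partitioned on their join column with identical bounds. The column list and the bounds are assumptions made for illustration.

```sql
-- Hypothetical definitions; the real tables behind the plans above may differ.
CREATE TABLE prt1 (a int, b int) PARTITION BY RANGE (a);
CREATE TABLE prt1_p1 PARTITION OF prt1 FOR VALUES FROM (0) TO (5000);
CREATE TABLE prt1_p2 PARTITION OF prt1 FOR VALUES FROM (5000) TO (10001);

CREATE TABLE prt2 (a int, b int) PARTITION BY RANGE (b);
CREATE TABLE prt2_p1 PARTITION OF prt2 FOR VALUES FROM (0) TO (5000);
CREATE TABLE prt2_p2 PARTITION OF prt2 FOR VALUES FROM (5000) TO (10001);

-- The feature has to be enabled explicitly, as noted above.
SET enable_partitionwise_join = on;
```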

Parallel Append

Parallel Append can be used instead of splitting blocks between worker processes. This usually happens with UNION ALL queries. The drawback is less parallelism, because each worker process handles only 1 query.

Only 2 worker processes are running here, although 4 are allowed.

tpch=# explain (costs off) select sum(l_quantity) as sum_qty from lineitem where l_shipdate <= date '1998-12-01' - interval '105' day union all select sum(l_quantity) as sum_qty from lineitem where l_shipdate <= date '2000-12-01' - interval '105' day;
                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Gather
   Workers Planned: 2
   ->  Parallel Append
         ->  Aggregate
               ->  Seq Scan on lineitem
                     Filter: (l_shipdate <= '2000-08-18 00:00:00'::timestamp without time zone)
         ->  Aggregate
               ->  Seq Scan on lineitem lineitem_1
                     Filter: (l_shipdate <= '1998-08-18 00:00:00'::timestamp without time zone)

Most important variables

  • WORK_MEM limits memory per process, not just per query: work_mem * processes * joins = a lot of memory.
  • max_parallel_workers_per_gather - how many worker processes the executor will use for parallel processing of a plan node.
  • max_worker_processes - adjust the total number of worker processes to the number of CPU cores on the server.
  • max_parallel_workers - the same, but for parallel worker processes.
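
The memory multiplication mentioned for WORK_MEM in the list above can be turned into a rough estimate. The sketch below assumes 4 workers plus the leader and 2 memory-hungry nodes (hash joins or sorts) in the plan; both numbers are illustrative.

```sql
SELECT current_setting('work_mem') AS work_mem,
       pg_size_pretty(
           pg_size_bytes(current_setting('work_mem'))
           * (4 + 1)   -- assumed: 4 workers + 1 leader
           * 2         -- assumed: 2 hash join or sort nodes
       ) AS rough_estimate;
```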

Summary

As of version 9.6, parallel processing can significantly improve the performance of complex queries that scan many rows or index entries. In PostgreSQL 10, parallel processing is enabled by default. Remember to disable it on servers with a heavy OLTP workload. Sequential scans and index scans consume a lot of resources. If you are not running a report over the entire data set, you can improve query performance simply by adding missing indexes or by using proper partitioning.
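
Disabling parallel processing, as suggested for OLTP-heavy servers, could look like the following sketch. The role name oltp_app is hypothetical.

```sql
-- Disable parallel plans for the current session.
SET max_parallel_workers_per_gather = 0;

-- Or disable them for every new session of one application role.
ALTER ROLE oltp_app SET max_parallel_workers_per_gather = 0;
```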

References

Source: www.habr.com
