CPU za kisasa zina cores nyingi. Kwa miaka mingi, maombi yamekuwa yakituma maswali kwa hifadhidata kwa sambamba. Ikiwa ni swali la ripoti kwenye safu mlalo nyingi kwenye jedwali, huendesha haraka wakati wa kutumia CPU nyingi, na PostgreSQL imeweza kufanya hivi tangu toleo la 9.6.
Ilichukua miaka 3 kutekeleza kipengele cha hoja sambamba - ilitubidi kuandika upya msimbo katika hatua tofauti za utekelezaji wa hoja. PostgreSQL 9.6 ilianzisha miundombinu ili kuboresha zaidi msimbo. Katika matoleo yanayofuata, aina nyingine za maswali hutekelezwa kwa sambamba.
Vikwazo
Usiwashe utekelezaji sambamba ikiwa cores zote tayari zina shughuli, vinginevyo maombi mengine yatapungua.
Muhimu zaidi, usindikaji sambamba na maadili ya juu ya WORK_MEM hutumia kumbukumbu nyingi - kila heshi ya kujiunga au kupanga inachukua kumbukumbu ya work_mem.
Hoja za muda wa chini za OLTP haziwezi kuharakishwa kwa utekelezaji sambamba. Na ikiwa swali litarejesha safu mlalo moja, uchakataji sambamba utalipunguza tu.
Wasanidi programu wanapenda kutumia alama ya TPC-H. Labda una maswali sawa kwa utekelezaji kamili sambamba.
Hoji za CHAGUA pekee bila kufunga kiima hutekelezwa kwa sambamba.
Wakati mwingine uorodheshaji sahihi ni bora kuliko utambazaji wa jedwali mfuatano katika hali sambamba.
Kusitisha hoja na vishale havitumiki.
Vitendaji vya dirisha na utendakazi wa seti ya jumla iliyopangwa haziwiani.
Hupati chochote katika mzigo wa kazi wa I/O.
Hakuna algoriti za kupanga sambamba. Lakini hoja zilizo na aina zinaweza kutekelezwa kwa sambamba katika baadhi ya vipengele.
Badilisha CTE (WITH ...) na CHAGUA kilichowekwa ili kuwezesha uchakataji sambamba.
Vifungashio vya data vya mtu wa tatu bado havitumii uchakataji sambamba (lakini vinaweza!)
FULL OUTER JOIN haitumiki.
max_rows huzima uchakataji sambamba.
Ikiwa swali lina chaguo za kukokotoa ambazo hazijawekewa alama ya PARALLEL SAFE, itakuwa na thread moja.
Kiwango cha kutenga cha SERIALIZABLE huzima uchakataji sambamba.
Mazingira ya mtihani
Wasanidi wa PostgreSQL walijaribu kupunguza muda wa majibu wa hoja za alama za TPC-H. Pakua alama na ibadilishe kwa PostgreSQL. Haya ni matumizi yasiyo rasmi ya kipimo cha TPC-H - si cha ulinganisho wa hifadhidata au maunzi.
Badilisha jina makefile.suite kuwa Makefile na ubadilishe kama ilivyoelezwa hapa: https://github.com/tvondra/pg_tpch . Kusanya msimbo na make amri.
Tengeneza data: ./dbgen -s 10 huunda hifadhidata ya GB 23. Hii inatosha kuona tofauti katika utendaji wa maswali sambamba na yasiyo ya sambamba.
Badilisha faili tbl в csv с for и sed.
Funga hazina pg_tpch na kunakili faili csv в pg_tpch/dss/data.
Unda maswali kwa amri qgen.
Pakia data kwenye hifadhidata kwa amri ./tpch.sh.
Uchanganuzi wa kufuatana sambamba
Huenda ikawa kasi zaidi si kwa sababu ya usomaji sambamba, lakini kwa sababu data imeenea kwenye viini vingi vya CPU. Katika mifumo ya kisasa ya uendeshaji, faili za data za PostgreSQL zimehifadhiwa vizuri. Kwa kusoma mbele, inawezekana kupata kizuizi kikubwa kutoka kwa hifadhi kuliko maombi ya daemon ya PG. Kwa hiyo, utendaji wa swala hauzuiliwi na diski I/O. Inatumia mizunguko ya CPU kwa:
soma safu moja baada ya nyingine kutoka kwa kurasa za jedwali;
kulinganisha maadili ya kamba na masharti WHERE.
Wacha tufanye swali rahisi select:
tpch=# explain analyze select l_quantity as sum_qty from lineitem where l_shipdate <= date '1998-12-01' - interval '105' day;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Seq Scan on lineitem (cost=0.00..1964772.00 rows=58856235 width=5) (actual time=0.014..16951.669 rows=58839715 loops=1)
Filter: (l_shipdate <= '1998-08-18 00:00:00'::timestamp without time zone)
Rows Removed by Filter: 1146337
Planning Time: 0.203 ms
Execution Time: 19035.100 ms
Uchanganuzi mfuatano hutoa safu mlalo nyingi sana bila kujumlisha, kwa hivyo hoja inatekelezwa na msingi mmoja wa CPU.
Ukiongeza SUM(), unaweza kuona kwamba mtiririko wa kazi mbili utasaidia kuharakisha hoja:
explain analyze select sum(l_quantity) as sum_qty from lineitem where l_shipdate <= date '1998-12-01' - interval '105' day;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=1589702.14..1589702.15 rows=1 width=32) (actual time=8553.365..8553.365 rows=1 loops=1)
-> Gather (cost=1589701.91..1589702.12 rows=2 width=32) (actual time=8553.241..8555.067 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=1588701.91..1588701.92 rows=1 width=32) (actual time=8547.546..8547.546 rows=1 loops=3)
-> Parallel Seq Scan on lineitem (cost=0.00..1527393.33 rows=24523431 width=5) (actual time=0.038..5998.417 rows=19613238 loops=3)
Filter: (l_shipdate <= '1998-08-18 00:00:00'::timestamp without time zone)
Rows Removed by Filter: 382112
Planning Time: 0.241 ms
Execution Time: 8555.131 ms
Mkusanyiko sambamba
Nodi ya Sambamba ya Uchanganuzi wa Seq hutoa safu mlalo kwa ujumlisho wa sehemu. Nodi ya "Partial Aggregate" hupunguza mistari hii kwa kutumia SUM(). Mwishoni, kaunta ya SUM kutoka kwa kila mchakato wa mfanyakazi inakusanywa na nodi ya "Kusanya".
Matokeo ya mwisho yanahesabiwa na nodi ya "Finalize Aggregate". Ikiwa una vitendaji vyako vya kujumlisha, usisahau kuzitia alama kama "salama sambamba".
Idadi ya michakato ya wafanyikazi
Idadi ya michakato ya wafanyikazi inaweza kuongezeka bila kuanzisha tena seva:
explain analyze select sum(l_quantity) as sum_qty from lineitem where l_shipdate <= date '1998-12-01' - interval '105' day;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=1589702.14..1589702.15 rows=1 width=32) (actual time=8553.365..8553.365 rows=1 loops=1)
-> Gather (cost=1589701.91..1589702.12 rows=2 width=32) (actual time=8553.241..8555.067 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=1588701.91..1588701.92 rows=1 width=32) (actual time=8547.546..8547.546 rows=1 loops=3)
-> Parallel Seq Scan on lineitem (cost=0.00..1527393.33 rows=24523431 width=5) (actual time=0.038..5998.417 rows=19613238 loops=3)
Filter: (l_shipdate <= '1998-08-18 00:00:00'::timestamp without time zone)
Rows Removed by Filter: 382112
Planning Time: 0.241 ms
Execution Time: 8555.131 ms
Nini kinaendelea hapa? Kulikuwa na michakato ya kazi mara 2 zaidi, na ombi likawa mara 1,6599 tu haraka. Mahesabu yanavutia. Tulikuwa na michakato 2 ya wafanyikazi na kiongozi 1. Baada ya mabadiliko ikawa 4+1.
Upeo wetu wa kasi kutoka kwa usindikaji sambamba: 5/3 = 1,66 (6) mara.
Jinsi gani kazi?
Mchakato
Utekelezaji wa ombi daima huanza na mchakato unaoongoza. Kiongozi hufanya kila kitu kisicho sawa na usindikaji fulani sambamba. Michakato mingine inayofanya maombi sawa inaitwa michakato ya wafanyikazi. Usindikaji sambamba hutumia miundombinu michakato ya mfanyikazi wa mandharinyuma yenye nguvu (kutoka toleo la 9.4). Kwa kuwa sehemu zingine za PostgreSQL hutumia michakato badala ya nyuzi, hoja iliyo na michakato 3 ya wafanyikazi inaweza kuwa haraka mara 4 kuliko usindikaji wa jadi.
Mwingiliano
Michakato ya wafanyikazi huwasiliana na kiongozi kupitia foleni ya ujumbe (kulingana na kumbukumbu iliyoshirikiwa). Kila mchakato una foleni 2: kwa makosa na nakala.
Kila wakati meza ni kubwa mara 3 kuliko min_parallel_(index|table)_scan_size, Postgres inaongeza mchakato wa mfanyakazi. Idadi ya mtiririko wa kazi haitegemei gharama. Utegemezi wa mduara hufanya utekelezaji changamano kuwa mgumu. Badala yake, mpangaji hutumia sheria rahisi.
Kwa mazoezi, sheria hizi hazifai kila wakati kwa uzalishaji, kwa hivyo unaweza kubadilisha idadi ya michakato ya wafanyikazi kwa meza maalum: ALTER TABLE ... SET (parallel_workers = N).
Kwa nini usindikaji sambamba hautumiwi?
Mbali na orodha ndefu ya vikwazo, pia kuna ukaguzi wa gharama:
parallel_setup_cost - ili kuepuka usindikaji sambamba wa maombi mafupi. Kigezo hiki kinakadiria wakati wa kuandaa kumbukumbu, kuanza mchakato na kubadilishana data ya awali.
parallel_tuple_cost: mawasiliano kati ya kiongozi na wafanyikazi yanaweza kucheleweshwa kulingana na idadi ya nakala kutoka kwa michakato ya kazi. Kigezo hiki huhesabu gharama ya kubadilishana data.
Nested Loop Joins
PostgreSQL 9.6+ может выполнять вложенные циклы параллельно — это простая операция.
explain (costs off) select c_custkey, count(o_orderkey)
from customer left outer join orders on
c_custkey = o_custkey and o_comment not like '%special%deposits%'
group by c_custkey;
QUERY PLAN
--------------------------------------------------------------------------------------
Finalize GroupAggregate
Group Key: customer.c_custkey
-> Gather Merge
Workers Planned: 4
-> Partial GroupAggregate
Group Key: customer.c_custkey
-> Nested Loop Left Join
-> Parallel Index Only Scan using customer_pkey on customer
-> Index Scan using idx_orders_custkey on orders
Index Cond: (customer.c_custkey = o_custkey)
Filter: ((o_comment)::text !~~ '%special%deposits%'::text)
Mkusanyiko hutokea katika hatua ya mwisho, kwa hivyo Nested Loop Left Join ni operesheni sambamba. Sambamba Index Pekee Scan ilianzishwa tu katika toleo la 10. Inafanya kazi sawa na utambazaji wa serial sambamba. Hali c_custkey = o_custkey husoma agizo moja kwa kila mfuatano wa mteja. Hivyo si sambamba.
Hash Jiunge
Kila mchakato wa mfanyakazi huunda jedwali lake la heshi hadi PostgreSQL 11. Na ikiwa kuna zaidi ya michakato minne ya michakato hii, utendakazi hautaboreka. Katika toleo jipya, jedwali la hashi linashirikiwa. Kila mchakato wa mfanyakazi unaweza kutumia WORK_MEM kuunda jedwali la hashi.
select
l_shipmode,
sum(case
when o_orderpriority = '1-URGENT'
or o_orderpriority = '2-HIGH'
then 1
else 0
end) as high_line_count,
sum(case
when o_orderpriority <> '1-URGENT'
and o_orderpriority <> '2-HIGH'
then 1
else 0
end) as low_line_count
from
orders,
lineitem
where
o_orderkey = l_orderkey
and l_shipmode in ('MAIL', 'AIR')
and l_commitdate < l_receiptdate
and l_shipdate < l_commitdate
and l_receiptdate >= date '1996-01-01'
and l_receiptdate < date '1996-01-01' + interval '1' year
group by
l_shipmode
order by
l_shipmode
LIMIT 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1964755.66..1964961.44 rows=1 width=27) (actual time=7579.592..7922.997 rows=1 loops=1)
-> Finalize GroupAggregate (cost=1964755.66..1966196.11 rows=7 width=27) (actual time=7579.590..7579.591 rows=1 loops=1)
Group Key: lineitem.l_shipmode
-> Gather Merge (cost=1964755.66..1966195.83 rows=28 width=27) (actual time=7559.593..7922.319 rows=6 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Partial GroupAggregate (cost=1963755.61..1965192.44 rows=7 width=27) (actual time=7548.103..7564.592 rows=2 loops=5)
Group Key: lineitem.l_shipmode
-> Sort (cost=1963755.61..1963935.20 rows=71838 width=27) (actual time=7530.280..7539.688 rows=62519 loops=5)
Sort Key: lineitem.l_shipmode
Sort Method: external merge Disk: 2304kB
Worker 0: Sort Method: external merge Disk: 2064kB
Worker 1: Sort Method: external merge Disk: 2384kB
Worker 2: Sort Method: external merge Disk: 2264kB
Worker 3: Sort Method: external merge Disk: 2336kB
-> Parallel Hash Join (cost=382571.01..1957960.99 rows=71838 width=27) (actual time=7036.917..7499.692 rows=62519 loops=5)
Hash Cond: (lineitem.l_orderkey = orders.o_orderkey)
-> Parallel Seq Scan on lineitem (cost=0.00..1552386.40 rows=71838 width=19) (actual time=0.583..4901.063 rows=62519 loops=5)
Filter: ((l_shipmode = ANY ('{MAIL,AIR}'::bpchar[])) AND (l_commitdate < l_receiptdate) AND (l_shipdate < l_commitdate) AND (l_receiptdate >= '1996-01-01'::date) AND (l_receiptdate < '1997-01-01 00:00:00'::timestamp without time zone))
Rows Removed by Filter: 11934691
-> Parallel Hash (cost=313722.45..313722.45 rows=3750045 width=20) (actual time=2011.518..2011.518 rows=3000000 loops=5)
Buckets: 65536 Batches: 256 Memory Usage: 3840kB
-> Parallel Seq Scan on orders (cost=0.00..313722.45 rows=3750045 width=20) (actual time=0.029..995.948 rows=3000000 loops=5)
Planning Time: 0.977 ms
Execution Time: 7923.770 ms
Hoja ya 12 kutoka TPC-H inaonyesha wazi muunganisho wa heshi sambamba. Kila mchakato wa mfanyakazi huchangia kuundwa kwa meza ya kawaida ya hashi.
Unganisha Jiunge
Unganisho la kuunganisha halilingani kwa asili. Usijali ikiwa hii ni hatua ya mwisho ya hoja - bado inaweza kufanya kazi sambamba.
-- Query 2 from TPC-H
explain (costs off) select s_acctbal, s_name, n_name, p_partkey, p_mfgr, s_address, s_phone, s_comment
from part, supplier, partsupp, nation, region
where
p_partkey = ps_partkey
and s_suppkey = ps_suppkey
and p_size = 36
and p_type like '%BRASS'
and s_nationkey = n_nationkey
and n_regionkey = r_regionkey
and r_name = 'AMERICA'
and ps_supplycost = (
select
min(ps_supplycost)
from partsupp, supplier, nation, region
where
p_partkey = ps_partkey
and s_suppkey = ps_suppkey
and s_nationkey = n_nationkey
and n_regionkey = r_regionkey
and r_name = 'AMERICA'
)
order by s_acctbal desc, n_name, s_name, p_partkey
LIMIT 100;
QUERY PLAN
----------------------------------------------------------------------------------------------------------
Limit
-> Sort
Sort Key: supplier.s_acctbal DESC, nation.n_name, supplier.s_name, part.p_partkey
-> Merge Join
Merge Cond: (part.p_partkey = partsupp.ps_partkey)
Join Filter: (partsupp.ps_supplycost = (SubPlan 1))
-> Gather Merge
Workers Planned: 4
-> Parallel Index Scan using <strong>part_pkey</strong> on part
Filter: (((p_type)::text ~~ '%BRASS'::text) AND (p_size = 36))
-> Materialize
-> Sort
Sort Key: partsupp.ps_partkey
-> Nested Loop
-> Nested Loop
Join Filter: (nation.n_regionkey = region.r_regionkey)
-> Seq Scan on region
Filter: (r_name = 'AMERICA'::bpchar)
-> Hash Join
Hash Cond: (supplier.s_nationkey = nation.n_nationkey)
-> Seq Scan on supplier
-> Hash
-> Seq Scan on nation
-> Index Scan using idx_partsupp_suppkey on partsupp
Index Cond: (ps_suppkey = supplier.s_suppkey)
SubPlan 1
-> Aggregate
-> Nested Loop
Join Filter: (nation_1.n_regionkey = region_1.r_regionkey)
-> Seq Scan on region region_1
Filter: (r_name = 'AMERICA'::bpchar)
-> Nested Loop
-> Nested Loop
-> Index Scan using idx_partsupp_partkey on partsupp partsupp_1
Index Cond: (part.p_partkey = ps_partkey)
-> Index Scan using supplier_pkey on supplier supplier_1
Index Cond: (s_suppkey = partsupp_1.ps_suppkey)
-> Index Scan using nation_pkey on nation nation_1
Index Cond: (n_nationkey = supplier_1.s_nationkey)
Nodi ya "Unganisha Jiunge" iko juu ya "Kusanya Unganisha". Kwa hivyo kuunganisha haitumii usindikaji sambamba. Lakini nodi ya "Sambamba Index Scan" bado inasaidia na sehemu part_pkey.
Uunganisho kwa sehemu
Katika PostgreSQL 11 kuunganishwa kwa sehemu imezimwa kwa chaguo-msingi: ina upangaji wa gharama kubwa sana. Jedwali zilizo na ugawaji sawa zinaweza kuunganishwa kwa kizigeu. Kwa njia hii Postgres itatumia meza ndogo za hashi. Kila uunganisho wa sehemu unaweza kuwa sambamba.
tpch=# set enable_partitionwise_join=t;
tpch=# explain (costs off) select * from prt1 t1, prt2 t2
where t1.a = t2.b and t1.b = 0 and t2.b between 0 and 10000;
QUERY PLAN
---------------------------------------------------
Append
-> Hash Join
Hash Cond: (t2.b = t1.a)
-> Seq Scan on prt2_p1 t2
Filter: ((b >= 0) AND (b <= 10000))
-> Hash
-> Seq Scan on prt1_p1 t1
Filter: (b = 0)
-> Hash Join
Hash Cond: (t2_1.b = t1_1.a)
-> Seq Scan on prt2_p2 t2_1
Filter: ((b >= 0) AND (b <= 10000))
-> Hash
-> Seq Scan on prt1_p2 t1_1
Filter: (b = 0)
tpch=# set parallel_setup_cost = 1;
tpch=# set parallel_tuple_cost = 0.01;
tpch=# explain (costs off) select * from prt1 t1, prt2 t2
where t1.a = t2.b and t1.b = 0 and t2.b between 0 and 10000;
QUERY PLAN
-----------------------------------------------------------
Gather
Workers Planned: 4
-> Parallel Append
-> Parallel Hash Join
Hash Cond: (t2_1.b = t1_1.a)
-> Parallel Seq Scan on prt2_p2 t2_1
Filter: ((b >= 0) AND (b <= 10000))
-> Parallel Hash
-> Parallel Seq Scan on prt1_p2 t1_1
Filter: (b = 0)
-> Parallel Hash Join
Hash Cond: (t2.b = t1.a)
-> Parallel Seq Scan on prt2_p1 t2
Filter: ((b >= 0) AND (b <= 10000))
-> Parallel Hash
-> Parallel Seq Scan on prt1_p1 t1
Filter: (b = 0)
Jambo kuu ni kwamba uunganisho katika sehemu ni sambamba tu ikiwa sehemu hizi ni kubwa za kutosha.
Nyongeza Sambamba
Nyongeza Sambamba inaweza kutumika badala ya vizuizi tofauti katika mtiririko tofauti wa kazi. Hii kwa kawaida hutokea kwa hoja za UNION ALL. Ubaya ni usawa mdogo, kwa sababu kila mchakato wa mfanyakazi huchakata ombi 1 pekee.
Kuna michakato 2 ya wafanyikazi inayoendesha hapa, ingawa 4 imewezeshwa.
tpch=# explain (costs off) select sum(l_quantity) as sum_qty from lineitem where l_shipdate <= date '1998-12-01' - interval '105' day union all select sum(l_quantity) as sum_qty from lineitem where l_shipdate <= date '2000-12-01' - interval '105' day;
QUERY PLAN
------------------------------------------------------------------------------------------------
Gather
Workers Planned: 2
-> Parallel Append
-> Aggregate
-> Seq Scan on lineitem
Filter: (l_shipdate <= '2000-08-18 00:00:00'::timestamp without time zone)
-> Aggregate
-> Seq Scan on lineitem lineitem_1
Filter: (l_shipdate <= '1998-08-18 00:00:00'::timestamp without time zone)
Vigezo muhimu zaidi
WORK_MEM inaweka kikomo kumbukumbu kwa kila mchakato, si hoja tu: work_mem michakato miunganisho = kumbukumbu nyingi.
max_parallel_workers_per_gather — ni wangapi wa michakato ya wafanyikazi ambayo programu ya utekelezaji itatumia kwa usindikaji sambamba kutoka kwa mpango.
max_worker_processes - hurekebisha jumla ya idadi ya michakato ya mfanyakazi kwa idadi ya cores za CPU kwenye seva.
Kuanzia toleo la 9.6, uchakataji sambamba unaweza kuboresha kwa kiasi kikubwa utendakazi wa hoja tata ambazo huchanganua safu mlalo au faharasa nyingi. Katika PostgreSQL 10, usindikaji sambamba huwezeshwa na chaguo-msingi. Kumbuka kuizima kwenye seva zilizo na mzigo mkubwa wa kazi wa OLTP. Uchanganuzi mfuatano au uchanganuzi wa faharasa hutumia rasilimali nyingi. Ikiwa hauonyeshi ripoti kwenye mkusanyiko mzima wa data, unaweza kuboresha utendaji wa hoja kwa kuongeza faharasa zinazokosekana au kutumia ugawaji sahihi.