Operational analytics in a microservice architecture: helping and prompting Postgres FDW

Microservice architecture, like everything in this world, has its pros and cons. Some processes become easier with it, others harder. For the sake of faster change and better scalability, sacrifices have to be made. One of them is more complicated analytics. In a monolith, all operational analytics can be reduced to SQL queries against an analytical replica. In a multiservice architecture each service has its own database, and it seems that a single query will not do (or maybe it will?). If you are interested in how we solved the operational analytics problem at our company and how we learned to live with that solution, welcome.

My name is Pavel Sivash. At DomClick I work in the team responsible for maintaining the analytical data warehouse. Formally, our work can be classed as data engineering, but in practice the range of tasks is much wider. There is standard data engineering ETL/ELT, support and adaptation of data analysis tools, and development of our own tools. In particular, for operational reporting we decided to "pretend" that we have a monolith and give analysts a single database that contains all the data they need.

In general, we considered several options. We could have built a full-fledged warehouse. We even tried, but, to be honest, we never managed to reconcile fairly frequent changes in business logic with the rather slow process of building a warehouse and changing it (if someone has succeeded, write in the comments how). We could have told the analysts: "Guys, learn Python and go to the analytical replicas," but that is an extra hiring requirement, and it seemed worth avoiding if possible. We decided to try FDW (Foreign Data Wrapper) technology: in essence it does the job of dblink, but it follows the SQL standard (SQL/MED) and has a much more convenient interface. On top of it we built a solution that eventually took root, and we settled on it. Its details are a topic for a separate article, perhaps more than one, because there is a lot I want to talk about: from synchronizing database schemas to access control and depersonalization of personal data. Note that this solution is not a replacement for real analytical databases and warehouses; it solves only one specific problem.

At a high level it looks like this:

There is a PostgreSQL database where users can store their working data and, most importantly, the analytical replicas of all services are connected to this database via FDW. This makes it possible to write a single query against several databases, and it does not matter what they are: PostgreSQL, MySQL, MongoDB or something else (a file, an API; if there happens to be no suitable wrapper, you can write your own). Well, everything seems great! Shall we call it a day?

If everything had ended that quickly and simply, there would probably be no article.
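For orientation, connecting one service replica through postgres_fdw might look roughly like the sketch below. It is not the setup used in this article: the server name, host, database, role names and password are hypothetical placeholders, and the remote schema name `public` is an assumption.

```sql
-- Hypothetical names throughout: orders_replica, replica.example.internal,
-- orders, analyst and readonly_user are placeholders, not taken from the article.
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- Describe the remote analytical replica.
CREATE SERVER orders_replica
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'replica.example.internal', port '5432', dbname 'orders');

-- Map a local role to a (preferably read-only) remote role.
CREATE USER MAPPING FOR analyst
    SERVER orders_replica
    OPTIONS (user 'readonly_user', password 'change-me');

-- Expose the remote tables as foreign tables in a local schema.
CREATE SCHEMA IF NOT EXISTS fdw_schema;

IMPORT FOREIGN SCHEMA public
    FROM SERVER orders_replica
    INTO fdw_schema;
```

After this, the foreign tables can be queried like local ones, which is what the examples in the rest of the article do.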

It is important to be clear about how Postgres handles queries to remote servers. It seems logical, but people often do not pay attention to it: Postgres splits the query into parts that are executed independently on the remote servers, collects the data, and performs the final computation itself, so the execution speed depends heavily on how the query is written. Also note that once the data has arrived from the remote server, it no longer has indexes and nothing can help the planner, so only we can help and prompt it. That is what I want to talk about in more detail.

A simple query and its plan

To show how Postgres queries a 6-million-row table on a remote server, let's look at a simple plan.

explain analyze verbose  
SELECT count(1)
FROM fdw_schema.table;

Aggregate  (cost=418383.23..418383.24 rows=1 width=8) (actual time=3857.198..3857.198 rows=1 loops=1)
  Output: count(1)
  ->  Foreign Scan on fdw_schema."table"  (cost=100.00..402376.14 rows=6402838 width=0) (actual time=4.874..3256.511 rows=6406868 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Remote SQL: SELECT NULL FROM fdw_schema.table
Planning time: 0.986 ms
Execution time: 3857.436 ms

The VERBOSE option lets you see the query that will be sent to the remote server, whose results we then receive for further processing (the Remote SQL line).

Let's go a little further and add several filters to our query: one on a boolean field, one on a timestamp falling within an interval, and one on jsonb.

explain analyze verbose
SELECT count(1)
FROM fdw_schema.table 
WHERE is_active is True
AND created_dt BETWEEN CURRENT_DATE - INTERVAL '7 month' 
AND CURRENT_DATE - INTERVAL '6 month'
AND meta->>'source' = 'test';

Aggregate  (cost=577487.69..577487.70 rows=1 width=8) (actual time=27473.818..25473.819 rows=1 loops=1)
  Output: count(1)
  ->  Foreign Scan on fdw_schema."table"  (cost=100.00..577469.21 rows=7390 width=0) (actual time=31.369..25372.466 rows=1360025 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Filter: (("table".is_active IS TRUE) AND (("table".meta ->> 'source'::text) = 'test'::text) AND ("table".created_dt >= (('now'::cstring)::date - '7 mons'::interval)) AND ("table".created_dt <= ((('now'::cstring)::date)::timestamp with time zone - '6 mons'::interval)))
        Rows Removed by Filter: 5046843
        Remote SQL: SELECT created_dt, is_active, meta FROM fdw_schema.table
Planning time: 0.665 ms
Execution time: 27474.118 ms

This is where the catch lies, and it is the thing to watch for when writing queries. The filters were not pushed to the remote server, which means that to execute the query Postgres pulls all 6 million rows, filters them locally (the Filter line), and only then performs the aggregation. The key to success is to write the query so that the filters are pushed to the remote machine, and we receive and aggregate only the rows we need.
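A cheap way to check this before running a heavy query is to ask for the plan only. The sketch below reuses the table from this article with a shortened WHERE clause. Without ANALYZE the query itself is not executed, so no rows are pulled from the remote server.

```sql
-- Plan only: no ANALYZE, so the query is planned but not executed.
EXPLAIN (VERBOSE)
SELECT count(1)
FROM fdw_schema.table
WHERE is_active = True;

-- What to look for in the output:
--   * conditions inside the WHERE of the "Remote SQL" line run on the remote server;
--   * conditions listed under "Filter" are applied locally, after the rows are fetched.
```

If a condition shows up under Filter rather than in Remote SQL, that is the one to rewrite.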

That's some booleanshit

With boolean fields, everything is simple. In the original query the problem was the is operator. If we replace it with =, we get the following result:

explain analyze verbose
SELECT count(1)
FROM fdw_schema.table
WHERE is_active = True
AND created_dt BETWEEN CURRENT_DATE - INTERVAL '7 month' 
AND CURRENT_DATE - INTERVAL '6 month'
AND meta->>'source' = 'test';

Aggregate  (cost=508010.14..508010.15 rows=1 width=8) (actual time=19064.314..19064.314 rows=1 loops=1)
  Output: count(1)
  ->  Foreign Scan on fdw_schema."table"  (cost=100.00..507988.44 rows=8679 width=0) (actual time=33.035..18951.278 rows=1360025 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Filter: ((("table".meta ->> 'source'::text) = 'test'::text) AND ("table".created_dt >= (('now'::cstring)::date - '7 mons'::interval)) AND ("table".created_dt <= ((('now'::cstring)::date)::timestamp with time zone - '6 mons'::interval)))
        Rows Removed by Filter: 3567989
        Remote SQL: SELECT created_dt, meta FROM fdw_schema.table WHERE (is_active)
Planning time: 0.834 ms
Execution time: 19064.534 ms

As you can see, the filter went to the remote server, and the execution time dropped from 27 to 19 seconds.

Keep in mind that the is operator differs from = in that it can work with NULL values. This means that is not True in a filter keeps both False and NULL values, whereas != True keeps only False values. So when you replace the is not operator, you have to pass two conditions joined with OR, for example WHERE (col != True) OR (col is null).
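The difference is easy to see on three literal values. This is a small self-contained illustration, not a query from the article; the column name col is just a placeholder.

```sql
-- Compare the three ways of writing the "not true" condition.
SELECT col,
       col IS NOT TRUE                AS is_not_true,
       col != TRUE                    AS not_equal_true,
       (col != TRUE) OR (col IS NULL) AS rewritten
FROM (VALUES (TRUE), (FALSE), (NULL::boolean)) AS t(col);

-- col   | is_not_true | not_equal_true | rewritten
-- true  | false       | false          | false
-- false | true        | true           | true
-- NULL  | true        | NULL           | true
```

In a WHERE clause a NULL result is treated as not true, so col != TRUE on its own silently drops the NULL row, while the rewritten form keeps it, just as IS NOT TRUE does.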

With boolean sorted out, let's move on. For now, let's return the boolean filter to its original form so that we can look at the effect of the other changes independently.

timestamptz? Who knows

In general, you often have to experiment with how to correctly write a query that involves remote servers, and only then look for an explanation of why it behaves that way. Some information on this can be found on the Internet. In our experiments we found that a filter on a fixed date flies to the remote server with a bang, but when we want to set the date dynamically, for example with now() or CURRENT_DATE, this does not happen. In our example we added a filter so that the created_dt column covers exactly one month in the past (BETWEEN CURRENT_DATE - INTERVAL '7 month' AND CURRENT_DATE - INTERVAL '6 month'). What did we do in this case?

explain analyze verbose
SELECT count(1)
FROM fdw_schema.table 
WHERE is_active is True
AND created_dt >= (SELECT CURRENT_DATE::timestamptz - INTERVAL '7 month') 
AND created_dt <(SELECT CURRENT_DATE::timestamptz - INTERVAL '6 month')
AND meta->>'source' = 'test';

Aggregate  (cost=306875.17..306875.18 rows=1 width=8) (actual time=4789.114..4789.115 rows=1 loops=1)
  Output: count(1)
  InitPlan 1 (returns $0)
    ->  Result  (cost=0.00..0.02 rows=1 width=8) (actual time=0.007..0.008 rows=1 loops=1)
          Output: ((('now'::cstring)::date)::timestamp with time zone - '7 mons'::interval)
  InitPlan 2 (returns $1)
    ->  Result  (cost=0.00..0.02 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=1)
          Output: ((('now'::cstring)::date)::timestamp with time zone - '6 mons'::interval)
  ->  Foreign Scan on fdw_schema."table"  (cost=100.02..306874.86 rows=105 width=0) (actual time=23.475..4681.419 rows=1360025 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Filter: (("table".is_active IS TRUE) AND (("table".meta ->> 'source'::text) = 'test'::text))
        Rows Removed by Filter: 76934
        Remote SQL: SELECT is_active, meta FROM fdw_schema.table WHERE ((created_dt >= $1::timestamp with time zone)) AND ((created_dt < $2::timestamp with time zone))
Planning time: 0.703 ms
Execution time: 4789.379 ms

We prompted the planner to compute the dates in advance in a subquery and pass the ready-made values to the filter. This hint gave us an excellent result: the query became almost 6 times faster!

Again, be careful here: the data type in the subquery must be the same as the type of the field we filter on. Otherwise the planner decides that, since the types differ, it must first fetch all the data and then filter it locally.
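A quick way to see why the explicit cast matters is to ask Postgres for the type of each expression. The sketch assumes created_dt is timestamp with time zone, which is what the casts in the Remote SQL line of the plan suggest.

```sql
-- Subtracting an interval from a date yields a timestamp WITHOUT time zone,
-- which does not match a timestamptz column; casting first keeps the types aligned.
SELECT pg_typeof(CURRENT_DATE - INTERVAL '7 month')              AS without_cast,
       pg_typeof(CURRENT_DATE::timestamptz - INTERVAL '7 month') AS with_cast;

-- without_cast: timestamp without time zone
-- with_cast:    timestamp with time zone
```

Running pg_typeof on the subquery expression and comparing it with the column definition is a simple check to make before relying on the filter being pushed down.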

Let's return the date filter to its original form.

Freddy vs. jsonb

In general, boolean fields and dates have already sped up our query considerably, but there is one more data type left. The battle with filtering on it is, frankly, not over yet, although there are successes here too. So, here is how we managed to push a filter on a jsonb field to the remote server.

explain analyze verbose
SELECT count(1)
FROM fdw_schema.table 
WHERE is_active is True
AND created_dt BETWEEN CURRENT_DATE - INTERVAL '7 month' 
AND CURRENT_DATE - INTERVAL '6 month'
AND meta @> '{"source":"test"}'::jsonb;

Aggregate  (cost=245463.60..245463.61 rows=1 width=8) (actual time=6727.589..6727.590 rows=1 loops=1)
  Output: count(1)
  ->  Foreign Scan on fdw_schema."table"  (cost=1100.00..245459.90 rows=1478 width=0) (actual time=16.213..6634.794 rows=1360025 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Filter: (("table".is_active IS TRUE) AND ("table".created_dt >= (('now'::cstring)::date - '7 mons'::interval)) AND ("table".created_dt <= ((('now'::cstring)::date)::timestamp with time zone - '6 mons'::interval)))
        Rows Removed by Filter: 619961
        Remote SQL: SELECT created_dt, is_active FROM fdw_schema.table WHERE ((meta @> '{"source": "test"}'::jsonb))
Planning time: 0.747 ms
Execution time: 6727.815 ms

Instead of filtering on an extracted value (->>), you need to use the operator that checks whether one jsonb contains another (@>). The result is 7 seconds instead of the original 27. So far this is the only successful way we have found to push jsonb filters to a remote server, but one limitation matters here: we are using version 9.6 of the database, and by the end of April we plan to finish the final tests and move to version 12. Once we upgrade, we will write about how it affected things, because there are many changes we have high hopes for: json_path, new CTE behavior, push down (available since version 10). I am eager to try it out.
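The two ways of writing the jsonb condition can be compared on a few literal values. This is a self-contained illustration with made-up sample documents, not data from the article.

```sql
-- Extraction-and-compare versus containment on three sample documents.
SELECT m,
       m->>'source' = 'test'           AS via_extract,
       m @> '{"source":"test"}'::jsonb AS via_contains
FROM (VALUES
        ('{"source": "test", "id": 1}'::jsonb),
        ('{"source": "other"}'::jsonb),
        ('{"id": 2}'::jsonb)
     ) AS t(m);

-- {"id": 1, "source": "test"} | true  | true
-- {"source": "other"}         | false | false
-- {"id": 2}                   | NULL  | false
```

As a plain WHERE condition both forms select the same rows here. They differ when the key is missing (NULL versus false), which matters as soon as the condition is negated, so a rewrite with NOT deserves the same care as the boolean case above.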

Wrapping up

We have looked at how each change affects query speed on its own. Now let's see what happens when all three filters are written correctly.

explain analyze verbose
SELECT count(1)
FROM fdw_schema.table 
WHERE is_active = True
AND created_dt >= (SELECT CURRENT_DATE::timestamptz - INTERVAL '7 month') 
AND created_dt <(SELECT CURRENT_DATE::timestamptz - INTERVAL '6 month')
AND meta @> '{"source":"test"}'::jsonb;

Aggregate  (cost=322041.51..322041.52 rows=1 width=8) (actual time=2278.867..2278.867 rows=1 loops=1)
  Output: count(1)
  InitPlan 1 (returns $0)
    ->  Result  (cost=0.00..0.02 rows=1 width=8) (actual time=0.010..0.010 rows=1 loops=1)
          Output: ((('now'::cstring)::date)::timestamp with time zone - '7 mons'::interval)
  InitPlan 2 (returns $1)
    ->  Result  (cost=0.00..0.02 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=1)
          Output: ((('now'::cstring)::date)::timestamp with time zone - '6 mons'::interval)
  ->  Foreign Scan on fdw_schema."table"  (cost=100.02..322041.41 rows=25 width=0) (actual time=8.597..2153.809 rows=1360025 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Remote SQL: SELECT NULL FROM fdw_schema.table WHERE (is_active) AND ((created_dt >= $1::timestamp with time zone)) AND ((created_dt < $2::timestamp with time zone)) AND ((meta @> '{"source": "test"}'::jsonb))
Planning time: 0.820 ms
Execution time: 2279.087 ms

Yes, the query looks more complicated, and that is the price we are forced to pay, but the execution time is 2 seconds, which is more than 10 times faster! And we are talking about a simple query on a relatively small data set. On real queries we got speedups of up to several hundred times.

To sum up: if you use PostgreSQL with FDW, always check that all filters are sent to the remote server, and you will be happy... At least until you get to joins between tables from different servers. But that is a story for another article.

Thank you for your attention! I would be glad to hear your questions, remarks, and stories about your own experience in the comments.

Source: www.habr.com
