Ukuhlaziya okusebenzayo ekwakhiweni kwe-microservice: usizo kanye nokwazisa i-Postgres FDW

I-Microservice architecture, njengayo yonke into kulo mhlaba, inezinzuzo nezingozi zayo. Ezinye izinqubo ziba lula ngakho, ezinye zibe nzima nakakhulu. Futhi ngenxa yejubane loshintsho kanye nokulinganisa okungcono, udinga ukuzidela. Enye yazo ukukhula kobunzima bezibalo. Uma ku-monolith zonke izibalo zokusebenza zingancishiswa zibe imibuzo ye-SQL ku-replica yokuhlaziya, khona-ke ekwakhiweni kwe-multiservice isevisi ngayinye ine-database yayo futhi kubonakala sengathi umbuzo owodwa awukwazi ukwenziwa (noma mhlawumbe ungakwenza?). Kulabo abanentshisekelo yokuthi siyixazulule kanjani inkinga yokuhlaziya ukusebenza enkampanini yethu nokuthi sifunde kanjani ukuphila nalesi sixazululo - wamukelekile.

Ukuhlaziya okusebenzayo ekwakhiweni kwe-microservice: usizo kanye nokwazisa i-Postgres FDW
Igama lami ngingu-Pavel Sivash, kwa-DomClick ngisebenza eqenjini elinesibopho sokugcina inqolobane yedatha yokuhlaziya. Ngokwejwayelekile, imisebenzi yethu ingahlukaniswa njengobunjiniyela bedatha, kodwa, empeleni, ububanzi bemisebenzi bubanzi kakhulu. Kukhona indinganiso ye-ETL/ELT yobunjiniyela bedatha, ukusekela nokujwayela amathuluzi okuhlaziya idatha nokuthuthukisa amathuluzi akho. Ikakhulukazi, ngokubika kokusebenza, sinqume "ukwenza sengathi" sine-monolith futhi sinikeze abahlaziyi isizindalwazi esisodwa esizoqukatha yonke idatha abayidingayo.

Ngokuvamile, sicabangele izinketho ezahlukene. Kube nokwenzeka ukwakha inqolobane egcwele ngokugcwele - saze sazama, kodwa, uma sikhuluma iqiniso, asikwazanga ukuhlanganisa izinguquko ezivamile kumqondo nenqubo ehamba kancane yokwakha indawo yokugcina nokwenza izinguquko kuyo (uma othile ephumelele. , bhala emazwaneni ukuthi kanjani). Kwakungenzeka ukutshela abahlaziyi: "Guys, fundani i-python futhi niye kuma-analytical replicas," kodwa lokhu kuyisidingo esengeziwe sokuqasha, futhi kubonakala sengathi lokhu kufanele kugwenywe uma kungenzeka. Sinqume ukuzama ukusebenzisa ubuchwepheshe be-FDW (Foreign Data Wrapper): empeleni, lena i-dblink ejwayelekile, esezingeni le-SQL, kodwa enokuxhumana kwayo okulula kakhulu. Ngokusekelwe kulo, senze isixazululo, esagcina sibambekile, futhi sazinza kuso. Imininingwane yayo iyisihloko se-athikili ehlukile, futhi mhlawumbe engaphezu kweyodwa, njengoba ngifuna ukukhuluma okuningi: kusukela ekuvumelaniseni izikimu zedathabhesi ukufinyelela ekulawuleni nasekuhlukaniseni idatha yomuntu siqu. Kuyadingeka futhi ukubhukha ukuthi lesi sixazululo asikhona ukumiselela isizindalwazi sangempela sokuhlaziya namakhosombe; sixazulula inkinga ethile kuphela.

Ezingeni eliphezulu kubonakala kanje:

Ukuhlaziya okusebenzayo ekwakhiweni kwe-microservice: usizo kanye nokwazisa i-Postgres FDW
Kukhona isizindalwazi se-PostgreSQL lapho abasebenzisi bengagcina khona idatha yabo yomsebenzi, futhi okubaluleke kakhulu, ama-analytical replicas azo zonke izinsiza axhunywe kule database nge-FDW. Lokhu kwenza kube lula ukubhala umbuzo kuzinqolobane eziningana, futhi akunandaba ukuthi kuyini: I-PostgreSQL, i-MySQL, i-MongoDB noma enye into (ifayela, i-API, uma kungazelelwe kungabikho ukugoqa okufanelekile, ungabhala eyakho). Hhayi-ke, konke kubonakala kukuhle! Siyahlukana?

Uma konke kuphelile ngokushesha futhi kalula, khona-ke, mhlawumbe, bekungeke kube khona isihloko.

Kubalulekile ukucacisa ukuthi i-Postgres icubungula kanjani izicelo kumaseva akude. Lokhu kubonakala kunengqondo, kodwa ngokuvamile abantu abakunaki: I-Postgres ihlukanisa isicelo sibe izingxenye ezisebenza ngokuzimela kumaseva akude, iqoqa le datha, futhi yenze izibalo zokugcina ngokwayo, ngakho isivinini sombuzo sizoncika kakhulu ukuthi kulotshwe kanjani. Kufanele futhi kuqashelwe: lapho idatha ifika isuka kuseva ekude, ayisekho izinkomba, akukho lutho oluzosiza umhleli, ngakho-ke, thina kuphela esingakwazi ukumsiza nokumeluleka. Futhi yilokhu kanye engifuna ukukhuluma ngakho ngokuningiliziwe.

Umbuzo olula kanye nohlelo nawo

Ukukhombisa ukuthi i-Postgres ibuza kanjani ithebula lemigqa eyizigidi ezingu-6 kuseva ekude, ake sibheke icebo elilula.

explain analyze verbose  
SELECT count(1)
FROM fdw_schema.table;

Aggregate  (cost=418383.23..418383.24 rows=1 width=8) (actual time=3857.198..3857.198 rows=1 loops=1)
  Output: count(1)
  ->  Foreign Scan on fdw_schema."table"  (cost=100.00..402376.14 rows=6402838 width=0) (actual time=4.874..3256.511 rows=6406868 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Remote SQL: SELECT NULL FROM fdw_schema.table
Planning time: 0.986 ms
Execution time: 3857.436 ms

Ukusebenzisa isitatimende se-VERBOSE kusivumela ukuthi sibone umbuzo ozothunyelwa kuseva ekude kanye nemiphumela esizoyithola ukuze siqhubeke nokucutshungulwa (umugqa we-RemoteSQL).

Asiqhubeke kancane futhi sengeze izihlungi ezimbalwa esicelweni sethu: esisodwa se i-boolean inkambu, okukodwa nokwenzeka isitembu sesikhathi phakathi nenkathi ngayinye jsonb.

explain analyze verbose
SELECT count(1)
FROM fdw_schema.table 
WHERE is_active is True
AND created_dt BETWEEN CURRENT_DATE - INTERVAL '7 month' 
AND CURRENT_DATE - INTERVAL '6 month'
AND meta->>'source' = 'test';

Aggregate  (cost=577487.69..577487.70 rows=1 width=8) (actual time=27473.818..25473.819 rows=1 loops=1)
  Output: count(1)
  ->  Foreign Scan on fdw_schema."table"  (cost=100.00..577469.21 rows=7390 width=0) (actual time=31.369..25372.466 rows=1360025 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Filter: (("table".is_active IS TRUE) AND (("table".meta ->> 'source'::text) = 'test'::text) AND ("table".created_dt >= (('now'::cstring)::date - '7 mons'::interval)) AND ("table".created_dt <= ((('now'::cstring)::date)::timestamp with time zone - '6 mons'::interval)))
        Rows Removed by Filter: 5046843
        Remote SQL: SELECT created_dt, is_active, meta FROM fdw_schema.table
Planning time: 0.665 ms
Execution time: 27474.118 ms

Kulapho iphuzu okumele ulinake uma ubhala imibuzo lilele khona. Izihlungi azizange zidluliselwe kuseva ekude, okusho ukuthi ukuze ikwenze, i-Postgres ikhipha yonke imigqa eyizigidi ezingu-6 ukuze bese ihlunga endaweni (Hlunga umugqa) futhi yenza ukuhlanganisa. Isihluthulelo sempumelelo ukubhala umbuzo ukuze izihlungi zidluliselwe emshinini oqhelile, futhi sithola futhi sihlanganise imigqa edingekayo kuphela.

Lokho kungamanga athile

Ngezinkambu ze-boolean yonke into ilula. Esicelweni sokuqala, inkinga ibingenxa yomsebenzisi is. Uma uyishintsha ngokuthi =, bese sithola umphumela olandelayo:

explain analyze verbose
SELECT count(1)
FROM fdw_schema.table
WHERE is_active = True
AND created_dt BETWEEN CURRENT_DATE - INTERVAL '7 month' 
AND CURRENT_DATE - INTERVAL '6 month'
AND meta->>'source' = 'test';

Aggregate  (cost=508010.14..508010.15 rows=1 width=8) (actual time=19064.314..19064.314 rows=1 loops=1)
  Output: count(1)
  ->  Foreign Scan on fdw_schema."table"  (cost=100.00..507988.44 rows=8679 width=0) (actual time=33.035..18951.278 rows=1360025 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Filter: ((("table".meta ->> 'source'::text) = 'test'::text) AND ("table".created_dt >= (('now'::cstring)::date - '7 mons'::interval)) AND ("table".created_dt <= ((('now'::cstring)::date)::timestamp with time zone - '6 mons'::interval)))
        Rows Removed by Filter: 3567989
        Remote SQL: SELECT created_dt, meta FROM fdw_schema.table WHERE (is_active)
Planning time: 0.834 ms
Execution time: 19064.534 ms

Njengoba ubona, isihlungi sandizela kuseva ekude, futhi isikhathi sokwenza sancishiswa sisuka kumasekhondi angama-27 siye kwayi-19.

Kuyaphawuleka ukuthi opharetha is ehlukile ku-opharetha = ngoba ingasebenza ngenani le-Null. Kusho ukuthi akulona iqiniso izoshiya amanani Angamanga futhi Ayilutho kusihlungi, kuyilapho != Iqiniso izoshiya kuphela amanani Amanga. Ngakho-ke, lapho ushintsha u-opharetha ayiyona izimo ezimbili ezino-opharetha NOMA kufanele zidluliselwe kusihlungi, isibonelo, LAPHO (col != True) NOMA (col is null).

Sibhekane ne-boolean, asiqhubeke. Okwamanje, ake sibuyisele isihlungi se-Boolean esimweni saso soqobo ukuze sicabangele ngokuzimela umthelela wezinye izinguquko.

timestamptz? hz

Ngokuvamile, ngokuvamile kufanele uzame ukuthi ungabhala kanjani kahle isicelo esifaka amaseva akude, bese ubheka incazelo yokuthi kungani lokhu kwenzeka. Ulwazi oluncane kakhulu mayelana nalokhu lungatholakala ku-inthanethi. Ngakho-ke, ekuhloleni sithole ukuthi isihlungi sedethi egxilile sindizela kuseva ekude ngokuqhuqha, kodwa uma sifuna ukusetha idethi ngamandla, isibonelo, manje() noma CURRENT_DATE, lokhu akwenzeki. Esibonelweni sethu, sengeze isihlungi ukuze ikholomu ye-created_at iqukethe idatha yenyanga engu-1 ncamashi esikhathini esidlule (PHAKATHI KWE-CURRENT_DATE - INTERVAL 'inyanga engu-7' KANYE NO-CURRENT_DATE - INTERVAL 'izinyanga ezingu-6'). Senzeni kuleli cala?

explain analyze verbose
SELECT count(1)
FROM fdw_schema.table 
WHERE is_active is True
AND created_dt >= (SELECT CURRENT_DATE::timestamptz - INTERVAL '7 month') 
AND created_dt <(SELECT CURRENT_DATE::timestamptz - INTERVAL '6 month')
AND meta->>'source' = 'test';

Aggregate  (cost=306875.17..306875.18 rows=1 width=8) (actual time=4789.114..4789.115 rows=1 loops=1)
  Output: count(1)
  InitPlan 1 (returns $0)
    ->  Result  (cost=0.00..0.02 rows=1 width=8) (actual time=0.007..0.008 rows=1 loops=1)
          Output: ((('now'::cstring)::date)::timestamp with time zone - '7 mons'::interval)
  InitPlan 2 (returns $1)
    ->  Result  (cost=0.00..0.02 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=1)
          Output: ((('now'::cstring)::date)::timestamp with time zone - '6 mons'::interval)
  ->  Foreign Scan on fdw_schema."table"  (cost=100.02..306874.86 rows=105 width=0) (actual time=23.475..4681.419 rows=1360025 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Filter: (("table".is_active IS TRUE) AND (("table".meta ->> 'source'::text) = 'test'::text))
        Rows Removed by Filter: 76934
        Remote SQL: SELECT is_active, meta FROM fdw_schema.table WHERE ((created_dt >= $1::timestamp with time zone)) AND ((created_dt < $2::timestamp with time zone))
Planning time: 0.703 ms
Execution time: 4789.379 ms

Sitshele umhleli ukuthi abale idethi kumbuzo omncane kusenesikhathi futhi adlulisele okuguquguqukayo osekwenziwe kusihlungi. Futhi leli cebiso lisinike umphumela omuhle kakhulu, isicelo saba ngokushesha izikhathi ezi-6!

Nalapha futhi, kubalulekile ukuqaphela lapha: uhlobo lwedatha ku-subquery kufanele lufane nalolo lwenkundla esihlunga kuyo, ngaphandle kwalokho umhleli uzonquma ukuthi njengoba izinhlobo zihlukile, kuyadingeka ukuthi kuqala uthole zonke. idatha bese uyihlunga endaweni.

Masibuyisele isihlungi sedethi enanini laso langempela.

UFreddy vs. Jsonb

Ngokuvamile, izinkambu ze-Boolean nezinsuku sezivele ziwusheshise umbuzo wethu ngokwanele, kodwa bekusele olunye uhlobo lwedatha. Impi yokuhlunga ngakho, uma ngikhuluma iqiniso, ayikapheli, nakuba ikhona impumelelo nalapha. Ngakho-ke, yile ndlela esikwazi ngayo ukudlulisa isihlungi jsonb indawo kuseva ekude.

explain analyze verbose
SELECT count(1)
FROM fdw_schema.table 
WHERE is_active is True
AND created_dt BETWEEN CURRENT_DATE - INTERVAL '7 month' 
AND CURRENT_DATE - INTERVAL '6 month'
AND meta @> '{"source":"test"}'::jsonb;

Aggregate  (cost=245463.60..245463.61 rows=1 width=8) (actual time=6727.589..6727.590 rows=1 loops=1)
  Output: count(1)
  ->  Foreign Scan on fdw_schema."table"  (cost=1100.00..245459.90 rows=1478 width=0) (actual time=16.213..6634.794 rows=1360025 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Filter: (("table".is_active IS TRUE) AND ("table".created_dt >= (('now'::cstring)::date - '7 mons'::interval)) AND ("table".created_dt <= ((('now'::cstring)::date)::timestamp with time zone - '6 mons'::interval)))
        Rows Removed by Filter: 619961
        Remote SQL: SELECT created_dt, is_active FROM fdw_schema.table WHERE ((meta @> '{"source": "test"}'::jsonb))
Planning time: 0.747 ms
Execution time: 6727.815 ms

Esikhundleni sokuhlunga ama-opharetha, kufanele usebenzise ukuba khona ko-opharetha oyedwa jsonb kwehlukile. Imizuzwana engu-7 esikhundleni seyokuqala engu-29. Kuze kube manje lena ukuphela kwenketho eyimpumelelo yokudlulisa izihlungi nge jsonb kuseva ekude, kodwa lapha kubalulekile ukucabangela umkhawulo owodwa: sisebenzisa inguqulo 9.6 yesizindalwazi, kodwa ekupheleni kuka-April sihlela ukuqedela ukuhlolwa kokugcina futhi sidlulele enguqulweni ye-12. Uma sesibuyekeze, sizobhala ngokuthi kuthinte kanjani, ngoba kunezinguquko eziningi impela okunethemba elikhulu ngazo: i-json_path, ukuziphatha okusha kwe-CTE, phushela phansi (ezikhona kusukela kunguqulo 10). Ngifuna ukuyizama maduze.

Aqedele yena

Sihlole ukuthi ushintsho ngalunye luthinte kanjani isivinini sesicelo ngazinye. Manje ake sibone ukuthi kwenzekani uma zontathu izihlungi zibhalwe kahle.

explain analyze verbose
SELECT count(1)
FROM fdw_schema.table 
WHERE is_active = True
AND created_dt >= (SELECT CURRENT_DATE::timestamptz - INTERVAL '7 month') 
AND created_dt <(SELECT CURRENT_DATE::timestamptz - INTERVAL '6 month')
AND meta @> '{"source":"test"}'::jsonb;

Aggregate  (cost=322041.51..322041.52 rows=1 width=8) (actual time=2278.867..2278.867 rows=1 loops=1)
  Output: count(1)
  InitPlan 1 (returns $0)
    ->  Result  (cost=0.00..0.02 rows=1 width=8) (actual time=0.010..0.010 rows=1 loops=1)
          Output: ((('now'::cstring)::date)::timestamp with time zone - '7 mons'::interval)
  InitPlan 2 (returns $1)
    ->  Result  (cost=0.00..0.02 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=1)
          Output: ((('now'::cstring)::date)::timestamp with time zone - '6 mons'::interval)
  ->  Foreign Scan on fdw_schema."table"  (cost=100.02..322041.41 rows=25 width=0) (actual time=8.597..2153.809 rows=1360025 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Remote SQL: SELECT NULL FROM fdw_schema.table WHERE (is_active) AND ((created_dt >= $1::timestamp with time zone)) AND ((created_dt < $2::timestamp with time zone)) AND ((meta @> '{"source": "test"}'::jsonb))
Planning time: 0.820 ms
Execution time: 2279.087 ms

Yebo, isicelo sibukeka siyinkimbinkimbi kakhulu, lokhu kuyimali ephoqelelwe, kodwa isivinini sokubulawa siyimizuzwana engu-2, okungaphezu kwezikhathi ezingu-10 ngokushesha! Futhi sikhuluma ngombuzo olula ngokumelene nesethi encane yedatha. Ezicelweni zangempela, sithole ukukhushulwa okufika izikhathi ezingamakhulu ambalwa.

Ukufingqa: uma usebenzisa i-PostgreSQL nge-FDW, hlola njalo ukuthi zonke izihlungi zithunyelwa kuseva ekude, futhi uzojabula... Okungenani uze ufike ekuhlanganiseni phakathi kwamatafula asuka eziphakelini ezihlukene. Kodwa leyo yindaba yesinye isihloko.

Ngiyabonga ukulalela kwenu! Ngingathanda ukuzwa imibuzo, ukuphawula, kanye nezindaba mayelana nomuzwa wakho kumazwana.

Source: www.habr.com

Engeza amazwana