Te tātari mahi i roto i te hoahoanga microservice: awhina me te akiaki i te Postgres FDW

Ko te hoahoanga Microservice, he rite ki nga mea katoa o tenei ao, he pai me te kino. Ko etahi o nga tukanga ka ngawari ake, ko etahi ka uaua ake. Na mo te tere o te whakarereke me te pai ake o te tauine, me mahi patunga tapu koe. Ko tetahi o enei ko te piki haere o te uaua o te tātaritanga. Mena i roto i te monolith ka taea te whakaheke i nga tātaritanga whakahaere katoa ki nga patai SQL ki te tauira tātari, na i roto i te hoahoanga ratonga maha kei ia ratonga tana ake putunga raraunga me te ahua kaore e taea te mahi (ka taea ranei?). Mo te hunga e hiahia ana ki te whakatau i te raru o te tātaritanga whakahaere i to taatau kamupene me te pehea i ako ai matou ki te noho me tenei otinga - nau mai.

Te tātari mahi i roto i te hoahoanga microservice: awhina me te akiaki i te Postgres FDW
Ko Pavel Sivash toku ingoa, kei DomClick kei te mahi ahau i roto i te roopu e mau ana te whare putunga raraunga tātari. Ko te tikanga, ka taea te whakarōpūhia a maatau mahi hei miihini raraunga, engari, ko te tikanga, he whanui ake te whānuitanga o nga mahi. He paerewa ETL/ELT mo te hangarau raraunga, tautoko me te urutau o nga taputapu mo te tātari raraunga me te whakawhanaketanga o au ake taputapu. Ina koa, mo nga purongo whakahaere, i whakatau matou ki te "whakaahua" kei a matou he monolith me te hoatu ki nga kaitätari he paparangi raraunga ka mau nga raraunga katoa e hiahiatia ana e ratou.

I te nuinga o te waa, i whakaarohia e matou nga whiringa rereke. I taea te hanga i tetahi putunga putunga - i whakamatau ano matou, engari, ki te pono, kaore i taea e matou te whakakotahi i nga huringa i nga wa katoa o te arorau me te mahi puhoi o te hanga putunga me te whakarereketanga (mehemea kua angitu tetahi. , tuhia ki nga korero me pehea). I taea te korero ki nga kaitätari: "E tama ma, akohia te python ka haere ki nga tauira taatai," engari he whakaritenga taapiri tenei mo te whakauru, me te mea me karohia tenei mehemea ka taea. I whakatau matou ki te ngana ki te whakamahi hangarau FDW (Foreign Data Wrapper): ko te tikanga, he dblink paerewa tenei, kei roto i te paerewa SQL, engari me tana ake atanga tino watea. I runga i tera, i hanga e matou he otinga, i te mutunga ka mau, ka tau ki runga. Ko ona taipitopito ko te kaupapa o te tuhinga motuhake, me te neke atu pea i te kotahi, i te mea e hiahia ana ahau ki te korero mo te maha: mai i te tukutahi i nga mahere papaarangi ki te uru atu ki te mana whakahaere me te whakakore i nga raraunga whaiaro. He mea tika ano kia rahuitia ko tenei otinga ehara i te whakakapinga mo nga papaaaarangi tātari me nga putunga putunga, engari he raru motuhake anake.

I te taumata o runga ka penei te ahua:

Te tātari mahi i roto i te hoahoanga microservice: awhina me te akiaki i te Postgres FDW
He pātengi raraunga PostgreSQL ka taea e nga kaiwhakamahi te penapena i o raatau raraunga mahi, ko te mea nui, ko nga tauira tātari o nga ratonga katoa e hono ana ki tenei papaarangi ma te FDW. Ma tenei ka taea te tuhi patai ki te maha o nga papaa raraunga, kaore he aha he aha: PostgreSQL, MySQL, MongoDB, tetahi atu mea ranei (kōnae, API, mena karekau he takai pai, ka taea e koe te tuhi i a koe ake). Ana, he ahua pai nga mea katoa! Kei te wehe tatou?

Mena he tere me te ngawari te mutunga o nga mea katoa, tera pea, kaore he tuhinga.

He mea nui kia maarama me pehea te tono a Postgres ki nga kaitoro mamao. He ahua arorau tenei, engari he maha nga wa kaore nga tangata e aro ki a ia: Ka wehewehea te tono a Postgres ki etahi waahanga ka mahia takitahitia i runga i nga kaitoro mamao, ka kohi i enei raraunga, ka mahia ano nga tatauranga whakamutunga, na reira ko te tere o te mahi uiui ka tino whakawhirinaki ki runga. pehea te tuhituhinga. Me tohu ano hoki: ka tae mai nga raraunga mai i te kaimau mamao, kua kore he tohu, kaore he mea hei awhina i te kaihoroi, no reira, ko tatou anake te awhina me te tohutohu ki a ia. A koinei tonu taku e hiahia ana ki te korero mo etahi atu korero.

He patai ngawari me tetahi mahere

Hei whakaatu me pehea te patai a Postgres ki te 6 miriona rarangi rarangi i runga i te tūmau mamao, me titiro ki tetahi mahere ngawari.

explain analyze verbose  
SELECT count(1)
FROM fdw_schema.table;

Aggregate  (cost=418383.23..418383.24 rows=1 width=8) (actual time=3857.198..3857.198 rows=1 loops=1)
  Output: count(1)
  ->  Foreign Scan on fdw_schema."table"  (cost=100.00..402376.14 rows=6402838 width=0) (actual time=4.874..3256.511 rows=6406868 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Remote SQL: SELECT NULL FROM fdw_schema.table
Planning time: 0.986 ms
Execution time: 3857.436 ms

Ma te whakamahi i te korero VERBOSE ka taea e tatou te kite i te patai ka tukuna ki te tūmau mamao me nga hua ka riro mai mo te tukatuka ano (Raina RemoteSQL).

Me haere ake ka taapirihia etahi whiriwhiringa ki ta maatau tono: kotahi mo boolean mara, kotahi i te takanga waitohuwā i roto i te waahi me te kotahi ma jsonb.

explain analyze verbose
SELECT count(1)
FROM fdw_schema.table 
WHERE is_active is True
AND created_dt BETWEEN CURRENT_DATE - INTERVAL '7 month' 
AND CURRENT_DATE - INTERVAL '6 month'
AND meta->>'source' = 'test';

Aggregate  (cost=577487.69..577487.70 rows=1 width=8) (actual time=27473.818..25473.819 rows=1 loops=1)
  Output: count(1)
  ->  Foreign Scan on fdw_schema."table"  (cost=100.00..577469.21 rows=7390 width=0) (actual time=31.369..25372.466 rows=1360025 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Filter: (("table".is_active IS TRUE) AND (("table".meta ->> 'source'::text) = 'test'::text) AND ("table".created_dt >= (('now'::cstring)::date - '7 mons'::interval)) AND ("table".created_dt <= ((('now'::cstring)::date)::timestamp with time zone - '6 mons'::interval)))
        Rows Removed by Filter: 5046843
        Remote SQL: SELECT created_dt, is_active, meta FROM fdw_schema.table
Planning time: 0.665 ms
Execution time: 27474.118 ms

Koinei te waahi me aro koe ki te tuhi patai. Ko nga whiriwhiringa kaore i whakawhitia ki te tūmau mamao, ko te tikanga kia mahia, ka toia e Postgres nga rarangi e 6 miriona kia taea ai te tātari i te rohe (Tatari rarangi) me te mahi whakahiato. Ko te mea nui ki te angitu ko te tuhi patai kia whakawhitia nga whiriwhiringa ki te miihini mamao, ka whiwhi me te whakahiato i nga rarangi e tika ana.

Ko te tahi booleanshit

Ma nga mara boolean he ngawari nga mea katoa. I roto i te tono tuatahi, na te kaiwhakahaere te raruraru is. Mena ka whakakapi koe ki te =, ka whiwhi tatou i te hua e whai ake nei:

explain analyze verbose
SELECT count(1)
FROM fdw_schema.table
WHERE is_active = True
AND created_dt BETWEEN CURRENT_DATE - INTERVAL '7 month' 
AND CURRENT_DATE - INTERVAL '6 month'
AND meta->>'source' = 'test';

Aggregate  (cost=508010.14..508010.15 rows=1 width=8) (actual time=19064.314..19064.314 rows=1 loops=1)
  Output: count(1)
  ->  Foreign Scan on fdw_schema."table"  (cost=100.00..507988.44 rows=8679 width=0) (actual time=33.035..18951.278 rows=1360025 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Filter: ((("table".meta ->> 'source'::text) = 'test'::text) AND ("table".created_dt >= (('now'::cstring)::date - '7 mons'::interval)) AND ("table".created_dt <= ((('now'::cstring)::date)::timestamp with time zone - '6 mons'::interval)))
        Rows Removed by Filter: 3567989
        Remote SQL: SELECT created_dt, meta FROM fdw_schema.table WHERE (is_active)
Planning time: 0.834 ms
Execution time: 19064.534 ms

Ka kite koe, ka rere te tātari ki te tūmau mamao, ka whakahekehia te wa mahi mai i te 27 ki te 19 hēkona.

He mea tika kia mohio ko te kaiwhakahaere is rerekē i te kaiwhakahaere = na te mea ka taea te mahi me te uara Null. Ko te tikanga tera ehara i te Pono ka waiho nga uara Hete me te Null i roto i te tātari, engari != Tika ka waiho he uara Heke anake. Na reira, ka whakakapi i te kaiwhakahaere ko kore kia rua nga tikanga me te kaiwhakahaere OR me tuku ki te tātari, hei tauira, WHERE (col != True) RĀNEI (he null te col).

Kua mahi matou mo te boolean, me haere tonu. Inaianei, me whakahoki ano te tātari Boolean ki tona ahua taketake kia taea ai te whakaaro takitahi ki te paanga o etahi atu huringa.

timestamptz? hz

I te nuinga o te waa, me whakamatau koe me pehea te tuhi tika i tetahi tono e uru ana ki nga kaitoro mamao, katahi ka rapu whakamarama he aha i puta ai tenei. He iti noa nga korero mo tenei ka kitea i runga i te Ipurangi. Na, i roto i nga whakamatautau ka kitea e matou ka rere te tātari ra kua whakaritea ki te tūmau mamao me te pupuhi, engari ka hiahia matou ki te tautuhi i te ra, hei tauira, inaianei() CURRENT_DATE ranei, kare e puta. I roto i ta maatau tauira, i taapirihia he tātari kia mau ai nga raraunga i roto i te pou hanga_at mo te 1 marama o mua (WAWAI CURRENT_DATE - INTERVAL '7 marama' ME CURRENT_DATE - INTERVAL '6 marama'). He aha ta matou mahi i tenei keehi?

explain analyze verbose
SELECT count(1)
FROM fdw_schema.table 
WHERE is_active is True
AND created_dt >= (SELECT CURRENT_DATE::timestamptz - INTERVAL '7 month') 
AND created_dt <(SELECT CURRENT_DATE::timestamptz - INTERVAL '6 month')
AND meta->>'source' = 'test';

Aggregate  (cost=306875.17..306875.18 rows=1 width=8) (actual time=4789.114..4789.115 rows=1 loops=1)
  Output: count(1)
  InitPlan 1 (returns $0)
    ->  Result  (cost=0.00..0.02 rows=1 width=8) (actual time=0.007..0.008 rows=1 loops=1)
          Output: ((('now'::cstring)::date)::timestamp with time zone - '7 mons'::interval)
  InitPlan 2 (returns $1)
    ->  Result  (cost=0.00..0.02 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=1)
          Output: ((('now'::cstring)::date)::timestamp with time zone - '6 mons'::interval)
  ->  Foreign Scan on fdw_schema."table"  (cost=100.02..306874.86 rows=105 width=0) (actual time=23.475..4681.419 rows=1360025 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Filter: (("table".is_active IS TRUE) AND (("table".meta ->> 'source'::text) = 'test'::text))
        Rows Removed by Filter: 76934
        Remote SQL: SELECT is_active, meta FROM fdw_schema.table WHERE ((created_dt >= $1::timestamp with time zone)) AND ((created_dt < $2::timestamp with time zone))
Planning time: 0.703 ms
Execution time: 4789.379 ms

I kii atu matou ki te kaiwhakamahere kia tatau i te ra i roto i te patai i mua atu ka tuku i te taurangi kua rite ki te tātari. Na tenei tohu i homai he hua pai, tata ki te ono nga wa tere ake te tono!

Ano, he mea nui kia tupato i konei: ko te momo raraunga kei roto i te patai kia rite ki tera o te mara e taraihia ana e matou, ki te kore ka whakatauhia e te kaiwhakatakoto whakaaro na te mea he rereke nga momo, me matua whiwhi katoa. te raraunga me te tātari i te rohe.

Me whakahoki te tātari ra ki tona uara taketake.

Freddy vs. Jsonb

I te nuinga o te waa, ko nga apure Boolean me nga ra kua tere ake ta matou patai, engari kotahi ano te momo raraunga e toe ana. Ko te pakanga ki te tātari, kia pono, kaore ano kia mutu, ahakoa he angitu kei konei hoki. Na, koinei te huarahi i taea ai e matou te whakawhiti i te tātari jsonb mara ki te tūmau mamao.

explain analyze verbose
SELECT count(1)
FROM fdw_schema.table 
WHERE is_active is True
AND created_dt BETWEEN CURRENT_DATE - INTERVAL '7 month' 
AND CURRENT_DATE - INTERVAL '6 month'
AND meta @> '{"source":"test"}'::jsonb;

Aggregate  (cost=245463.60..245463.61 rows=1 width=8) (actual time=6727.589..6727.590 rows=1 loops=1)
  Output: count(1)
  ->  Foreign Scan on fdw_schema."table"  (cost=1100.00..245459.90 rows=1478 width=0) (actual time=16.213..6634.794 rows=1360025 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Filter: (("table".is_active IS TRUE) AND ("table".created_dt >= (('now'::cstring)::date - '7 mons'::interval)) AND ("table".created_dt <= ((('now'::cstring)::date)::timestamp with time zone - '6 mons'::interval)))
        Rows Removed by Filter: 619961
        Remote SQL: SELECT created_dt, is_active FROM fdw_schema.table WHERE ((meta @> '{"source": "test"}'::jsonb))
Planning time: 0.747 ms
Execution time: 6727.815 ms

Engari ki te tātari i nga kaiwhakahaere, me whakamahi koe i te aroaro o tetahi kaiwhakahaere jsonb i roto i te rereke. 7 hēkona hei utu mo te taketake 29. I tenei wa koinei anake te whiringa angitu mo te tuku whiriwhiringa ma jsonb ki te tūmau mamao, engari i konei he mea nui kia whai whakaaro ki tetahi herenga: kei te whakamahi matou i te putanga 9.6 o te putunga raraunga, engari hei te mutunga o Paenga-whawha ka whakamahere matou ki te whakaoti i nga whakamatautau whakamutunga ka neke ki te putanga 12. Ina whakahou ana matou, ka tuhia e matou he pehea te paanga, na te mea he maha nga huringa he nui te tumanako: json_path, whanonga CTE hou, pana ki raro (mai i te putanga 10). Ka tino hiahia ahau ki te whakamatau wawe.

Whakaoti ia

I whakamatauria e maatau nga huringa e pa ana ki te tere tono takitahi. Kia kite tatou i te mea ka puta ina tuhia tika nga whiriwhiringa e toru.

explain analyze verbose
SELECT count(1)
FROM fdw_schema.table 
WHERE is_active = True
AND created_dt >= (SELECT CURRENT_DATE::timestamptz - INTERVAL '7 month') 
AND created_dt <(SELECT CURRENT_DATE::timestamptz - INTERVAL '6 month')
AND meta @> '{"source":"test"}'::jsonb;

Aggregate  (cost=322041.51..322041.52 rows=1 width=8) (actual time=2278.867..2278.867 rows=1 loops=1)
  Output: count(1)
  InitPlan 1 (returns $0)
    ->  Result  (cost=0.00..0.02 rows=1 width=8) (actual time=0.010..0.010 rows=1 loops=1)
          Output: ((('now'::cstring)::date)::timestamp with time zone - '7 mons'::interval)
  InitPlan 2 (returns $1)
    ->  Result  (cost=0.00..0.02 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=1)
          Output: ((('now'::cstring)::date)::timestamp with time zone - '6 mons'::interval)
  ->  Foreign Scan on fdw_schema."table"  (cost=100.02..322041.41 rows=25 width=0) (actual time=8.597..2153.809 rows=1360025 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Remote SQL: SELECT NULL FROM fdw_schema.table WHERE (is_active) AND ((created_dt >= $1::timestamp with time zone)) AND ((created_dt < $2::timestamp with time zone)) AND ((meta @> '{"source": "test"}'::jsonb))
Planning time: 0.820 ms
Execution time: 2279.087 ms

Ae, he uaua ake te ahua o te tono, he utu takoha tenei, engari ko te tere o te mahi he 2 hēkona, neke atu i te 10 nga wa tere ake! A kei te korero matou mo tetahi patai ngawari ki tetahi huinga raraunga iti. I runga i nga tono pono, kua piki ake ki te maha nga rau nga wa.

Hei whakarāpopototanga: ki te whakamahi koe i te PostgreSQL me te FDW, tirohia i nga wa katoa ka tukuna nga whiriwhiringa katoa ki te tūmau mamao, a ka koa koe... I te iti rawa kia uru atu koe ki waenga i nga ripanga mai i nga tuunga rereke. Engari he korero tera mo tetahi atu tuhinga.

Mauruuru koe mo to aro! E pai ana ahau ki te whakarongo ki nga paatai, nga korero, me nga korero mo o wheako i roto i nga korero.

Source: will.com

Tāpiri i te kōrero