Operational analytics in a microservice architecture: helping and prompting Postgres FDW

Microservice architecture, like everything in this world, has its pros and cons. Some processes become easier with it, others harder. For the sake of faster change and better scalability, you have to make sacrifices. One of them is more complicated analytics. In a monolith, all operational analytics can be reduced to SQL queries against an analytical replica. In a multiservice architecture, every service has its own database, and it seems that a single query will not do (or maybe it will?). If you are interested in how we solved the problem of operational analytics in our company and how we learned to live with that solution, welcome.

My name is Pavel Sivash. At DomClick I work in the team responsible for maintaining the analytical data warehouse. Conventionally, our work can be classified as data engineering, but in reality the range of tasks is much wider. There are the standard data engineering ETL/ELT tasks, support and adaptation of data analysis tools, and development of our own tools. In particular, for operational reporting we decided to "pretend" that we have a monolith and give analysts a single database containing all the data they need.

In general, we considered different options. We could have built a full-fledged warehouse. We even tried, but, to be honest, we could not reconcile fairly frequent changes in logic with the rather slow process of building a warehouse and changing it (if anyone has succeeded, write in the comments how). We could have told the analysts: "Guys, learn Python and go to the analytical replicas," but that is an extra hiring requirement, and it seemed this should be avoided if possible. We decided to try FDW (Foreign Data Wrapper) technology: in essence, it does the same job as the familiar dblink, but it follows the SQL standard (SQL/MED) and has a much more convenient interface. We built a solution on top of it, it stuck, and we settled on it. Its details are a topic for a separate article, and maybe more than one, because there is a lot to talk about: from schema synchronization to access control and depersonalization of personal data. It should also be noted that this solution is not a replacement for real analytical databases and warehouses; it solves only one specific problem.

At a high level, it looks like this:

There is a PostgreSQL database where users can store their working data, and, most importantly, the analytical replicas of all services are connected to this database through FDW. This makes it possible to write one query across several databases, and it does not matter what they are: PostgreSQL, MySQL, MongoDB, or something else (a file, an API; if no suitable wrapper exists, you can write your own). Well, everything is great! Shall we wrap up?
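
For readers who have not used FDW before, connecting one PostgreSQL replica might look roughly like the sketch below. This is not our actual configuration: the server name, host, database, remote schema, and credentials are placeholders invented for illustration, and only the local schema name fdw_schema is taken from the examples in this article.

```sql
-- Hypothetical setup sketch; every name except fdw_schema is a placeholder.
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- Describe the remote analytical replica of one service.
CREATE SERVER service_replica
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'replica.example.internal', port '5432', dbname 'service_db');

-- Map the local user to a read-only account on the replica.
CREATE USER MAPPING FOR CURRENT_USER
    SERVER service_replica
    OPTIONS (user 'analytics_reader', password 'change-me');

-- Expose the remote tables locally as foreign tables.
CREATE SCHEMA IF NOT EXISTS fdw_schema;

IMPORT FOREIGN SCHEMA public
    FROM SERVER service_replica
    INTO fdw_schema;
```

After this, the imported tables can be queried like ordinary local tables, which is what all the examples below do.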

If everything ended that quickly and simply, there would probably be no article.

It is important to be clear about how Postgres handles queries to remote servers. It seems logical, but people often do not pay attention to it: Postgres splits the query into parts that are executed independently on the remote servers, collects the data, and performs the final computations itself, so the execution speed depends heavily on how the query is written. It should also be noted that once data arrives from a remote server, it no longer has indexes and there is nothing to help the planner, so only we can help and guide it. That is what I want to talk about in more detail.

A simple query and its plan

To show how Postgres queries a table of 6 million rows on a remote server, let's look at a simple plan.

explain analyze verbose  
SELECT count(1)
FROM fdw_schema.table;

Aggregate  (cost=418383.23..418383.24 rows=1 width=8) (actual time=3857.198..3857.198 rows=1 loops=1)
  Output: count(1)
  ->  Foreign Scan on fdw_schema."table"  (cost=100.00..402376.14 rows=6402838 width=0) (actual time=4.874..3256.511 rows=6406868 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Remote SQL: SELECT NULL FROM fdw_schema.table
Planning time: 0.986 ms
Execution time: 3857.436 ms

The VERBOSE option lets us see the query that is sent to the remote server and whose results we receive for further processing (the Remote SQL line).

Let's go a bit further and add several filters to our query: one on a boolean field, one on a timestamp falling within an interval, and one on jsonb.

explain analyze verbose
SELECT count(1)
FROM fdw_schema.table 
WHERE is_active is True
AND created_dt BETWEEN CURRENT_DATE - INTERVAL '7 month' 
AND CURRENT_DATE - INTERVAL '6 month'
AND meta->>'source' = 'test';

Aggregate  (cost=577487.69..577487.70 rows=1 width=8) (actual time=27473.818..27473.819 rows=1 loops=1)
  Output: count(1)
  ->  Foreign Scan on fdw_schema."table"  (cost=100.00..577469.21 rows=7390 width=0) (actual time=31.369..25372.466 rows=1360025 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Filter: (("table".is_active IS TRUE) AND (("table".meta ->> 'source'::text) = 'test'::text) AND ("table".created_dt >= (('now'::cstring)::date - '7 mons'::interval)) AND ("table".created_dt <= ((('now'::cstring)::date)::timestamp with time zone - '6 mons'::interval)))
        Rows Removed by Filter: 5046843
        Remote SQL: SELECT created_dt, is_active, meta FROM fdw_schema.table
Planning time: 0.665 ms
Execution time: 27474.118 ms

This is the point that needs attention when writing queries. The filters were not pushed down to the remote server, which means that to execute the query Postgres pulls all 6 million rows, filters them locally afterwards (the Filter line), and then performs the aggregation. The key to success is to write the query so that the filters are passed to the remote machine, and we receive and aggregate only the rows we need.
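
One way to check where the filters end up without actually pulling the rows is to run EXPLAIN without ANALYZE. The sketch below reuses the query above; COSTS OFF is there only to shorten the output.

```sql
-- Plan only: the statement is not executed, so no rows are fetched
-- from the remote server.
EXPLAIN (VERBOSE, COSTS OFF)
SELECT count(1)
FROM fdw_schema.table
WHERE is_active is True
AND created_dt BETWEEN CURRENT_DATE - INTERVAL '7 month'
AND CURRENT_DATE - INTERVAL '6 month'
AND meta->>'source' = 'test';
```

In the output, conditions listed under Filter are applied locally, while conditions that appear in the WHERE clause of Remote SQL are executed on the remote server.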

That's some booleanshit

With boolean fields everything is simple. In the original query, the problem was the is operator. If we replace it with =, we get the following result:

explain analyze verbose
SELECT count(1)
FROM fdw_schema.table
WHERE is_active = True
AND created_dt BETWEEN CURRENT_DATE - INTERVAL '7 month' 
AND CURRENT_DATE - INTERVAL '6 month'
AND meta->>'source' = 'test';

Aggregate  (cost=508010.14..508010.15 rows=1 width=8) (actual time=19064.314..19064.314 rows=1 loops=1)
  Output: count(1)
  ->  Foreign Scan on fdw_schema."table"  (cost=100.00..507988.44 rows=8679 width=0) (actual time=33.035..18951.278 rows=1360025 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Filter: ((("table".meta ->> 'source'::text) = 'test'::text) AND ("table".created_dt >= (('now'::cstring)::date - '7 mons'::interval)) AND ("table".created_dt <= ((('now'::cstring)::date)::timestamp with time zone - '6 mons'::interval)))
        Rows Removed by Filter: 3567989
        Remote SQL: SELECT created_dt, meta FROM fdw_schema.table WHERE (is_active)
Planning time: 0.834 ms
Execution time: 19064.534 ms

As you can see, the filter flew off to the remote server, and the execution time dropped from 27 to 19 seconds.

It is worth noting that the is operator differs from the = operator in that it can work with the Null value. This means that is not True in a filter keeps both False and Null values, whereas != True keeps only False values. Therefore, when replacing the is not operator, you should pass two conditions joined with OR to the filter, for example, WHERE (col != True) OR (col is null).
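
The difference is easy to see on three literal values. This is a small self-contained illustration that does not touch any foreign table; the column aliases are made up for the example.

```sql
-- Compare the three ways of writing "not true" on TRUE, FALSE and NULL.
SELECT v,
       v IS NOT TRUE              AS is_not_true,
       v != TRUE                  AS ne_true,
       (v != TRUE) OR (v IS NULL) AS ne_true_or_null
FROM (VALUES (TRUE), (FALSE), (NULL::boolean)) AS t(v);

--   v     | is_not_true | ne_true | ne_true_or_null
--   true  | false       | false   | false
--   false | true        | true    | true
--   NULL  | true        | NULL    | true
```

In a WHERE clause a NULL result drops the row, so only the last column keeps the same rows as is not True.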

With boolean sorted out, let's move on. For now, let's return the boolean filter to its original form so that we can consider the effect of the other changes independently.

timestamptz? No idea

In general, you often have to find out by experiment how to write a query involving remote servers correctly, and only then look for an explanation of why it works that way. Very little information about this can be found on the Internet. In our experiments we found that a filter on a fixed date flies off to the remote server with a bang, but when we want to set the date dynamically, for example with now() or CURRENT_DATE, this does not happen. In our example, we added a filter so that the created_dt column covers exactly one month in the past (BETWEEN CURRENT_DATE - INTERVAL '7 month' AND CURRENT_DATE - INTERVAL '6 month'). What did we do in this case?

explain analyze verbose
SELECT count(1)
FROM fdw_schema.table 
WHERE is_active is True
AND created_dt >= (SELECT CURRENT_DATE::timestamptz - INTERVAL '7 month') 
AND created_dt <(SELECT CURRENT_DATE::timestamptz - INTERVAL '6 month')
AND meta->>'source' = 'test';

Aggregate  (cost=306875.17..306875.18 rows=1 width=8) (actual time=4789.114..4789.115 rows=1 loops=1)
  Output: count(1)
  InitPlan 1 (returns $0)
    ->  Result  (cost=0.00..0.02 rows=1 width=8) (actual time=0.007..0.008 rows=1 loops=1)
          Output: ((('now'::cstring)::date)::timestamp with time zone - '7 mons'::interval)
  InitPlan 2 (returns $1)
    ->  Result  (cost=0.00..0.02 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=1)
          Output: ((('now'::cstring)::date)::timestamp with time zone - '6 mons'::interval)
  ->  Foreign Scan on fdw_schema."table"  (cost=100.02..306874.86 rows=105 width=0) (actual time=23.475..4681.419 rows=1360025 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Filter: (("table".is_active IS TRUE) AND (("table".meta ->> 'source'::text) = 'test'::text))
        Rows Removed by Filter: 76934
        Remote SQL: SELECT is_active, meta FROM fdw_schema.table WHERE ((created_dt >= $1::timestamp with time zone)) AND ((created_dt < $2::timestamp with time zone))
Planning time: 0.703 ms
Execution time: 4789.379 ms

We told the planner to compute the date in advance in a subquery and pass the ready-made value to the filter. This hint gave us an excellent result: the query became almost 6 times faster!

Again, it is important to be careful here: the data type in the subquery must match the type of the field we are filtering on. Otherwise the planner will decide that, since the types differ, it must first fetch all the data and filter it locally.
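
A quick way to see why the cast matters is to compare the types of the two expressions with pg_typeof. The sketch assumes, as the Remote SQL line in the plan above suggests, that created_dt is timestamp with time zone.

```sql
-- date - interval yields a timestamp WITHOUT time zone,
-- so the explicit cast is what makes the types match.
SELECT pg_typeof(CURRENT_DATE - INTERVAL '7 month')              AS without_cast,
       pg_typeof(CURRENT_DATE::timestamptz - INTERVAL '7 month') AS with_cast;
-- without_cast: timestamp without time zone
-- with_cast:    timestamp with time zone

-- The declared type of the foreign table column can be looked up locally.
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_schema = 'fdw_schema'
  AND table_name   = 'table'
  AND column_name  = 'created_dt';
```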

Let's return the date filter to its original form.

Freddy vs. jsonb

In general, the boolean fields and dates have already sped up our query enough, but one more data type remained. The battle with filtering on it is, frankly, still not over, although there are successes here too. So, here is how we managed to pass a filter on a jsonb field to the remote server.

explain analyze verbose
SELECT count(1)
FROM fdw_schema.table 
WHERE is_active is True
AND created_dt BETWEEN CURRENT_DATE - INTERVAL '7 month' 
AND CURRENT_DATE - INTERVAL '6 month'
AND meta @> '{"source":"test"}'::jsonb;

Aggregate  (cost=245463.60..245463.61 rows=1 width=8) (actual time=6727.589..6727.590 rows=1 loops=1)
  Output: count(1)
  ->  Foreign Scan on fdw_schema."table"  (cost=1100.00..245459.90 rows=1478 width=0) (actual time=16.213..6634.794 rows=1360025 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Filter: (("table".is_active IS TRUE) AND ("table".created_dt >= (('now'::cstring)::date - '7 mons'::interval)) AND ("table".created_dt <= ((('now'::cstring)::date)::timestamp with time zone - '6 mons'::interval)))
        Rows Removed by Filter: 619961
        Remote SQL: SELECT created_dt, is_active FROM fdw_schema.table WHERE ((meta @> '{"source": "test"}'::jsonb))
Planning time: 0.747 ms
Execution time: 6727.815 ms

Instead of the filtering operators, you need to use the operator that checks whether one jsonb contains another. 7 seconds instead of the original 27. So far this is the only successful option for passing jsonb filters to the remote server, but there is one important limitation to keep in mind: we use version 9.6 of the database, but by the end of April we plan to complete the last tests and move to version 12. Once we upgrade, we will write about how it affected things, because there are many changes we have high hopes for: json_path, the new CTE behavior, push down (available since version 10). I can't wait to try it.
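
For string values, the two forms select the same rows, which can be checked locally on a few literals. This is a small sketch; the sample documents and column aliases are invented for illustration.

```sql
-- Compare text extraction (->>) with containment (@>) on sample documents.
SELECT m,
       m->>'source' = 'test'           AS by_text_extraction,
       m @> '{"source":"test"}'::jsonb AS by_containment
FROM (VALUES
        ('{"source":"test"}'::jsonb),
        ('{"source":"test","extra":1}'::jsonb),
        ('{"source":"other"}'::jsonb),
        ('{}'::jsonb)
     ) AS t(m);

--   m                              | by_text_extraction | by_containment
--   {"source": "test"}             | true               | true
--   {"extra": 1, "source": "test"} | true               | true
--   {"source": "other"}            | false              | false
--   {}                             | NULL               | false
```

The last row differs only in NULL versus false, and both results exclude the row in a WHERE clause.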

Finishing it off

We have checked how each change affected the query speed on its own. Let's now see what happens when all three filters are written correctly.

explain analyze verbose
SELECT count(1)
FROM fdw_schema.table 
WHERE is_active = True
AND created_dt >= (SELECT CURRENT_DATE::timestamptz - INTERVAL '7 month') 
AND created_dt <(SELECT CURRENT_DATE::timestamptz - INTERVAL '6 month')
AND meta @> '{"source":"test"}'::jsonb;

Aggregate  (cost=322041.51..322041.52 rows=1 width=8) (actual time=2278.867..2278.867 rows=1 loops=1)
  Output: count(1)
  InitPlan 1 (returns $0)
    ->  Result  (cost=0.00..0.02 rows=1 width=8) (actual time=0.010..0.010 rows=1 loops=1)
          Output: ((('now'::cstring)::date)::timestamp with time zone - '7 mons'::interval)
  InitPlan 2 (returns $1)
    ->  Result  (cost=0.00..0.02 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=1)
          Output: ((('now'::cstring)::date)::timestamp with time zone - '6 mons'::interval)
  ->  Foreign Scan on fdw_schema."table"  (cost=100.02..322041.41 rows=25 width=0) (actual time=8.597..2153.809 rows=1360025 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Remote SQL: SELECT NULL FROM fdw_schema.table WHERE (is_active) AND ((created_dt >= $1::timestamp with time zone)) AND ((created_dt < $2::timestamp with time zone)) AND ((meta @> '{"source": "test"}'::jsonb))
Planning time: 0.820 ms
Execution time: 2279.087 ms

Yes, the query looks more complicated; that is a forced price. But the execution time is 2 seconds, which is more than 10 times faster! And we are talking about a simple query on a relatively small data set. On real queries, we got speedups of up to several hundred times.

To summarize: if you use PostgreSQL with FDW, always check that all filters are sent to the remote server, and you will be happy... At least until you get to joins between tables from different servers. But that is a story for another article.

Thank you for your attention! I would be glad to hear questions, comments, and stories about your own experience in the comments.

Source: www.habr.com
