Awọn atupale iṣẹ ṣiṣe ni faaji microservice: iranlọwọ ati kiakia Postgres FDW

Microservice faaji, bi ohun gbogbo ni aye yi, ni o ni awọn oniwe-Aleebu ati awọn konsi. Diẹ ninu awọn ilana di rọrun pẹlu rẹ, awọn miiran nira sii. Ati nitori iyara ti iyipada ati scalability ti o dara julọ, o nilo lati ṣe awọn irubọ. Ọkan ninu wọn ni awọn npo complexity ti awọn atupale. Ti o ba jẹ pe ni monolith gbogbo awọn atupale iṣiṣẹ le dinku si awọn ibeere SQL si ajọra atupale, lẹhinna ni iṣẹ faaji multiservice iṣẹ kọọkan ni aaye data tirẹ ati pe o dabi pe ibeere kan ko le ṣe (tabi boya o le?). Fun awọn ti o nifẹ si bii a ṣe yanju iṣoro ti awọn atupale iṣiṣẹ ni ile-iṣẹ wa ati bii a ṣe kọ ẹkọ lati gbe pẹlu ojutu yii - kaabọ.

Awọn atupale iṣẹ ṣiṣe ni faaji microservice: iranlọwọ ati kiakia Postgres FDW
Orukọ mi ni Pavel Sivash, ni DomClick Mo ṣiṣẹ ni ẹgbẹ kan ti o ni iduro fun mimu ibi ipamọ data itupalẹ. Ni aṣa, awọn iṣẹ wa le jẹ ipin bi imọ-ẹrọ data, ṣugbọn, ni otitọ, ibiti awọn iṣẹ ṣiṣe pọ si. Iwọn ETL/ELT wa fun imọ-ẹrọ data, atilẹyin ati isọdọtun ti awọn irinṣẹ fun itupalẹ data ati idagbasoke awọn irinṣẹ tirẹ. Ni pataki, fun ijabọ iṣiṣẹ, a pinnu lati “dibọ” pe a ni monolith kan ati fun awọn atunnkanka ni ibi ipamọ data kan ti yoo ni gbogbo data ti wọn nilo.

Ni gbogbogbo, a ṣe akiyesi awọn aṣayan oriṣiriṣi. O ṣee ṣe lati kọ ibi ipamọ ti o ni kikun - a gbiyanju paapaa, ṣugbọn, lati sọ ooto, a ko ni anfani lati darapo awọn ayipada loorekoore ni oye pẹlu ilana kuku lọra ti kikọ ibi ipamọ kan ati ṣiṣe awọn ayipada si rẹ (ti ẹnikan ba ṣaṣeyọri , kọ ninu awọn asọye bawo ni). O ṣee ṣe lati sọ fun awọn atunnkanka pe: “Awọn ọmọkunrin, kọ ẹkọ Python ki o lọ si awọn ẹda atupọ,” ṣugbọn eyi jẹ ibeere afikun fun igbanisiṣẹ, ati pe o dabi pe o yẹ ki o yago fun eyi ti o ba ṣeeṣe. A pinnu lati gbiyanju lati lo imọ-ẹrọ FDW (Ajeji Data Wrapper): ni pataki, eyi jẹ dblink boṣewa, eyiti o wa ninu boṣewa SQL, ṣugbọn pẹlu wiwo irọrun pupọ diẹ sii. Da lori rẹ, a ṣe ojutu kan, eyiti o mu nikẹhin, a si yanju lori rẹ. Awọn alaye rẹ jẹ koko-ọrọ ti nkan lọtọ, ati boya diẹ sii ju ọkan lọ, niwọn igba ti Mo fẹ lati sọrọ nipa pupọ: lati mimuuṣiṣẹpọ awọn eto ibi ipamọ data lati wọle si iṣakoso ati sisọnu data ti ara ẹni. O tun jẹ dandan lati ṣe ifiṣura pe ojutu yii kii ṣe rirọpo fun awọn apoti isura infomesonu gidi ati awọn ibi ipamọ; o yanju iṣoro kan pato.

Ni ipele oke o dabi eyi:

Awọn atupale iṣẹ ṣiṣe ni faaji microservice: iranlọwọ ati kiakia Postgres FDW
Ipamọ data PostgreSQL wa nibiti awọn olumulo le tọju data iṣẹ wọn, ati pataki julọ, awọn ẹda atupale ti gbogbo awọn iṣẹ ni asopọ si data data yii nipasẹ FDW. Eyi jẹ ki o ṣee ṣe lati kọ ibeere kan si awọn apoti isura infomesonu pupọ, ati pe ko ṣe pataki ohun ti o jẹ: PostgreSQL, MySQL, MongoDB tabi nkan miiran (faili, API, ti o ba jẹ pe lojiji ko si ipari ti o dara, o le kọ tirẹ). O dara, ohun gbogbo dabi ẹni nla! Ṣe a yapa bi?

Ti ohun gbogbo ba pari ni iyara ati irọrun, lẹhinna, boya, kii yoo jẹ nkan kan.

O ṣe pataki lati ṣe alaye nipa bii Postgres ṣe n ṣe ilana awọn ibeere si awọn olupin latọna jijin. Eyi dabi ọgbọn, ṣugbọn nigbagbogbo awọn eniyan ko ṣe akiyesi rẹ: Postgres pin ibeere naa si awọn apakan ti a ṣe ni ominira lori awọn olupin latọna jijin, gba data yii, ati ṣe awọn iṣiro ikẹhin funrararẹ, nitorinaa iyara ipaniyan ibeere yoo dale pupọ. bi o ti kọ. O tun yẹ ki o ṣe akiyesi: nigbati data ba de lati ọdọ olupin latọna jijin, ko ni awọn atọka mọ, ko si ohunkan ti yoo ṣe iranlọwọ fun oluṣeto, nitorina, nikan awa tikararẹ le ṣe iranlọwọ ati imọran fun u. Ati pe eyi ni pato ohun ti Mo fẹ lati sọrọ nipa ni awọn alaye diẹ sii.

Ibeere ti o rọrun ati ero pẹlu rẹ

Lati fihan bi Postgres ṣe n beere tabili ila 6 milionu kan lori olupin latọna jijin, jẹ ki a wo ero ti o rọrun.

explain analyze verbose  
SELECT count(1)
FROM fdw_schema.table;

Aggregate  (cost=418383.23..418383.24 rows=1 width=8) (actual time=3857.198..3857.198 rows=1 loops=1)
  Output: count(1)
  ->  Foreign Scan on fdw_schema."table"  (cost=100.00..402376.14 rows=6402838 width=0) (actual time=4.874..3256.511 rows=6406868 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Remote SQL: SELECT NULL FROM fdw_schema.table
Planning time: 0.986 ms
Execution time: 3857.436 ms

Lilo alaye VERBOSE gba wa laaye lati rii ibeere ti yoo firanṣẹ si olupin latọna jijin ati awọn abajade eyiti a yoo gba fun sisẹ siwaju sii (laini jijinSQL).

Jẹ ki a lọ siwaju diẹ sii ki a ṣafikun ọpọlọpọ awọn asẹ si ibeere wa: ọkan fun ariwo aaye, ọkan nipa iṣẹlẹ timestamp ni aarin ati ọkan nipa jsonb.

explain analyze verbose
SELECT count(1)
FROM fdw_schema.table 
WHERE is_active is True
AND created_dt BETWEEN CURRENT_DATE - INTERVAL '7 month' 
AND CURRENT_DATE - INTERVAL '6 month'
AND meta->>'source' = 'test';

Aggregate  (cost=577487.69..577487.70 rows=1 width=8) (actual time=27473.818..25473.819 rows=1 loops=1)
  Output: count(1)
  ->  Foreign Scan on fdw_schema."table"  (cost=100.00..577469.21 rows=7390 width=0) (actual time=31.369..25372.466 rows=1360025 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Filter: (("table".is_active IS TRUE) AND (("table".meta ->> 'source'::text) = 'test'::text) AND ("table".created_dt >= (('now'::cstring)::date - '7 mons'::interval)) AND ("table".created_dt <= ((('now'::cstring)::date)::timestamp with time zone - '6 mons'::interval)))
        Rows Removed by Filter: 5046843
        Remote SQL: SELECT created_dt, is_active, meta FROM fdw_schema.table
Planning time: 0.665 ms
Execution time: 27474.118 ms

Eyi ni aaye ti o nilo lati fiyesi si nigbati kikọ awọn ibeere wa da. A ko gbe awọn asẹ naa lọ si olupin latọna jijin, eyiti o tumọ si pe lati ṣiṣẹ, Postgres fa gbogbo awọn ori ila 6 miliọnu jade lati le ṣe àlẹmọ ni agbegbe (ila Ajọ) ati ṣe akojọpọ. Bọtini lati ṣaṣeyọri ni lati kọ ibeere kan ki awọn asẹ ti gbe lọ si ẹrọ latọna jijin, ati pe a gba ati ṣajọpọ awọn ori ila pataki nikan.

Iyẹn ni diẹ ninu booleanshit

Pẹlu awọn aaye boolean ohun gbogbo rọrun. Ninu ibeere atilẹba, iṣoro naa jẹ nitori oniṣẹ ẹrọ is. Ti o ba ropo o pẹlu =lẹhinna a gba abajade wọnyi:

explain analyze verbose
SELECT count(1)
FROM fdw_schema.table
WHERE is_active = True
AND created_dt BETWEEN CURRENT_DATE - INTERVAL '7 month' 
AND CURRENT_DATE - INTERVAL '6 month'
AND meta->>'source' = 'test';

Aggregate  (cost=508010.14..508010.15 rows=1 width=8) (actual time=19064.314..19064.314 rows=1 loops=1)
  Output: count(1)
  ->  Foreign Scan on fdw_schema."table"  (cost=100.00..507988.44 rows=8679 width=0) (actual time=33.035..18951.278 rows=1360025 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Filter: ((("table".meta ->> 'source'::text) = 'test'::text) AND ("table".created_dt >= (('now'::cstring)::date - '7 mons'::interval)) AND ("table".created_dt <= ((('now'::cstring)::date)::timestamp with time zone - '6 mons'::interval)))
        Rows Removed by Filter: 3567989
        Remote SQL: SELECT created_dt, meta FROM fdw_schema.table WHERE (is_active)
Planning time: 0.834 ms
Execution time: 19064.534 ms

Bii o ti le rii, àlẹmọ naa fò si olupin latọna jijin, ati pe akoko ipaniyan dinku lati 27 si awọn aaya 19.

O tọ lati ṣe akiyesi pe oniṣẹ ẹrọ is yatọ si oniṣẹ ẹrọ = nitori ti o le ṣiṣẹ pẹlu awọn Null iye. Iyẹn tumọ si kii ṣe Otitọ yoo lọ kuro ni awọn iye eke ati Null ninu àlẹmọ, botilẹjẹpe != Otitọ yoo fi nikan Eke iye. Nitorina, nigbati o ba rọpo oniṣẹ ẹrọ ni ko awọn ipo meji pẹlu oniṣẹ OR yẹ ki o kọja si àlẹmọ, fun apẹẹrẹ, NIBI (col != True) TABI (col is asan).

A ti ṣe pẹlu boolean, jẹ ki a tẹsiwaju. Ni bayi, jẹ ki a da àlẹmọ Boolean pada si fọọmu atilẹba rẹ lati le ni ominira ro ipa ti awọn iyipada miiran.

timestamptz? hz

Ni gbogbogbo, o nigbagbogbo ni lati ṣe idanwo pẹlu bi o ṣe le kọ ibeere ni deede ti o kan awọn olupin latọna jijin, ati lẹhinna wa fun alaye idi ti eyi fi ṣẹlẹ. Alaye kekere pupọ nipa eyi ni a le rii lori Intanẹẹti. Nitorinaa, ninu awọn idanwo a rii pe àlẹmọ ọjọ ti o wa titi n fo si olupin latọna jijin pẹlu bang kan, ṣugbọn nigba ti a ba fẹ ṣeto ọjọ naa ni agbara, fun apẹẹrẹ, ni bayi () tabi CURRENT_DATE, eyi ko ṣẹlẹ. Nínú àpẹrẹ wa, a ṣàfikún àlẹ̀ kan kí fáìlì Create_at ní dátà nínú fún oṣù kan pàtó ní sẹ́yìn (LÁRIN CURRENT_DATE - INTERVAL '1 month' AND CURRENT_DATE - INTERVAL '7 month'). Kini a ṣe ninu ọran yii?

explain analyze verbose
SELECT count(1)
FROM fdw_schema.table 
WHERE is_active is True
AND created_dt >= (SELECT CURRENT_DATE::timestamptz - INTERVAL '7 month') 
AND created_dt <(SELECT CURRENT_DATE::timestamptz - INTERVAL '6 month')
AND meta->>'source' = 'test';

Aggregate  (cost=306875.17..306875.18 rows=1 width=8) (actual time=4789.114..4789.115 rows=1 loops=1)
  Output: count(1)
  InitPlan 1 (returns $0)
    ->  Result  (cost=0.00..0.02 rows=1 width=8) (actual time=0.007..0.008 rows=1 loops=1)
          Output: ((('now'::cstring)::date)::timestamp with time zone - '7 mons'::interval)
  InitPlan 2 (returns $1)
    ->  Result  (cost=0.00..0.02 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=1)
          Output: ((('now'::cstring)::date)::timestamp with time zone - '6 mons'::interval)
  ->  Foreign Scan on fdw_schema."table"  (cost=100.02..306874.86 rows=105 width=0) (actual time=23.475..4681.419 rows=1360025 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Filter: (("table".is_active IS TRUE) AND (("table".meta ->> 'source'::text) = 'test'::text))
        Rows Removed by Filter: 76934
        Remote SQL: SELECT is_active, meta FROM fdw_schema.table WHERE ((created_dt >= $1::timestamp with time zone)) AND ((created_dt < $2::timestamp with time zone))
Planning time: 0.703 ms
Execution time: 4789.379 ms

A sọ fun oluṣeto lati ṣe iṣiro ọjọ ti o wa ni abẹlẹ ni ilosiwaju ki o kọja iyipada ti a ti ṣetan si àlẹmọ. Ati pe ofiri yii fun wa ni abajade ti o dara julọ, ibeere naa fẹrẹ to awọn akoko 6 yiyara!

Lẹẹkansi, o ṣe pataki lati ṣọra nibi: iru data ti o wa ni abẹlẹ gbọdọ jẹ kanna bi ti aaye lori eyiti a ṣe sisẹ, bibẹẹkọ oluṣeto yoo pinnu pe nitori awọn oriṣi yatọ, o jẹ dandan lati kọkọ gba gbogbo rẹ. awọn data ki o si àlẹmọ o tibile.

Jẹ ki a da àlẹmọ ọjọ pada si iye atilẹba rẹ.

Freddy vs. Jsonb

Ni gbogbogbo, awọn aaye Boolean ati awọn ọjọ ti mu ibeere wa pọ si tẹlẹ, ṣugbọn iru data kan wa ti o ku. Ogun pẹlu sisẹ nipasẹ rẹ, lati sọ ooto, ko tii pari, botilẹjẹpe aṣeyọri wa nibi paapaa. Nitorinaa, eyi ni bii a ṣe ṣakoso lati kọja àlẹmọ nipasẹ jsonb aaye si olupin latọna jijin.

explain analyze verbose
SELECT count(1)
FROM fdw_schema.table 
WHERE is_active is True
AND created_dt BETWEEN CURRENT_DATE - INTERVAL '7 month' 
AND CURRENT_DATE - INTERVAL '6 month'
AND meta @> '{"source":"test"}'::jsonb;

Aggregate  (cost=245463.60..245463.61 rows=1 width=8) (actual time=6727.589..6727.590 rows=1 loops=1)
  Output: count(1)
  ->  Foreign Scan on fdw_schema."table"  (cost=1100.00..245459.90 rows=1478 width=0) (actual time=16.213..6634.794 rows=1360025 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Filter: (("table".is_active IS TRUE) AND ("table".created_dt >= (('now'::cstring)::date - '7 mons'::interval)) AND ("table".created_dt <= ((('now'::cstring)::date)::timestamp with time zone - '6 mons'::interval)))
        Rows Removed by Filter: 619961
        Remote SQL: SELECT created_dt, is_active FROM fdw_schema.table WHERE ((meta @> '{"source": "test"}'::jsonb))
Planning time: 0.747 ms
Execution time: 6727.815 ms

Dipo awọn oniṣẹ sisẹ, o gbọdọ lo wiwa oniṣẹ kan jsonb ni o yatọ si. 7 aaya dipo ti awọn atilẹba 29. Nítorí jina yi jẹ nikan ni aseyori aṣayan fun a atagba Ajọ nipasẹ jsonb si olupin latọna jijin, ṣugbọn nibi o ṣe pataki lati ṣe akiyesi aropin kan: a nlo ẹya 9.6 ti database, ṣugbọn ni opin Oṣu Kẹrin a gbero lati pari awọn idanwo to kẹhin ati gbe si ẹya 12. Ni kete ti a ba ṣe imudojuiwọn, a yoo kọ nipa bii o ṣe kan, nitori ọpọlọpọ awọn ayipada wa fun eyiti ireti pupọ wa: json_path, ihuwasi CTE tuntun, titari si isalẹ (ti o wa lati ẹya 10). Mo fe looto lati gbiyanju laipe.

Pari rẹ

A ṣe idanwo bi iyipada kọọkan ṣe kan iyara ibeere ni ẹyọkan. Jẹ ki a wo ohun ti o ṣẹlẹ nigbati gbogbo awọn asẹ mẹta ti kọ ni deede.

explain analyze verbose
SELECT count(1)
FROM fdw_schema.table 
WHERE is_active = True
AND created_dt >= (SELECT CURRENT_DATE::timestamptz - INTERVAL '7 month') 
AND created_dt <(SELECT CURRENT_DATE::timestamptz - INTERVAL '6 month')
AND meta @> '{"source":"test"}'::jsonb;

Aggregate  (cost=322041.51..322041.52 rows=1 width=8) (actual time=2278.867..2278.867 rows=1 loops=1)
  Output: count(1)
  InitPlan 1 (returns $0)
    ->  Result  (cost=0.00..0.02 rows=1 width=8) (actual time=0.010..0.010 rows=1 loops=1)
          Output: ((('now'::cstring)::date)::timestamp with time zone - '7 mons'::interval)
  InitPlan 2 (returns $1)
    ->  Result  (cost=0.00..0.02 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=1)
          Output: ((('now'::cstring)::date)::timestamp with time zone - '6 mons'::interval)
  ->  Foreign Scan on fdw_schema."table"  (cost=100.02..322041.41 rows=25 width=0) (actual time=8.597..2153.809 rows=1360025 loops=1)
        Output: "table".id, "table".is_active, "table".meta, "table".created_dt
        Remote SQL: SELECT NULL FROM fdw_schema.table WHERE (is_active) AND ((created_dt >= $1::timestamp with time zone)) AND ((created_dt < $2::timestamp with time zone)) AND ((meta @> '{"source": "test"}'::jsonb))
Planning time: 0.820 ms
Execution time: 2279.087 ms

Bẹẹni, ibeere naa dabi idiju diẹ sii, eyi jẹ owo ti a fi agbara mu, ṣugbọn iyara ipaniyan jẹ awọn aaya 2, eyiti o jẹ diẹ sii ju awọn akoko 10 yiyara! Ati pe a n sọrọ nipa ibeere ti o rọrun kan lodi si ipilẹ data kekere kan. Lori awọn ibeere gidi, a gba ilosoke ti o to awọn igba ọgọrun.

Lati ṣe akopọ: ti o ba lo PostgreSQL pẹlu FDW, nigbagbogbo ṣayẹwo pe gbogbo awọn asẹ ni a firanṣẹ si olupin latọna jijin, ati pe iwọ yoo ni idunnu… o kere ju titi iwọ o fi gba lati darapọ mọ awọn tabili lati awọn olupin oriṣiriṣi. Ṣugbọn iyẹn jẹ itan fun nkan miiran.

Mo dupe fun ifetisile re! Emi yoo nifẹ lati gbọ awọn ibeere, awọn asọye, ati awọn itan nipa awọn iriri rẹ ninu awọn asọye.

orisun: www.habr.com

Fi ọrọìwòye kun