Amaresiphi emibuzo ye-SQL egulayo

Izinyanga ezedlulile samemezela chaza.tensor.ru - umphakathi isevisi yokuhlaziya nokubona ngeso lengqondo izinhlelo zemibuzo ku-PostgreSQL.

Usuvele uyisebenzise izikhathi ezingaphezu kuka-6000, kodwa isici esisodwa esiwusizo okungenzeka asizange sinakwe izinkomba zesakhiwo, okubukeka kanjena:

Amaresiphi emibuzo ye-SQL egulayo

Zilalele, futhi izicelo zakho ‘ziyoba bushelelezi futhi zibe silika. 🙂

Kodwa ngokungathí sina, izimo eziningi ezenza isicelo singasheshi futhi silambele izinsiza zijwayelekile futhi zingabonwa ngesakhiwo nedatha yohlelo.

Kulokhu, umthuthukisi ngamunye akudingeki abheke inketho yokuthuthukisa eyedwa, encika kokuhlangenwe nakho kwakhe kuphela - singamtshela ukuthi kwenzekani lapha, kungaba yini isizathu, futhi indlela yokubhekana nesixazululo. Yilokho esakwenza.

Amaresiphi emibuzo ye-SQL egulayo

Ake sibhekisise lezi zimo - ukuthi zichazwa kanjani nokuthi yiziphi izincomo eziholela kuzo.

Ukuze uzicwilise kangcono esihlokweni, ungaqala ulalele ibhulokhi elihambisanayo ukusuka umbiko wami ku-PGConf.Russia 2020, bese udlulela ekuhlaziyeni okuningiliziwe kwesibonelo ngasinye:

#1: inkomba "undersorting"

Nini

Bonisa i-invoyisi yakamuva yeklayenti "LLC Kolokolchik".

Indlela yokubona

-> Limit
   -> Sort
      -> Index [Only] Scan [Backward] | Bitmap Heap Scan

Izincomo

Inkomba esetshenzisiwe nweba ngezinkambu zokuhlunga.

Isibonelo:

CREATE TABLE tbl AS
SELECT
  generate_series(1, 100000) pk  -- 100K "фактов"
, (random() * 1000)::integer fk_cli; -- 1K разных внешних ключей

CREATE INDEX ON tbl(fk_cli); -- индекс для foreign key

SELECT
  *
FROM
  tbl
WHERE
  fk_cli = 1 -- отбор по конкретной связи
ORDER BY
  pk DESC -- хотим всего одну "последнюю" запись
LIMIT 1;

Amaresiphi emibuzo ye-SQL egulayo
[buka kokuthi explain.tensor.ru]

Ungabona ngokushesha ukuthi amarekhodi angaphezu kwe-100 akhishwe ohlwini, okwabe sekuhlungwa wonke, kwabe sekusele elilodwa kuphela.

Ukulungisa:

DROP INDEX tbl_fk_cli_idx;
CREATE INDEX ON tbl(fk_cli, pk DESC); -- добавили ключ сортировки

Amaresiphi emibuzo ye-SQL egulayo
[buka kokuthi explain.tensor.ru]

Ngisho nakusampula yakudala enjalo - 8.5 izikhathi ngokushesha futhi 33 izikhathi ezimbalwa ukufunda. Lapho uba “namaqiniso” engeziwe ngenani ngalinye, umphumela uba sobala kakhulu fk.

Ngiyaqaphela ukuthi inkomba enjalo izosebenza njengenkomba “yesiqalo” ingebi kakhulu kunangaphambili kweminye imibuzo fk, lapho hlunga khona pk yayingekho futhi yayingekho (ungafunda kabanzi ngalokhu esihlokweni sami mayelana nokuthola izinkomba ezingasebenzi). Kubandakanya, izohlinzeka ngokujwayelekile ukwesekwa kokhiye wangaphandle okusobala kule nsimu.

#2: ukuphambana kwenkomba (BitmapAnd)

Nini

Bonisa zonke izivumelwano zeklayenti "LLC Kolokolchik", eziphethwe egameni le-"NAO Buttercup".

Indlela yokubona

-> BitmapAnd
   -> Bitmap Index Scan
   -> Bitmap Index Scan

Izincomo

Dala inkomba eyinhlanganisela ngezinkambu ezisuka kuzo zombili ezangempela noma wandise eyodwa kwezikhona ngezinkambu ukusuka kweyesibili.

Isibonelo:

CREATE TABLE tbl AS
SELECT
  generate_series(1, 100000) pk      -- 100K "фактов"
, (random() *  100)::integer fk_org  -- 100 разных внешних ключей
, (random() * 1000)::integer fk_cli; -- 1K разных внешних ключей

CREATE INDEX ON tbl(fk_org); -- индекс для foreign key
CREATE INDEX ON tbl(fk_cli); -- индекс для foreign key

SELECT
  *
FROM
  tbl
WHERE
  (fk_org, fk_cli) = (1, 999); -- отбор по конкретной паре

Amaresiphi emibuzo ye-SQL egulayo
[buka kokuthi explain.tensor.ru]

Ukulungisa:

DROP INDEX tbl_fk_org_idx;
CREATE INDEX ON tbl(fk_org, fk_cli);

Amaresiphi emibuzo ye-SQL egulayo
[buka kokuthi explain.tensor.ru]

Inkokhelo lapha incane, njengoba i-Bitmap Heap Scan isebenza ngokwayo. Kodwa noma kunjalo 7 izikhathi ngokushesha futhi 2.5 izikhathi ezimbalwa ukufunda.

#3: Hlanganisa izinkomba (BitmapOr)

Nini

Bonisa "thina" endala kakhulu engu-20 noma izicelo ezingabelwe zokucutshungulwa, ezakho zize kuqala.

Indlela yokubona

-> BitmapOr
   -> Bitmap Index Scan
   -> Bitmap Index Scan

Izincomo

Sebenzisa Inyunyana [KONKE] ukuhlanganisa imibuzo emincane ngayinye ye-OR-blocks yemibandela.

Isibonelo:

CREATE TABLE tbl AS
SELECT
  generate_series(1, 100000) pk  -- 100K "фактов"
, CASE
    WHEN random() < 1::real/16 THEN NULL -- с вероятностью 1:16 запись "ничья"
    ELSE (random() * 100)::integer -- 100 разных внешних ключей
  END fk_own;

CREATE INDEX ON tbl(fk_own, pk); -- индекс с "вроде как подходящей" сортировкой

SELECT
  *
FROM
  tbl
WHERE
  fk_own = 1 OR -- свои
  fk_own IS NULL -- ... или "ничьи"
ORDER BY
  pk
, (fk_own = 1) DESC -- сначала "свои"
LIMIT 20;

Amaresiphi emibuzo ye-SQL egulayo
[buka kokuthi explain.tensor.ru]

Ukulungisa:

(
  SELECT
    *
  FROM
    tbl
  WHERE
    fk_own = 1 -- сначала "свои" 20
  ORDER BY
    pk
  LIMIT 20
)
UNION ALL
(
  SELECT
    *
  FROM
    tbl
  WHERE
    fk_own IS NULL -- потом "ничьи" 20
  ORDER BY
    pk
  LIMIT 20
)
LIMIT 20; -- но всего - 20, больше и не надо

Amaresiphi emibuzo ye-SQL egulayo
[buka kokuthi explain.tensor.ru]

Sisebenzise leli thuba lokuthi wonke amarekhodi angama-20 adingekayo atholwe ngokushesha ebhulokhini lokuqala, ngakho elesibili, eline-Bitmap Heap Scan “emba eqolo”, alizange libulawe - ekugcineni. 22x ngokushesha, 44x ukufundwa okumbalwa!

Indaba enemininingwane eminingi ngale ndlela yokuthuthukisa ngokusebenzisa izibonelo ezithile ingafundwa ezihlokweni I-PostgreSQL Antipatterns: ama-JOIN ayingozi nama-ORs и Ama-PostgreSQL Antipatterns: inganekwane yokuphindaphinda kabusha kokusesha ngegama, noma “Ukuthuthukisa emuva naphambili”.

Inguqulo ejwayelekile ukukhetha okukhethiwe ngokusekelwe kokhiye abambalwa (hhayi nje i-const/NULL pair) kuxoxwa ngayo esihlokweni I-SQL HowTo: ukubhala i-loop yesikhashana ngqo embuzweni, noma "Izinyathelo ezintathu zokuqala".

#4: Sifunda izinto eziningi ezingadingekile

Nini

Njengomthetho, kuvela uma ufuna "ukunamathisela esinye isihlungi" esicelweni esivele sikhona.

“Futhi awunayo efanayo, kodwa ngezinkinobho zikamama weparele? » ifilimu "The Diamond Arm"

Isibonelo, ukulungisa umsebenzi ongenhla, bonisa izicelo zokuqala ezingama-20 “ezibucayi” zokucutshungulwa, kungakhathaliseki ukuthi iyini injongo yazo.

Indlela yokubona

-> Seq Scan | Bitmap Heap Scan | Index [Only] Scan [Backward]
   && 5 × rows < RRbF -- отфильтровано >80% прочитанного
   && loops × RRbF > 100 -- и при этом больше 100 записей суммарно

Izincomo

Dala [okwengeziwe] okukhethekile inkomba enesimo sokuthi LAPHO noma ufake izinkambu ezengeziwe kunkomba.

Uma isimo sokuhlunga sithi "static" ngezinjongo zakho - okungukuthi akusho ukwanda uhlu lwamanani esikhathini esizayo - kungcono ukusebenzisa inkomba yokuthi KUPHI. Izimo ezihlukahlukene ze-boolean/enum zingena kahle kulesi sigaba.

Uma isimo sokuhlunga ingathatha izincazelo ezahlukene, ngakho-ke kungcono ukwandisa inkomba ngalezi zinkambu - njengakusimo nge-BitmapAnd ngenhla.

Isibonelo:

CREATE TABLE tbl AS
SELECT
  generate_series(1, 100000) pk -- 100K "фактов"
, CASE
    WHEN random() < 1::real/16 THEN NULL
    ELSE (random() * 100)::integer -- 100 разных внешних ключей
  END fk_own
, (random() < 1::real/50) critical; -- 1:50, что заявка "критичная"

CREATE INDEX ON tbl(pk);
CREATE INDEX ON tbl(fk_own, pk);

SELECT
  *
FROM
  tbl
WHERE
  critical
ORDER BY
  pk
LIMIT 20;

Amaresiphi emibuzo ye-SQL egulayo
[buka kokuthi explain.tensor.ru]

Ukulungisa:

CREATE INDEX ON tbl(pk)
  WHERE critical; -- добавили "статичное" условие фильтрации

Amaresiphi emibuzo ye-SQL egulayo
[buka kokuthi explain.tensor.ru]

Njengoba ubona, ukuhlunga kunyamalale ngokuphelele ohlelweni, futhi isicelo sesiphenduke 5 izikhathi ngokushesha.

#5: itafula elincane

Nini

Imizamo ehlukahlukene yokuzakhela owakho umugqa wokucubungula umsebenzi, lapho inani elikhulu lezibuyekezo/ukususwa kwamarekhodi etafuleni kuholela esimweni senani elikhulu lamarekhodi “afile”.

Indlela yokubona

-> Seq Scan | Bitmap Heap Scan | Index [Only] Scan [Backward]
   && loops × (rows + RRbF) < (shared hit + shared read) × 8
      -- прочитано больше 1KB на каждую запись
   && shared hit + shared read > 64

Izincomo

Kwenze ngokwakho njalo IVACUUM [KUGCWELE] noma uthole ukuqeqeshwa njalo ngokwanele i-autovacuum ngokulungisa kahle imingcele yayo, kuhlanganisa okwetafula elithile.

Ezimweni eziningi, izinkinga ezinjalo zibangelwa ukubunjwa kwemibuzo okungekuhle lapho ufona usuka kungqondongqondo yebhizinisi njengalezo okuxoxwe ngazo kuyo I-PostgreSQL Antipatterns: ukulwa nezindimbane “zabafileyo”.

Kodwa udinga ukuqonda ukuthi ngisho ne-VACUUM FULL ingase ingasizi ngaso sonke isikhathi. Ezimweni ezinjalo, kufanelekile ukuzijwayeza nge-algorithm evela esihlokweni I-DBA: lapho i-VACUUM ihluleka, sihlanza itafula mathupha.

#6: Ukufunda kusuka “phakathi” kwenkomba

Nini

Kubonakala sengathi sifunde kancane, futhi yonke into yayikhonjiwe, futhi asihlunga muntu ngokweqile - kodwa noma kunjalo sifunda amakhasi amaningi kakhulu kunalokho ebesingathanda.

Indlela yokubona

-> Index [Only] Scan [Backward]
   && loops × (rows + RRbF) < (shared hit + shared read) × 8
      -- прочитано больше 1KB на каждую запись
   && shared hit + shared read > 64

Izincomo

Bhekisisa kahle ukwakheka kwenkomba esetshenzisiwe kanye nezinkambu ezibalulekile ezishiwo embuzweni - okungenzeka kakhulu ingxenye yenkomba ayisethiwe. Kungenzeka ukuthi udale inkomba efanayo, kodwa ngaphandle kwezinkambu zesiqalo noma funda ukuphindaphinda amanani abo.

Isibonelo:

CREATE TABLE tbl AS
SELECT
  generate_series(1, 100000) pk      -- 100K "фактов"
, (random() *  100)::integer fk_org  -- 100 разных внешних ключей
, (random() * 1000)::integer fk_cli; -- 1K разных внешних ключей

CREATE INDEX ON tbl(fk_org, fk_cli); -- все почти как в #2
-- только вот отдельный индекс по fk_cli мы уже посчитали лишним и удалили

SELECT
  *
FROM
  tbl
WHERE
  fk_cli = 999 -- а fk_org не задано, хотя стоит в индексе раньше
LIMIT 20;

Amaresiphi emibuzo ye-SQL egulayo
[buka kokuthi explain.tensor.ru]

Konke kubonakala kuhamba kahle, noma ngokusho kwenkomba, kodwa ngandlela thize kuyasolisa - kumarekhodi angama-20 afundiwe, bekumele sikhiphe amakhasi wedatha angu-4, ama-32KB ngerekhodi ngalinye - akusona isibindi leso? Negama lenkomba tbl_fk_org_fk_cli_idx eshukumisa ingqondo.

Ukulungisa:

CREATE INDEX ON tbl(fk_cli);

Amaresiphi emibuzo ye-SQL egulayo
[buka kokuthi explain.tensor.ru]

Ngokushesha - 10 izikhathi ngokushesha, futhi izikhathi 4 kancane ukufunda!

Ezinye izibonelo zezimo zokungasebenzi kahle kwezinkomba zingabonakala esihlokweni I-DBA: ukuthola izinkomba ezingenamsebenzi.

#7: CTE × CTE

Nini

Ngesicelo uthole "amafutha" CTE kusuka ematafuleni ahlukene, wabe esenquma ukukwenza phakathi kwabo JOIN.

Icala lihambisana nezinguqulo ezingaphansi kwe-v12 noma izicelo nge WITH MATERIALIZED.

Indlela yokubona

-> CTE Scan
   && loops > 10
   && loops × (rows + RRbF) > 10000
      -- слишком большое декартово произведение CTE

Izincomo

Hlaziya ngokucophelela isicelo - futhi Ingabe ama-CTE ayadingeka lapha nhlobo?? Uma kunjalo, khona-ke sebenzisa "isichazamazwi" ku-hstore/json ngokwemodeli echazwe ku Ama-Antipatterns e-PostgreSQL: ake sishaye i-JOIN esindayo ngesichazamazwi.

#8: shintshela kudiski (ithembu libhaliwe)

Nini

Ukucubungula kwesikhathi esisodwa (ukuhlunga noma ukuhluka) kwenani elikhulu lamarekhodi akungeni kumemori eyabelwe lokhu.

Indlela yokubona

-> *
   && temp written > 0

Izincomo

Uma inani lememori elisetshenziswe umsebenzi lingadluli kakhulu inani elishiwo lepharamitha sebenza_mem, kufanelekile ukuyilungisa. Ungakwazi ngokushesha ekulungiseleleni wonke umuntu, noma ungadlula SET [LOCAL] ngesicelo/ukuthenga okuthile.

Isibonelo:

SHOW work_mem;
-- "16MB"

SELECT
  random()
FROM
  generate_series(1, 1000000)
ORDER BY
  1;

Amaresiphi emibuzo ye-SQL egulayo
[buka kokuthi explain.tensor.ru]

Ukulungisa:

SET work_mem = '128MB'; -- перед выполнением запроса

Amaresiphi emibuzo ye-SQL egulayo
[buka kokuthi explain.tensor.ru]

Ngezizathu ezisobala, uma inkumbulo kuphela isetshenziswa hhayi idiski, khona-ke umbuzo uzokwenziwa ngokushesha okukhulu. Ngesikhathi esifanayo, ingxenye yomthwalo ovela ku-HDD nayo isusiwe.

Kodwa udinga ukuqonda ukuthi ngeke ukwazi njalo ukwaba inkatho nenkumbulo eningi - ngeke nje yanele wonke umuntu.

#9: izibalo ezingabalulekile

Nini

Bathululela okuningi ku-database ngesikhathi esisodwa, kodwa babengenaso isikhathi sokuyixosha ANALYZE.

Indlela yokubona

-> Seq Scan | Bitmap Heap Scan | Index [Only] Scan [Backward]
   && ratio >> 10

Izincomo

Kwenze ANALYZE.

Lesi simo sichazwe kabanzi ku I-PostgreSQL Antipatterns: izibalo ziyikho konke.

#10: "kukhona okungahambanga kahle"

Nini

Kube nokulinda ukukhiya okubekwe yisicelo esiqhudelanayo, noma bekungekho izinsiza zehadiwe ye-CPU/hypervisor eyanele.

Indlela yokubona

-> *
   && (shared hit / 8K) + (shared read / 1K) < time / 1000
      -- RAM hit = 64MB/s, HDD read = 8MB/s
   && time > 100ms -- читали мало, но слишком долго

Izincomo

Sebenzisa kwangaphandle uhlelo lokuqapha iseva yokuvimbela noma ukusetshenziswa okungavamile kwensiza. Sesivele sikhulume ngenguqulo yethu yokuhlela le nqubo kumakhulu amaseva lapha и lapha.

Amaresiphi emibuzo ye-SQL egulayo
Amaresiphi emibuzo ye-SQL egulayo

Source: www.habr.com

Engeza amazwana