Girke-girke don tambayoyin SQL marasa lafiya

Watanni da suka gabata mun sanar bayyana.tensor.ru - jama'a sabis don tantancewa da hangen nesa tsare-tsaren tambaya zuwa PostgreSQL.

Kun riga kun yi amfani da shi fiye da sau 6000, amma fasali ɗaya mai amfani wanda wataƙila ba a lura da shi ba shine. alamu na tsari, wanda yayi kama da haka:

Girke-girke don tambayoyin SQL marasa lafiya

Saurara gare su, kuma buƙatunku za su “zama santsi da siliki.” 🙂

Amma da gaske, yawancin yanayi waɗanda ke sa buƙatar jinkirin da yunwar albarkatu su ne na hali kuma ana iya gane su ta hanyar tsari da bayanai na shirin.

A wannan yanayin, kowane mai haɓakawa ba dole ba ne ya nemi zaɓi na ingantawa da kansa, yana dogara kawai akan ƙwarewarsa - zamu iya gaya masa abin da ke faruwa a nan, menene zai iya zama dalili, kuma yadda ake tunkarar mafita. Abin da muka yi ke nan.

Girke-girke don tambayoyin SQL marasa lafiya

Bari mu dubi waɗannan lamuran - yadda aka ayyana su da irin shawarwarin da suke kaiwa.

Don mafi kyawun nutsar da kanku a cikin maudu'in, zaku iya fara sauraron toshe daidai daga rahoto na a PGConf.Russia 2020, sannan kawai a ci gaba zuwa cikakken bincike na kowane misali:

#1: index "karɓar bayanai"

Lokacin da ya taso

Nuna sabuwar daftari ga abokin ciniki "LLC Kolokolchik".

Yadda ake ganewa

-> Limit
   -> Sort
      -> Index [Only] Scan [Backward] | Bitmap Heap Scan

shawarwari

An yi amfani da Index fadada tare da nau'in filayen.

Alal misali:

CREATE TABLE tbl AS
SELECT
  generate_series(1, 100000) pk  -- 100K "фактов"
, (random() * 1000)::integer fk_cli; -- 1K разных внешних ключей

CREATE INDEX ON tbl(fk_cli); -- индекс для foreign key

SELECT
  *
FROM
  tbl
WHERE
  fk_cli = 1 -- отбор по конкретной связи
ORDER BY
  pk DESC -- хотим всего одну "последнюю" запись
LIMIT 1;

Girke-girke don tambayoyin SQL marasa lafiya
[duba bayanin.tensor.ru]

Nan da nan za ku iya lura cewa an cire bayanai sama da 100 daga cikin index, waɗanda aka jera su duka, sannan kawai ya rage.

Gyara:

DROP INDEX tbl_fk_cli_idx;
CREATE INDEX ON tbl(fk_cli, pk DESC); -- добавили ключ сортировки

Girke-girke don tambayoyin SQL marasa lafiya
[duba bayanin.tensor.ru]

Ko da akan irin wannan samfurin na farko - Sau 8.5 cikin sauri da ƙarancin karantawa sau 33. Ƙarin "gaskiya" da kuke da ita ga kowane ƙima, mafi yawan tasirin tasirin fk.

Na lura cewa irin wannan index zai yi aiki a matsayin "prefix" index ba mafi muni fiye da baya ga sauran queries tare da. fk, inda aka tsara ta pk babu kuma babu (zaku iya karantawa game da wannan a cikin labarina game da gano ma'auni marasa tasiri). Ciki har da, zai samar da al'ada bayyanannen tallafin maɓalli na ƙasashen waje akan wannan filin.

#2: index intersection (BitmapAnd)

Lokacin da ya taso

Nuna duk yarjejeniyoyin abokin ciniki "LLC Kolokolchik", wanda aka kammala a madadin "NAO Buttercup".

Yadda ake ganewa

-> BitmapAnd
   -> Bitmap Index Scan
   -> Bitmap Index Scan

shawarwari

ƙirƙiri hadaddiyar giyar ta filayen daga duka na asali ko fadada ɗaya daga cikin waɗanda ke da filayen daga na biyu.

Alal misali:

CREATE TABLE tbl AS
SELECT
  generate_series(1, 100000) pk      -- 100K "фактов"
, (random() *  100)::integer fk_org  -- 100 разных внешних ключей
, (random() * 1000)::integer fk_cli; -- 1K разных внешних ключей

CREATE INDEX ON tbl(fk_org); -- индекс для foreign key
CREATE INDEX ON tbl(fk_cli); -- индекс для foreign key

SELECT
  *
FROM
  tbl
WHERE
  (fk_org, fk_cli) = (1, 999); -- отбор по конкретной паре

Girke-girke don tambayoyin SQL marasa lafiya
[duba bayanin.tensor.ru]

Gyara:

DROP INDEX tbl_fk_org_idx;
CREATE INDEX ON tbl(fk_org, fk_cli);

Girke-girke don tambayoyin SQL marasa lafiya
[duba bayanin.tensor.ru]

Rikicin da ake samu anan ya yi kadan, tunda Bitmap Heap Scan yana da inganci da kan sa. Amma duk da haka Sau 7 cikin sauri da ƙarancin karantawa sau 2.5.

#3: Haɗa fihirisa (BitmapOr)

Lokacin da ya taso

Nuna farkon 20 mafi tsufa "mu" ko buƙatun da ba a sanya su ba don sarrafawa, tare da fifikon ku.

Yadda ake ganewa

-> BitmapOr
   -> Bitmap Index Scan
   -> Bitmap Index Scan

shawarwari

Amfani UNION [ALL] don haɗa abubuwan da ake buƙata don kowane OR-bulogin yanayi.

Alal misali:

CREATE TABLE tbl AS
SELECT
  generate_series(1, 100000) pk  -- 100K "фактов"
, CASE
    WHEN random() < 1::real/16 THEN NULL -- с вероятностью 1:16 запись "ничья"
    ELSE (random() * 100)::integer -- 100 разных внешних ключей
  END fk_own;

CREATE INDEX ON tbl(fk_own, pk); -- индекс с "вроде как подходящей" сортировкой

SELECT
  *
FROM
  tbl
WHERE
  fk_own = 1 OR -- свои
  fk_own IS NULL -- ... или "ничьи"
ORDER BY
  pk
, (fk_own = 1) DESC -- сначала "свои"
LIMIT 20;

Girke-girke don tambayoyin SQL marasa lafiya
[duba bayanin.tensor.ru]

Gyara:

(
  SELECT
    *
  FROM
    tbl
  WHERE
    fk_own = 1 -- сначала "свои" 20
  ORDER BY
    pk
  LIMIT 20
)
UNION ALL
(
  SELECT
    *
  FROM
    tbl
  WHERE
    fk_own IS NULL -- потом "ничьи" 20
  ORDER BY
    pk
  LIMIT 20
)
LIMIT 20; -- но всего - 20, больше и не надо

Girke-girke don tambayoyin SQL marasa lafiya
[duba bayanin.tensor.ru]

Mun yi amfani da gaskiyar cewa an karɓi duk bayanan 20 da ake buƙata nan da nan a farkon toshe, don haka na biyu, tare da mafi “tsada” Bitmap Heap Scan, ba a ma kashe shi ba - a ƙarshe. 22x sauri, 44x ƙarancin karantawa!

Ƙarin cikakken labari game da wannan hanyar ingantawa ta amfani da takamaiman misalai ana iya karantawa a cikin labarai PostgreSQL Antipatterns: JOINs masu cutarwa da ORs и PostgreSQL Antipatterns: tatsuniya na sake fasalin bincike da suna, ko "Ingantacciyar gaba da gaba".

Gabaɗaya sigar zaɓin da aka ba da oda bisa maɓalli da yawa (kuma ba kawai const/NULL biyu ba) an tattauna a cikin labarin SQL Yadda To: rubuta madauki na ɗan lokaci kai tsaye a cikin tambayar, ko "Mataki na farko na farko".

#4: Mun karanta abubuwa da yawa waɗanda ba dole ba

Lokacin da ya taso

A matsayinka na mai mulki, yana tasowa lokacin da kake son "haɗa wani tace" zuwa buƙatar da ta riga ta kasance.

"Kuma ba ku da guda ɗaya, amma tare da maɓallan uwar-lu'u-lu'u? " fim din "The Diamond Arm"

Misali, gyara aikin da ke sama, nuna buƙatun "masu mahimmanci" 20 na farko don aiki, ba tare da la'akari da manufarsu ba.

Yadda ake ganewa

-> Seq Scan | Bitmap Heap Scan | Index [Only] Scan [Backward]
   && 5 × rows < RRbF -- отфильтровано >80% прочитанного
   && loops × RRbF > 100 -- и при этом больше 100 записей суммарно

shawarwari

Ƙirƙiri [ƙari] na musamman index tare da yanayin INA ko haɗa ƙarin filaye a cikin fihirisar.

Idan yanayin tacewa ya kasance "a tsaye" don manufar ku - wato baya nufin fadadawa jerin dabi'u a nan gaba - yana da kyau a yi amfani da ma'anar WHERE. Matsayi daban-daban na boolean/enum sun dace da wannan rukunin.

Idan yanayin tacewa na iya ɗaukar ma'anoni daban-daban, to yana da kyau a fadada index tare da waɗannan filayen - kamar yadda a cikin halin da ake ciki tare da BitmapAnd sama.

Alal misali:

CREATE TABLE tbl AS
SELECT
  generate_series(1, 100000) pk -- 100K "фактов"
, CASE
    WHEN random() < 1::real/16 THEN NULL
    ELSE (random() * 100)::integer -- 100 разных внешних ключей
  END fk_own
, (random() < 1::real/50) critical; -- 1:50, что заявка "критичная"

CREATE INDEX ON tbl(pk);
CREATE INDEX ON tbl(fk_own, pk);

SELECT
  *
FROM
  tbl
WHERE
  critical
ORDER BY
  pk
LIMIT 20;

Girke-girke don tambayoyin SQL marasa lafiya
[duba bayanin.tensor.ru]

Gyara:

CREATE INDEX ON tbl(pk)
  WHERE critical; -- добавили "статичное" условие фильтрации

Girke-girke don tambayoyin SQL marasa lafiya
[duba bayanin.tensor.ru]

Kamar yadda kake gani, tacewa gaba ɗaya ya ɓace daga shirin, kuma buƙatar ta zama Sau 5 cikin sauri.

#5: tebur mai ban mamaki

Lokacin da ya taso

Ƙoƙari daban-daban don ƙirƙirar jerin gwanon sarrafa aikin ku, lokacin da yawancin sabuntawa / gogewa na rikodin akan tebur ya kai ga yanayin adadi mai yawa na rikodin "matattu".

Yadda ake ganewa

-> Seq Scan | Bitmap Heap Scan | Index [Only] Scan [Backward]
   && loops × (rows + RRbF) < (shared hit + shared read) × 8
      -- прочитано больше 1KB на каждую запись
   && shared hit + shared read > 64

shawarwari

Yi da hannu akai-akai VACUUM [FULL] ko samun isasshen horo akai-akai autovacuum ta hanyar daidaita sigoginsa, gami da don takamaiman tebur.

A mafi yawan lokuta, irin waɗannan matsalolin suna faruwa ne ta hanyar ƙayyadaddun ƙayyadaddun tambaya lokacin da ake kira daga dabarun kasuwanci kamar waɗanda aka tattauna a ciki PostgreSQL Antipatterns: yakar gungun “matattu”.

Amma kuna buƙatar fahimtar cewa ko da VACUUM FULL ba koyaushe yana taimakawa ba. Don irin waɗannan lokuta, yana da daraja sanin kanku tare da algorithm daga labarin DBA: lokacin da VACUUM ya kasa, muna tsaftace tebur da hannu.

#6: Karatu daga “tsakiyar” fihirisar

Lokacin da ya taso

Da alama mun karanta kadan, kuma komai an jera shi, kuma ba mu tace kowa fiye da haka ba - amma duk da haka muna karanta ƙarin shafuka fiye da yadda muke so.

Yadda ake ganewa

-> Index [Only] Scan [Backward]
   && loops × (rows + RRbF) < (shared hit + shared read) × 8
      -- прочитано больше 1KB на каждую запись
   && shared hit + shared read > 64

shawarwari

Dubi tsarin fihirisar da aka yi amfani da shi da maɓallan filayen da aka kayyade a cikin tambayar - mai yiwuwa Ba a ƙayyade wani ɓangare na fihirisar ba. Mafi mahimmanci za ku ƙirƙiri irin wannan fihirisar, amma ba tare da filayen prefix ko koyi maimaita dabi'unsu.

Alal misali:

CREATE TABLE tbl AS
SELECT
  generate_series(1, 100000) pk      -- 100K "фактов"
, (random() *  100)::integer fk_org  -- 100 разных внешних ключей
, (random() * 1000)::integer fk_cli; -- 1K разных внешних ключей

CREATE INDEX ON tbl(fk_org, fk_cli); -- все почти как в #2
-- только вот отдельный индекс по fk_cli мы уже посчитали лишним и удалили

SELECT
  *
FROM
  tbl
WHERE
  fk_cli = 999 -- а fk_org не задано, хотя стоит в индексе раньше
LIMIT 20;

Girke-girke don tambayoyin SQL marasa lafiya
[duba bayanin.tensor.ru]

Komai yana da kyau, ko da bisa ga index, amma yana da ko ta yaya m - ga kowane daga cikin 20 records karanta, dole ne mu cire 4 shafukan bayanai, 32KB kowane rikodin - wannan ba m? Kuma index sunan tbl_fk_org_fk_cli_idx tada hankali.

Gyara:

CREATE INDEX ON tbl(fk_cli);

Girke-girke don tambayoyin SQL marasa lafiya
[duba bayanin.tensor.ru]

Nan da nan - Sau 10 cikin sauri, kuma sau 4 ƙasa da karantawa!

Sauran misalan yanayi na rashin amfani da fihirisa za a iya gani a cikin labarin DBA: gano maƙasudai marasa amfani.

#7: CTE × CTE

Lokacin da ya taso

A nema ya ci "fat" CTE daga teburi daban-daban, sannan suka yanke shawarar yin hakan a tsakaninsu JOIN.

Shari'ar ta dace da nau'ikan da ke ƙasa v12 ko buƙatun tare da WITH MATERIALIZED.

Yadda ake ganewa

-> CTE Scan
   && loops > 10
   && loops × (rows + RRbF) > 10000
      -- слишком большое декартово произведение CTE

shawarwari

Yi nazarin buƙatar a hankali - kuma Ana buƙatar CTEs anan gaba ɗaya?? Idan eh, to Aiwatar da "kamus" a hstore/json bisa ga samfurin da aka bayyana a ciki PostgreSQL Antipatterns: bari mu buga nauyi JOIN tare da ƙamus.

#8: musanyawa zuwa faifai (an rubuta temp)

Lokacin da ya taso

Sarrafa lokaci ɗaya (ƙirƙira ko keɓancewa) na babban adadin rikodin bai dace da ƙwaƙwalwar ajiya da aka keɓe don wannan ba.

Yadda ake ganewa

-> *
   && temp written > 0

shawarwari

Idan adadin ƙwaƙwalwar ajiyar da aikin ke amfani da shi bai wuce ƙayyadadden ƙimar sigar ba aiki_mem, yana da kyau a gyara shi. Kuna iya nan da nan a cikin saitunan don kowa, ko kuna iya shiga SET [LOCAL] don takamaiman buƙatu/ma'amala.

Alal misali:

SHOW work_mem;
-- "16MB"

SELECT
  random()
FROM
  generate_series(1, 1000000)
ORDER BY
  1;

Girke-girke don tambayoyin SQL marasa lafiya
[duba bayanin.tensor.ru]

Gyara:

SET work_mem = '128MB'; -- перед выполнением запроса

Girke-girke don tambayoyin SQL marasa lafiya
[duba bayanin.tensor.ru]

Don dalilai masu ma'ana, idan an yi amfani da ƙwaƙwalwar ajiya kawai ba faifai ba, to za a aiwatar da tambayar da sauri. A lokaci guda kuma, ana cire ɓangaren kaya daga HDD.

Amma kuna buƙatar fahimtar cewa ba koyaushe za ku iya rarraba ƙuri'a da ƙwaƙwalwar ajiya ba - kawai ba za ku isa ga kowa ba.

#9: ƙididdiga marasa mahimmanci

Lokacin da ya taso

Sun zubar da yawa a cikin bayanan lokaci guda, amma ba su da lokacin fitar da su ANALYZE.

Yadda ake ganewa

-> Seq Scan | Bitmap Heap Scan | Index [Only] Scan [Backward]
   && ratio >> 10

shawarwari

Ci gaba da shi ANALYZE.

An bayyana wannan halin dalla-dalla a cikin PostgreSQL Antipatterns: statistics ne komai.

#10: "wani abu ya faru"

Lokacin da ya taso

An yi jira don kulle ta hanyar buƙatun gasa, ko kuma babu isassun kayan aikin CPU/mai haɓakawa.

Yadda ake ganewa

-> *
   && (shared hit / 8K) + (shared read / 1K) < time / 1000
      -- RAM hit = 64MB/s, HDD read = 8MB/s
   && time > 100ms -- читали мало, но слишком долго

shawarwari

Yi amfani da waje tsarin kulawa uwar garken don toshewa ko rashin amfani da albarkatu. Mun riga mun yi magana game da sigar mu na tsara wannan tsari don ɗaruruwan sabobin a nan и a nan.

Girke-girke don tambayoyin SQL marasa lafiya
Girke-girke don tambayoyin SQL marasa lafiya

source: www.habr.com

Add a comment