Optimizing database queries: the example of a B2B service for construction companies

How do you handle a 10-fold increase in the number of database queries without moving to a more powerful server, and keep the system running? I will describe how we dealt with the drop in our database's performance and how we optimized SQL queries to serve as many users as possible without increasing the cost of computing resources.

I build a service for managing business processes in construction companies. About 3 thousand companies work with us. More than 10 thousand people work with our system every day for 4-10 hours. It handles various planning, notification, warning and validation tasks. We use PostgreSQL 9.6. We have about 300 tables in the database, and up to 200 million queries (10 thousand distinct ones) arrive every day. On average we serve 3-4 thousand requests per second, and at the busiest moments more than 10 thousand requests per second. Most of the queries are OLAP. There are far fewer inserts, updates and deletes, which means the OLTP load is relatively light. I give all these numbers so that you can judge the scale of our project and decide how useful our experience may be to you.

Picture one. Lyrical

When we started development, we did not really think about what kind of load would fall on the database or what we would do if the server stopped coping. When designing the database, we followed general recommendations and tried not to shoot ourselves in the foot, but we did not go beyond general advice such as "do not use the Entity-Attribute-Value pattern". We designed according to the principles of normalization, avoided data redundancy and did not worry about the speed of particular queries. As soon as the first users arrived, we ran into performance problems. As usual, we were completely unprepared for this. The first problems turned out to be simple. As a rule, everything was solved by adding a new index. But there came a time when simple patches stopped working. Realizing that we lacked experience and that it was getting harder to understand what was causing the problems, we hired specialists who helped us configure the server properly, set up monitoring, and showed us where to look for statistics.

Picture two. Statistical

So we have about 10 thousand distinct queries executed against our database per day. Among these 10 thousand there are monsters that are executed 2-3 million times with an average execution time of 0.1-0.3 ms, and there are queries with an average execution time of 30 seconds that are called 100 times a day.

Optimizing all 10 thousand queries was not feasible, so we decided to work out where to direct our efforts in order to improve database performance. After several iterations, we began dividing queries into types.

TOP queries

These are the heaviest queries, the ones that take the most time in total (total time). They are either called very often or take a very long time to execute (queries that were both long and frequent were optimized back in the first iterations of the fight for speed). As a result, the server spends most of its time executing them. It is also important to rank the top queries separately by total execution time and by IO time. The methods for optimizing these two kinds of queries differ slightly.

The usual practice at every company is to work with TOP queries. There are few of them, and optimizing even one query can free up 5-10% of resources. However, as the project matures, optimizing TOP queries becomes an increasingly non-trivial task. All the simple methods have already been used, and the "heaviest" query takes "only" 3-5% of resources. If the TOP queries together take less than 30-40% of the time, you have most likely already put effort into making them fast, and it is time to move on to optimizing queries from the next group.
It remains to decide how many of the heaviest queries to include in this group. I usually take at least 10, but no more than 20. I try to make sure that the times of the first and last query in the TOP group differ by no more than a factor of 10. That is, if execution time drops sharply from 1st place to 10th, I take the TOP-10; if the drop is more gradual, I increase the group size to 15 or 20.
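The sizing rule for the TOP group can be checked against real statistics. The following is a minimal sketch, assuming the cumulative timings come from the pg_stat_statements extension (the same view the queries later in this article read) and that its total_time column is available, as it is in PostgreSQL 9.6. The alias ratio_to_first is only illustrative.

```sql
-- For each of the 20 heaviest queries, show how many times smaller its
-- total time is than that of the heaviest query (rank 1).
SELECT rn,
       total_time,
       round((first_value(total_time) OVER (ORDER BY rn)
              / NULLIF(total_time, 0))::numeric, 1) AS ratio_to_first
FROM (
  SELECT ROW_NUMBER () OVER (ORDER BY total_time DESC) AS rn, total_time
  FROM pg_stat_statements
) AS t
WHERE rn <= 20
ORDER BY rn;
```

If ratio_to_first is already close to 10 at rank 10, the TOP-10 is enough. If it is still well below 10 there, the group can be extended to rank 15 or 20.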

MEDIUM queries

These are all the queries that come right after the TOP, except for the last 5-10%. It is usually in optimizing these queries that the opportunity to raise server performance significantly lies. These queries can account for up to 80% of the time. But even if their share has only passed 50%, it is time to look at them more closely.

Tail

As mentioned, these queries come at the end and take 5-10% of the time. You can forget about them unless you use automatic query analysis tools, in which case optimizing them can also be cheap.

How do you assess each group?

I use an SQL query that helps make this assessment for PostgreSQL (I am sure a similar query can be written for many other DBMSs).

SQL query to estimate the size of the TOP-MEDIUM-TAIL groups

-- Share of total execution time (in percent) taken by each group
SELECT sum(time_top) AS sum_top, sum(time_medium) AS sum_medium, sum(time_tail) AS sum_tail
FROM
(
  -- 20 and 800 are the rank boundaries between the groups
  SELECT CASE WHEN rn <= 20              THEN tt_percent ELSE 0 END AS time_top,
         CASE WHEN rn > 20 AND rn <= 800 THEN tt_percent ELSE 0 END AS time_medium,
         CASE WHEN rn > 800              THEN tt_percent ELSE 0 END AS time_tail
  FROM (
    -- Rank every query by total time and express that time as a percentage of the overall total
    SELECT total_time / (SELECT sum(total_time) FROM pg_stat_statements) * 100 AS tt_percent, query,
    ROW_NUMBER () OVER (ORDER BY total_time DESC) AS rn
    FROM pg_stat_statements
  ) AS t
) AS ts

The result of the query is three columns, each containing the percentage of time spent executing queries from that group. The query contains two numbers (20 and 800 in my case) that separate the queries of one group from those of the next.

This is roughly how the shares of the query groups compared when optimization work began and how they compare now.


The chart shows that the share of TOP queries has dropped sharply, while the share of MEDIUM queries has grown.
At first, the TOP queries included outright blunders. Over time the childhood diseases disappeared, the share of TOP queries shrank, and more and more effort was needed to speed up the difficult queries.

To get the text of the queries we use the following query

SELECT * FROM (
  SELECT ROW_NUMBER () OVER (ORDER BY total_time DESC) AS rn, total_time / (SELECT sum(total_time) FROM pg_stat_statements) * 100 AS tt_percent, query
  FROM pg_stat_statements
  ORDER BY total_time DESC
) AS T
WHERE
rn <= 20 -- TOP
-- rn > 20 AND rn <= 800 -- MEDIUM
-- rn > 800  -- TAIL

Here is a list of the most frequently used techniques that helped us speed up TOP queries:

  • Redesigning the system, for example reworking the notification logic to use a message broker instead of periodic queries to the database.
  • Adding or changing indexes
  • Rewriting ORM queries in pure SQL
  • Rewriting lazy data loading logic
  • Caching through data denormalization. For example, we have a chain of tables Delivery -> Invoice -> Request -> Application. That is, each delivery is linked to an application through other tables. To avoid joining all the tables in every query, we duplicated the reference to the application in the Delivery table.
  • Caching static reference tables and rarely changing tables in program memory.
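The denormalization item can be illustrated with a short sketch. All table and column names below (delivery, invoice, request, application, invoice_id, request_id, application_id) are hypothetical stand-ins for the chain described above, not the service's real schema.

```sql
-- Assumed chain: delivery.invoice_id -> invoice.request_id -> request.application_id

-- 1. Duplicate the link to the application directly in the delivery table.
ALTER TABLE delivery ADD COLUMN application_id bigint REFERENCES application (id);

-- 2. Backfill the new column once from the existing chain of tables.
UPDATE delivery d
SET application_id = r.application_id
FROM invoice i
JOIN request r ON r.id = i.request_id
WHERE i.id = d.invoice_id;

-- 3. Index the shortcut so lookups by application stay cheap.
CREATE INDEX delivery_application_id_idx ON delivery (application_id);

-- Before: every lookup walks the whole chain.
SELECT d.*
FROM delivery d
JOIN invoice i ON i.id = d.invoice_id
JOIN request r ON r.id = i.request_id
WHERE r.application_id = 42;

-- After: a single-table lookup.
SELECT d.* FROM delivery d WHERE d.application_id = 42;
```

The price of this shortcut is that the duplicated column has to be kept in sync whenever a link in the chain changes, for example in application code or with a trigger.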

Sometimes the changes amounted to an impressive redesign, but they relieved 5-10% of the system load and were justified. Over time the gains became smaller and smaller, while the redesign required became more and more serious.

Then we turned our attention to the second group of queries, the MEDIUM group. It contains many more queries, and it seemed that analyzing the whole group would take a great deal of time. However, most of the queries turned out to be very easy to optimize, and many problems were repeated dozens of times in different variations. Here are examples of some typical optimizations that we applied to dozens of similar queries. Each group of optimized queries unloaded the database by 3-5%.

  • Instead of checking for the existence of records with COUNT and a full table scan, we started using EXISTS
  • We got rid of DISTINCT (there is no general recipe, but sometimes you can easily drop it and speed up the query 10-100 times).

    For example, instead of a query that selects all drivers through the large deliveries table (DELIVERY)

    SELECT DISTINCT P.ID, P.FIRST_NAME, P.LAST_NAME
    FROM DELIVERY D JOIN PERSON P ON D.DRIVER_ID = P.ID
    

    we run a query against the relatively small PERSON table

    SELECT P.ID, P.FIRST_NAME, P.LAST_NAME
    FROM PERSON P
    WHERE EXISTS (SELECT D.ID FROM DELIVERY D WHERE D.DRIVER_ID = P.ID)
    

    It may look as though we are using a correlated subquery here, yet it gives a speedup of more than 10 times.

  • In many cases COUNT was dropped altogether and
    replaced by computing an approximate value
  • instead of
    UPPER(s) LIKE 'JOHN%'
    

    use

    s ILIKE 'John%'
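Two items from the list above, sketched in SQL. The delivery table and its driver_id column mirror the DELIVERY example. Reading the approximate count from pg_class.reltuples is only one possible approach (the article does not say how the approximate values were computed), and the figure is only as fresh as the last VACUUM or ANALYZE.

```sql
-- Existence check with COUNT: every matching row is counted
-- just to compare the result with zero.
SELECT count(*) > 0 FROM delivery WHERE driver_id = 42;

-- Existence check with EXISTS: the scan can stop at the first match.
SELECT EXISTS (SELECT 1 FROM delivery WHERE driver_id = 42);

-- Approximate row count of the whole table, read from planner statistics
-- instead of scanning the table with COUNT(*).
SELECT reltuples::bigint AS approx_rows
FROM pg_class
WHERE oid = 'delivery'::regclass;
```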
    

Individual queries were sometimes sped up by a factor of 3 to 1000. Despite the impressive figures, at first it seemed pointless to optimize a query that takes 10 ms to execute, sits in the third hundred of the heaviest queries, and accounts for hundredths of a percent of total database load time. But by applying the same recipe to a group of queries of the same type, we won back several percent. To avoid wasting time manually reviewing hundreds of queries, we wrote a few simple scripts that use regular expressions to find queries of the same type. As a result, automatically searching for groups of queries let us improve performance further with modest effort.
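The article does not show those scripts. As a rough illustration of the idea only, the sketch below does a similar search directly in SQL with PostgreSQL regular expressions over pg_stat_statements. The three patterns and their labels are assumptions chosen to match the optimizations listed above.

```sql
-- Group recorded queries by a recognizable anti-pattern and show how much
-- total time each group accounts for. A query matching several patterns is
-- counted under the first one only.
SELECT pattern,
       count(*) AS queries,
       round(sum(total_time)::numeric, 1) AS total_time_ms
FROM (
  SELECT total_time,
         CASE
           WHEN query ~* '\mselect\s+distinct\M'           THEN 'DISTINCT'
           WHEN query ~* '\mcount\s*\('                    THEN 'COUNT'
           WHEN query ~* '\mupper\s*\([^)]*\)\s+like\M'    THEN 'UPPER ... LIKE'
         END AS pattern
  FROM pg_stat_statements
) AS s
WHERE pattern IS NOT NULL
GROUP BY pattern
ORDER BY total_time_ms DESC;
```

A group that adds up to a few percent of total time is a candidate for applying one recipe to all of its queries at once.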

As a result, we have been running on the same hardware for three years now. The average daily load is about 30%, and at peaks it reaches 70%. The number of queries, like the number of users, has grown roughly 10 times. All of this is thanks to constant monitoring of the TOP and MEDIUM query groups. As soon as a new query appears in the TOP group, we analyze it immediately and try to speed it up. We review the MEDIUM group once a week using the query analysis scripts. If we come across new queries that we already know how to optimize, we change them quickly. Sometimes we find new optimization methods that can be applied to several queries at once.

According to our forecasts, the current server will withstand a further 3-5 times increase in the number of users. True, we have one more ace up our sleeve: we still have not moved SELECT queries to a replica, as is recommended. We hold back deliberately, because we want to exhaust the possibilities of "smart" optimization before bringing in the "heavy artillery".
A critical look at the work done might suggest vertical scaling instead: buy a more powerful server rather than spend specialists' time. The server might not even cost that much, especially since we have not yet reached the limits of vertical scaling. However, it is only the number of requests that grew 10 times. Over the years the functionality of the system has grown, and there are now more kinds of queries. Thanks to caching, the functionality that already existed is now served by fewer queries, and by more efficient ones. This means you can safely multiply by another 5 to get the real speedup factor. So, by the most conservative estimate, the speedup was 50 times or more. Scaling the server vertically by a factor of 50 would have cost more, especially considering that an optimization, once done, keeps working all the time, while the bill for a rented server arrives every month.

Source: www.hab.com
