Thousands of managers in sales offices across the country record ...
So it is no surprise that, while once again analyzing "heavy" queries on one of our most loaded databases (our own), the "quick" search by name turned up.
Moreover, further investigation revealed an interesting example of first optimization and then performance degradation of a query as it was successively reworked by several teams, each acting with the best of intentions.
0: what did the user want?
What does a user usually mean by a "quick" search by name? It almost never turns out to be an "honest" substring search like ... LIKE '%роза%', because then the results include not only 'Розалия' and 'Магазин Роза' but also 'Гроза' and even 'Дом Деда Мороза'.
On an everyday level the user assumes that you will search by the beginning of a word in the name and rank as more relevant whatever starts with the text entered. And that you will do it almost instantly, as they type.
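To see the difference on the four names above, here is a minimal sketch that compares a plain substring match with a match anchored to the start of a word. It assumes a database locale in which lower() folds Cyrillic letters (under the C locale it would not) and the default standard_conforming_strings = on, so that '\s' reaches the regex engine unchanged. The inline VALUES list merely stands in for a real table.

```sql
-- Illustration only: the inline VALUES list stands in for a real table.
SELECT
  name
, lower(name) LIKE '%роза%'  AS substring_match   -- "honest" substring search
, lower(name) ~ '(^|\s)роза' AS word_prefix_match -- some word in the name starts with the input
FROM
  (VALUES
    ('Розалия')
  , ('Магазин Роза')
  , ('Гроза')
  , ('Дом Деда Мороза')
  ) AS t(name);
```

All four rows should pass the substring test, while only 'Розалия' and 'Магазин Роза' should pass the word-prefix test, which is the behaviour the user actually expects.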
1: limiting the task
All the more so, nobody will deliberately type 'роз магаз' so that you have to run a prefix search on every word. No, it is much easier for the user to react to a quick hint for the last word than to purposely leave the previous ones unfinished. Look at how any search engine handles this.
In general, formulating the requirements correctly is more than half of the solution. Sometimes a careful analysis of the use case changes the outcome significantly.
So what does the abstract developer do?
1.0: external search engine
Oh, search is hard, I don't want to do anything at all, let's hand it to devops! Let them deploy a search engine external to the database: Sphinx, ElasticSearch, ...
It is a workable option, though labor-intensive when it comes to synchronization and how quickly changes become visible. But not in our case, since the search for each client runs only within that client's own account data. And the data changes a lot: if a manager has just entered the card 'Магазин Роза', then 5-10 seconds later they may remember that they forgot to fill in the email, and want to find the card and fix it.
So let's search "directly in the database". Fortunately, PostgreSQL lets us do this, and in more than one way. Let's look at them.
1.1: "honest" substring
We latch onto the word "substring". For index search by substring (and even by regular expressions!) there is the excellent pg_trgm module.
Let's take the following table to keep the model simple:
CREATE TABLE firms(
  id
    serial
      PRIMARY KEY
, name
    text
);
We load 7.8 million records of real organizations into it and index them:
CREATE EXTENSION pg_trgm;
CREATE INDEX ON firms USING gin(lower(name) gin_trgm_ops);
Let's look at the first 10 records for an as-you-type search:
SELECT
  *
FROM
  firms
WHERE
  lower(name) ~ ('(^|\s)' || 'роза')
ORDER BY
  lower(name) ~ ('^' || 'роза') DESC -- first, names that start with the input
, lower(name) -- the rest alphabetically
LIMIT 10;
Well, that is... 26ms, 31MB of data read and more than 1.7K records filtered out, all for the 10 we wanted. The overhead is too high; is there something more efficient?
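Figures of this kind can be read off the query plan. The article does not say how its own numbers were collected, so the following is only one way to get comparable ones, assuming the firms table and the trigram index defined above: EXPLAIN with the ANALYZE and BUFFERS options actually runs the query and reports timing per plan node together with the number of shared buffer pages hit and read.

```sql
-- Runs the query for real and prints timings and buffer usage per node.
EXPLAIN (ANALYZE, BUFFERS)
SELECT
  *
FROM
  firms
WHERE
  lower(name) ~ ('(^|\s)' || 'роза')
ORDER BY
  lower(name) ~ ('^' || 'роза') DESC
, lower(name)
LIMIT 10;

-- Buffer counts are given in pages; multiply by the block size
-- (8kB unless the server was built otherwise) to get a volume.
SHOW block_size;
```

In the output, the "Buffers: shared hit=... read=..." lines give the volume touched, and the "Rows Removed by ..." lines show how many candidate rows were fetched only to be thrown away.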
1.2: searching by words? That's FTS!
Indeed, PostgreSQL provides a very powerful full-text search (FTS) mechanism, including prefix search:
CREATE INDEX ON firms USING gin(to_tsvector('simple'::regconfig, lower(name)));
SELECT
  *
FROM
  firms
WHERE
  to_tsvector('simple'::regconfig, lower(name)) @@ to_tsquery('simple', 'роза:*')
ORDER BY
  lower(name) ~ ('^' || 'роза') DESC
, lower(name)
LIMIT 10;
Here parallel query execution helped a little, cutting the time roughly in half to 11ms. We also had to read 1.5 times less: 20MB in total. And here less is better, because the more we read, the higher the chance of a cache miss, and every extra page of data read from disk is a potential "brake" on the query.
1.3: LIKE after all?
The previous query is fine in every respect, except that if it is called many times a day the volume adds up: at roughly a hundred thousand calls, 20MB each comes to about 2TB of data read. In the best case that comes from memory, but if you are unlucky, from disk. So let's try to make it smaller.
Recall that what the user wants to see first is whatever "starts with ...". That is prefix search in its purest form, using text_pattern_ops! Only if we come up short of the 10 records we need will we have to read the rest using FTS:
CREATE INDEX ON firms(lower(name) text_pattern_ops);
SELECT
  *
FROM
  firms
WHERE
  lower(name) LIKE ('роза' || '%')
LIMIT 10;
Excellent performance: just 0.05ms and a little over 100KB read! Only we forgot to sort by name, so that the user does not get lost in the results:
SELECT
  *
FROM
  firms
WHERE
  lower(name) LIKE ('роза' || '%')
ORDER BY
  lower(name)
LIMIT 10;
Oh, something is not so pretty any more: the index seems to be there, but the sort flies right past it...
1.4: "finishing it off with a file"
But there is an index that supports range search and can still be used for sorting: the ordinary btree!
CREATE INDEX ON firms(lower(name));
Only the query for it has to be "assembled by hand":
SELECT
  *
FROM
  firms
WHERE
  lower(name) >= 'роза' AND
  lower(name) <= ('роза' || chr(65535)) -- for UTF8; for single-byte encodings use chr(255)
ORDER BY
  lower(name)
LIMIT 10;
Excellent: sorting works, and resource consumption stays "microscopic", thousands of times more efficient than "pure" FTS! All that remains is to put everything together into a single query:
(
  SELECT
    *
  FROM
    firms
  WHERE
    lower(name) >= 'роза' AND
    lower(name) <= ('роза' || chr(65535)) -- for UTF8; for single-byte encodings use chr(255)
  ORDER BY
    lower(name)
  LIMIT 10
)
UNION ALL
(
  SELECT
    *
  FROM
    firms
  WHERE
    to_tsvector('simple'::regconfig, lower(name)) @@ to_tsquery('simple', 'роза:*') AND
    lower(name) NOT LIKE ('роза' || '%') -- names "starting with" the input were already found above
  ORDER BY
    lower(name) ~ ('^' || 'роза') DESC -- same sort as before, so as NOT to go through the btree index
  , lower(name)
  LIMIT 10
)
LIMIT 10;
Note that the second subquery is executed only if the first one returns fewer rows than the final LIMIT expects. I have written about this way of optimizing queries before.
So yes, we now have both a btree and a gin index on the table, but statistically fewer than 10% of requests ever reach the second block. In other words, with constraints like these known in advance, we were able to cut the total consumption of server resources by almost a thousand times!
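Whether the second block ran for a particular input can be checked in the plan. This is a sketch under the same table and index assumptions as above: when the first branch alone satisfies the outer LIMIT, EXPLAIN ANALYZE typically marks the plan nodes of the second branch as "(never executed)".

```sql
EXPLAIN (ANALYZE, BUFFERS)
(
  SELECT
    *
  FROM
    firms
  WHERE
    lower(name) >= 'роза' AND
    lower(name) <= ('роза' || chr(65535))
  ORDER BY
    lower(name)
  LIMIT 10
)
UNION ALL
(
  SELECT
    *
  FROM
    firms
  WHERE
    to_tsvector('simple'::regconfig, lower(name)) @@ to_tsquery('simple', 'роза:*') AND
    lower(name) NOT LIKE ('роза' || '%')
  ORDER BY
    lower(name) ~ ('^' || 'роза') DESC
  , lower(name)
  LIMIT 10
)
LIMIT 10;
```

For a rare prefix that yields fewer than 10 names, the same plan should instead show real timings and buffer counts on the FTS branch, which makes the cost of falling through easy to compare.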
1.5 *: no file needed after all
Above, the unsuitable sort order kept us from using LIKE. But it can be "set on the right path" by specifying an operator with USING. The PostgreSQL documentation on ORDER BY says:
"If not specified, ASC is assumed by default. Alternatively, a specific ordering operator name can be specified in the USING clause. An ordering operator must be a less-than or greater-than member of some B-tree operator family. ASC is usually equivalent to USING < and DESC is usually equivalent to USING >."
In our case, "less than" is ~<~:
SELECT
  *
FROM
  firms
WHERE
  lower(name) LIKE ('роза' || '%')
ORDER BY
  lower(name) USING ~<~
LIMIT 10;
2: how queries go sour
Now we leave our query to "simmer" for six months or a year, and are surprised to find it "at the top" again, with a daily memory "pumping" volume (buffers shared hit) of 5.5TB, which is even more than it was at the very beginning.
No, of course, our business has grown and the load has grown with it, but not by that much! So something fishy is going on here. Let's figure it out.
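The article does not say which tool produced this "top". As one possible way to build a similar ranking, here is a sketch that assumes the pg_stat_statements extension is available (it must be listed in shared_preload_libraries) and sorts statements by the volume of shared buffers they hit.

```sql
-- Assumes pg_stat_statements is listed in shared_preload_libraries.
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

SELECT
  calls
, pg_size_pretty(shared_blks_hit  * current_setting('block_size')::bigint) AS buffers_hit
, pg_size_pretty(shared_blks_read * current_setting('block_size')::bigint) AS buffers_read
, left(query, 80) AS query_head -- first 80 characters of the normalized statement
FROM
  pg_stat_statements
ORDER BY
  shared_blks_hit DESC
LIMIT 10;
```

The counters are cumulative since the last reset, so a per-day figure like the one quoted above requires either calling pg_stat_statements_reset() at the start of the window or taking the difference between two snapshots.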
2.1: the birth of paging
At some point, one of the development teams wanted to make it possible to "jump" from the quick as-you-type search into a registry showing the same results, only more of them. And what is a registry without paging? Let's bolt it on!
( ... LIMIT <N> + 10)
UNION ALL
( ... LIMIT <N> + 10)
LIMIT 10 OFFSET <N>;
Now the registry of search results could be shown with "page by page" loading, at no strain for the developer.
Of course, in reality each subsequent page reads more and more data (everything from the previous pages, which we throw away, plus the needed "tail"), so this is an unambiguous antipattern. It would be more correct to start the next iteration of the search from a key stored in the interface, but that is a topic for another time.
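Continuing from a stored key might look roughly like the sketch below. It is not the author's implementation: it covers only the prefix (btree) branch, assumes a hypothetical composite index on (lower(name), id) so that id can break ties between equal names, and uses made-up values for the last row of the previous page.

```sql
-- Hypothetical supporting index; the article only defines firms(lower(name)).
CREATE INDEX ON firms(lower(name), id);

SELECT
  *
FROM
  firms
WHERE
  lower(name) >= 'роза' AND
  lower(name) <= ('роза' || chr(65535)) AND
  (lower(name), id) > ('роза ветров', 12345) -- key of the last row already shown (made-up values)
ORDER BY
  lower(name)
, id
LIMIT 10;
```

With this shape every page reads about the same small number of index entries regardless of how deep the user has scrolled, instead of re-reading and discarding everything before OFFSET.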
2.2: I want something exotic
At some point a developer wanted to enrich the resulting sample with data from another table, and for that the whole previous query was moved into a CTE:
WITH q AS (
  ...
  LIMIT <N> + 10
)
SELECT
  *
, (SELECT ...) sub_query -- some query against a related table
FROM
  q
LIMIT 10 OFFSET <N>;
And even so it is not bad, because the subquery is evaluated only for the 10 records returned. If it were not for...
2.3: DISTINCT, senseless and merciless
Somewhere along this evolution, the NOT LIKE condition got lost from the 2nd subquery. Naturally, after that UNION ALL began returning some records twice: first found by the start of the line, and then again by the start of the first word of that same line. In the limit, all records of the 2nd subquery could coincide with records of the first.
What does a developer do instead of looking for the cause?.. No question!
- double the size of the original samples
- apply DISTINCT to get only one copy of each row
WITH q AS (
  ( ... LIMIT <2 * N> + 10)
  UNION ALL
  ( ... LIMIT <2 * N> + 10)
  LIMIT <2 * N> + 10
)
SELECT DISTINCT
  *
, (SELECT ...) sub_query
FROM
  q
LIMIT 10 OFFSET <N>;
That is, the end result is clearly exactly the same, but the chance of "flying into" the 2nd CTE subquery has become much higher, and even without that, noticeably more data is read.
But that is not the saddest part. Since the developer asked for DISTINCT not on specific fields but on all fields of the record at once, the sub_query field (the result of the subquery) was automatically included too. Now, to execute DISTINCT, the database had to run not 10 subqueries but all <2 * N> + 10 of them!
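One way out is sketched below, under assumptions. With the NOT LIKE condition restored the two branches cannot overlap, so DISTINCT becomes unnecessary, and paging can be applied inside the CTE so that the correlated subquery runs only for the rows actually returned. The firm_contacts table and the count it feeds are hypothetical stand-ins for the elided (SELECT ...), and N = 20 is an arbitrary example value. Like the original query, it relies on UNION ALL returning the branches in order, which PostgreSQL does in practice but SQL does not guarantee.

```sql
-- Hypothetical related table, standing in for the article's elided subquery.
CREATE TABLE IF NOT EXISTS firm_contacts(
  firm_id integer
, email   text
);

WITH q AS (
  (
    SELECT
      *
    FROM
      firms
    WHERE
      lower(name) >= 'роза' AND
      lower(name) <= ('роза' || chr(65535))
    ORDER BY
      lower(name)
    LIMIT 20 + 10 -- <N> + 10 with N = 20
  )
  UNION ALL
  (
    SELECT
      *
    FROM
      firms
    WHERE
      to_tsvector('simple'::regconfig, lower(name)) @@ to_tsquery('simple', 'роза:*') AND
      lower(name) NOT LIKE ('роза' || '%') -- restored: keeps the two branches disjoint
    ORDER BY
      lower(name) ~ ('^' || 'роза') DESC
    , lower(name)
    LIMIT 20 + 10
  )
  LIMIT 10 OFFSET 20 -- paging happens before the subquery is attached
)
SELECT
  q.*
, (SELECT count(*) FROM firm_contacts c WHERE c.firm_id = q.id) AS contacts_cnt
FROM
  q;
```

This still pays the OFFSET cost described in 2.1, but the per-row subquery is back to at most 10 executions per page.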
2.4: cooperation above all!
So the developers lived on without a care, because users clearly did not have the patience to "scroll" the registry to significant values of N, given the chronic slowdown in fetching each next "page".
Until developers from another department came along and wanted to use such a convenient method for iterative search: take a chunk of some sample, filter it by additional conditions, draw the result, then take the next chunk (which in our case means increasing N), and so on until the screen is full.
All in all, in the specimen we caught N reached values of almost 17K, and in a single day at least 4K such queries were executed "down the chain". The last of them were boldly scanning 1GB of memory per iteration...
Summary
Source: www.habr.com