Thousands of managers in sales offices across the country record client contacts in our system.
So it is no surprise that we once again found ourselves analyzing "heavy" queries on our own database, one of the most heavily loaded we have.
Further investigation turned up an interesting example: first an optimization, then a performance degradation. The query was refined in turn by several teams, each acting with only the best intentions.
0: What did the user want?
What does a user usually mean by a "quick" search by name? It almost never turns out to be an "honest" substring search like ... LIKE '%роза%', because then the result includes not only 'Розалия' and 'Магазин Роза', but also 'Гроза' and even 'Дом Деда Мороза'.
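To see the difference on the four names above, the two predicates can be compared side by side. This is only a minimal sketch: the inline VALUES list stands in for a real table, and it assumes a UTF8 database whose locale lowercases Cyrillic correctly.

```sql
-- Compare a plain substring match with a "start of a word" match.
SELECT
  name
, lower(name) LIKE '%роза%'  AS substring_match   -- matches anywhere in the string
, lower(name) ~ '(^|\s)роза' AS word_prefix_match -- matches only at the start of a word
FROM
  (VALUES
    ('Розалия')
  , ('Магазин Роза')
  , ('Гроза')
  , ('Дом Деда Мороза')
  ) AS t(name);
```

Under those assumptions the substring predicate accepts all four names, while the word-prefix predicate should keep only the first two.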
The user assumes, at an everyday level, that you will search by the beginning of a word in the name and rank as more relevant whatever starts with the text they typed. And that you will do it almost instantly, as they type.
1: Narrowing down the task
What is more, nobody is going to deliberately type 'роз магаз' so that you have to search every word by prefix. It is much easier for the user to react to a quick suggestion for the last word than to deliberately "under-type" the preceding ones; look at how any search engine handles this.
In general, formulating the requirements correctly is more than half of the solution. Sometimes that calls for a careful use-case analysis.
So what does the abstract developer do?
1.0: An external search engine
Oh, search is hard and we don't feel like doing any of it, so let's hand it to DevOps! A search engine external to the database gets deployed: Sphinx, ElasticSearch, and so on.
This is a workable option, although labor-intensive in terms of synchronization and how quickly changes show up. But it does not fit our case, because the search runs only within the data of each client's own account. And that data is highly volatile: if a manager has just entered the card 'Магазин Роза', then 5-10 seconds later they may already remember that they forgot to specify an email there and want to find the card and fix it.
So let's search "right in the database". Fortunately, PostgreSQL lets us do this, and in more than one way. Let's look at the options.
1.1: The "honest" substring
We latch onto the word "substring". And for indexed search by substring (and even by regular expression) there is an excellent facility: the pg_trgm module.
To keep the model simple, let's take the following table:
CREATE TABLE firms(
  id
    serial
      PRIMARY KEY
, name
    text
);
We load 7.8 million records of real organizations into it and index them:
CREATE EXTENSION pg_trgm;
CREATE INDEX ON firms USING gin(lower(name) gin_trgm_ops);
Let's look up the first 10 records for an as-you-type search:
SELECT
  *
FROM
  firms
WHERE
  lower(name) ~ ('(^|\s)' || 'роза')
ORDER BY
  lower(name) ~ ('^' || 'роза') DESC -- first, the names that start with the input
, lower(name) -- the rest in alphabetical order
LIMIT 10;
Well, so-so... 26ms, 31MB of data read and more than 1.7K records filtered out, all for the sake of the 10 we were looking for. The overhead is too high; isn't there something more efficient?
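Figures like these can be read off the query plan. A minimal sketch, assuming the firms table and the trigram index defined above: EXPLAIN with the ANALYZE and BUFFERS options reports the execution time, the rows removed by filters or index rechecks, and the number of shared buffers hit or read.

```sql
-- Run the query for real and report timing plus buffer usage per plan node.
EXPLAIN (ANALYZE, BUFFERS)
SELECT
  *
FROM
  firms
WHERE
  lower(name) ~ ('(^|\s)' || 'роза')
ORDER BY
  lower(name) ~ ('^' || 'роза') DESC
, lower(name)
LIMIT 10;
```

Each buffer is one page, so with the default 8KB block size the volume of data touched is roughly the buffer count multiplied by 8KB.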
1.2: Search by text? That's FTS!
Indeed, PostgreSQL provides a very powerful full-text search (FTS) mechanism, including prefix search. Let's try it:
CREATE INDEX ON firms USING gin(to_tsvector('simple'::regconfig, lower(name)));
SELECT
  *
FROM
  firms
WHERE
  to_tsvector('simple'::regconfig, lower(name)) @@ to_tsquery('simple', 'роза:*')
ORDER BY
  lower(name) ~ ('^' || 'роза') DESC
, lower(name)
LIMIT 10;
Here parallel query execution helped a little, halving the time to 11ms. And we had to read 1.5 times less: 20MB in total. Less is better here, because the more we read, the higher the chance of a cache miss, and every extra data page read from disk is a potential "brake" on the query.
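How the prefix query 'роза:*' behaves can be checked without the real table. A small sketch, assuming a UTF8 database; the inline VALUES list is illustrative only. The 'simple' configuration splits the name into lowercased tokens without stemming, and the :* suffix matches any token that starts with the given prefix.

```sql
-- Show the tokens produced for each name and whether the prefix query matches them.
SELECT
  name
, to_tsvector('simple', lower(name))                                   AS tokens
, to_tsvector('simple', lower(name)) @@ to_tsquery('simple', 'роза:*') AS prefix_match
FROM
  (VALUES
    ('Розалия')
  , ('Магазин Роза')
  , ('Гроза')
  , ('Дом Деда Мороза')
  ) AS t(name);
```

Because matching is done per token, names where 'роза' appears only in the middle of a word should not match.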
1.3: LIKE after all?
The previous query is good in every way, except that if you run it a hundred thousand times a day, it adds up to 2TB of data read. At best from memory, but if you are unlucky, from disk. So let's try to make it smaller.
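The estimate is plain arithmetic: about 20MB per call, multiplied by a hundred thousand calls. A quick way to check it from psql (the figures are the rounded ones quoted above, not new measurements):

```sql
-- 100,000 calls per day, each reading about 20MB; the bigint cast avoids integer overflow.
SELECT pg_size_pretty(100000::bigint * 20 * 1024 * 1024) AS read_per_day;
```

The result comes out at roughly 2TB, expressed in whichever unit pg_size_pretty chooses.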
Recall that the user wants to see first the names that "start with...". That is prefix search in its purest form, using text_pattern_ops! And only if we come up "short" of the 10 records we are looking for do we have to read the rest using FTS search:
CREATE INDEX ON firms(lower(name) text_pattern_ops);
SELECT
  *
FROM
  firms
WHERE
  lower(name) LIKE ('роза' || '%')
LIMIT 10;
Excellent performance: just 0.05ms and a little over 100KB read! Only we forgot to sort by name, so that the user does not get lost in the results:
SELECT
  *
FROM
  firms
WHERE
  lower(name) LIKE ('роза' || '%')
ORDER BY
  lower(name)
LIMIT 10;
Oh, something is no longer so pretty: the index seems to be there, but the sort flies right past it... Of course, this is still many times more efficient than the previous option, but...
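One way to see why the sort misses the index: an index built with text_pattern_ops is ordered by the ~<~ family of operators, which compare strings character by character and ignore locale rules, while a plain ORDER BY uses the default < operator, which follows the collation. The two orders can disagree, so one cannot stand in for the other. A small sketch with made-up sample pairs; the output depends on the database collation, so none is shown.

```sql
-- Compare the collation-aware "less than" with the pattern-ops "less than".
SELECT
  a
, b
, a <   b AS lt_by_collation
, a ~<~ b AS lt_by_pattern_ops
FROM
  (VALUES
    ('a', 'B')
  , ('роза', 'Роза')
  ) AS t(a, b);
```

In a database that uses the C collation the two columns are expected to coincide, which is why the pattern operator classes are only needed with other collations.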
1.4: "Finishing it off with a file"
But there is an index that both supports range search and works normally with sorting: a regular btree!
CREATE INDEX ON firms(lower(name));
Only the query for it has to be "assembled by hand":
SELECT
  *
FROM
  firms
WHERE
  lower(name) >= 'роза' AND
  lower(name) <= ('роза' || chr(65535)) -- for UTF8; for single-byte encodings use chr(255)
ORDER BY
  lower(name)
LIMIT 10;
Excellent: the sort works, and resource consumption stays "microscopic", thousands of times more efficient than "pure" FTS! All that remains is to combine everything into a single query:
(
  SELECT
    *
  FROM
    firms
  WHERE
    lower(name) >= 'роза' AND
    lower(name) <= ('роза' || chr(65535)) -- for UTF8; for single-byte encodings use chr(255)
  ORDER BY
    lower(name)
  LIMIT 10
)
UNION ALL
(
  SELECT
    *
  FROM
    firms
  WHERE
    to_tsvector('simple'::regconfig, lower(name)) @@ to_tsquery('simple', 'роза:*') AND
    lower(name) NOT LIKE ('роза' || '%') -- the names that "start with" were already found above
  ORDER BY
    lower(name) ~ ('^' || 'роза') DESC -- same sort as before, so as NOT to go through the btree index
  , lower(name)
  LIMIT 10
)
LIMIT 10;
Note that the second subquery is executed only if the first one returned fewer rows than the final LIMIT expects. I have written about this method of query optimization before.
So yes, the table now carries both a btree and a gin index, but statistically it turned out that fewer than 10% of requests ever reach the second block. In other words, by knowing such typical constraints of the task in advance, we were able to reduce the total consumption of server resources almost a thousandfold!
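The short-circuit behavior can be observed on synthetic data, without the firms table. A minimal sketch using generate_series as a stand-in for both branches: the first branch alone satisfies the outer LIMIT, so the plan is expected to mark the second branch as never executed.

```sql
-- The outer LIMIT stops pulling rows once the first branch has supplied 10 of them.
EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
(
  SELECT i FROM generate_series(1, 100) AS i LIMIT 10
)
UNION ALL
(
  SELECT i FROM generate_series(1, 1000000) AS i LIMIT 10
)
LIMIT 10;
```

Lowering the first branch to fewer than 10 rows should make the second branch show up as executed, which mirrors the minority of requests that fall through to the FTS block.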
1.5*: It can be done without the file
Above, it was the wrong sort order that kept us from using LIKE. But it can be "set on the right path" by specifying an operator with USING:
"ASC is assumed by default. Alternatively, a specific ordering operator name can be specified in the USING clause. An ordering operator must be a less-than or greater-than member of some B-tree operator family. ASC is usually equivalent to USING < and DESC is usually equivalent to USING >."
In our case, "less than" is ~<~:
SELECT
  *
FROM
  firms
WHERE
  lower(name) LIKE ('роза' || '%')
ORDER BY
  lower(name) USING ~<~
LIMIT 10;
2: How queries go sour
Now we leave our query to "stew" for six months to a year, and are surprised to find it "in the top" again, with a total daily memory "pumping" (buffers shared hit) of 5.5TB, that is, even more than it was originally.
Yes, of course, our business has grown and the load has increased, but not by that much! So something is off here; let's figure it out.
2.1: The birth of paging
At some point another development team wanted to make it possible to "jump" from the quick as-you-type search into a registry with the same, but expanded, results. And what is a registry without page navigation? Let's bolt it on!
( ... LIMIT <N> + 10)
UNION ALL
( ... LIMIT <N> + 10)
LIMIT 10 OFFSET <N>;
Now the developer could effortlessly show a registry of search results that loads page by page.
Of course, in reality each subsequent page reads more and more data (everything from the previous pages, which we throw away, plus the needed "tail"), so this is an unambiguous antipattern. It would be more correct to start the search at the next iteration from a key stored in the interface, but more on that another time.
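The alternative mentioned here, continuing from a key remembered by the interface, is often called keyset pagination. A minimal sketch, not the author's implementation: it assumes a composite index on (lower(name), id), covers only the prefix branch of the search, and treats 'розалия' and 12345 as placeholders for the last row the client displayed.

```sql
-- Hypothetical supporting index: id breaks ties between equal names.
CREATE INDEX IF NOT EXISTS firms_lower_name_id_idx ON firms(lower(name), id);

SELECT
  *
FROM
  firms
WHERE
  lower(name) >= 'роза' AND
  lower(name) <= ('роза' || chr(65535)) AND
  (lower(name), id) > ('розалия', 12345) -- last key shown on the previous page (placeholder values)
ORDER BY
  lower(name)
, id
LIMIT 10;
```

With this shape every page reads about the same small number of rows, instead of re-reading and discarding everything that precedes it.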
2.2: A taste for the exotic
At some point a developer wanted to enrich the resulting sample with data from another table, so the entire previous query was moved into a CTE:
WITH q AS (
...
LIMIT <N> + 10
)
SELECT
*
, (SELECT ...) sub_query -- some query against a related table
FROM
q
LIMIT 10 OFFSET <N>;
Even so, this is not bad, because the nested query is evaluated only for the 10 records returned. If only it had stopped there...
2.3: DISTINCT, senseless and merciless
Somewhere in the course of this evolution, the NOT LIKE condition got lost from the 2nd subquery. Naturally, after that UNION ALL began returning some entries twice: first found by the beginning of the string, and then again by the beginning of the first word of that same string. In the limit, all records of the 2nd subquery could coincide with records of the first.
What does a developer do instead of looking for the cause? No problem!
- double the size of the original samples
- apply DISTINCT to get only one instance of each row
WITH q AS (
( ... LIMIT <2 * N> + 10)
UNION ALL
( ... LIMIT <2 * N> + 10)
LIMIT <2 * N> + 10
)
SELECT DISTINCT
*
, (SELECT ...) sub_query
FROM
q
LIMIT 10 OFFSET <N>;
That is, the result is clearly exactly the same in the end, but the chance of "flying" into the 2nd CTE subquery has become much higher, and even without that, noticeably more data is read.
But that is not the saddest part. Since the developer asked for DISTINCT not on specific fields but on all fields of the record at once, the sub_query field, the result of the nested query, was automatically included as well. Now, in order to perform DISTINCT, the database has to execute not 10 nested queries, but all <2 * N> + 10 of them!
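For contrast, here is one possible shape of the query with the lost condition restored, so that no DISTINCT is needed and the nested query runs only for the rows actually returned. This is a sketch under assumptions: the contacts table and its firm_id column are hypothetical stand-ins for the unnamed related table, and only the first page is shown.

```sql
-- Hypothetical related table, created only to make the sketch complete.
CREATE TEMP TABLE contacts(
  id serial PRIMARY KEY
, firm_id integer
);

WITH q AS (
  (
    SELECT id, name FROM firms
    WHERE
      lower(name) >= 'роза' AND
      lower(name) <= ('роза' || chr(65535))
    ORDER BY lower(name)
    LIMIT 10
  )
  UNION ALL
  (
    SELECT id, name FROM firms
    WHERE
      to_tsvector('simple'::regconfig, lower(name)) @@ to_tsquery('simple', 'роза:*') AND
      lower(name) NOT LIKE ('роза' || '%') -- restored: keeps the two branches disjoint
    ORDER BY lower(name) ~ ('^' || 'роза') DESC, lower(name)
    LIMIT 10
  )
  LIMIT 10
)
SELECT
  q.*
, (SELECT count(*) FROM contacts c WHERE c.firm_id = q.id) AS contact_count -- evaluated once per returned row
FROM
  q;
```

Because the branches no longer overlap, duplicates cannot appear, and the scalar subquery is evaluated for at most 10 rows.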
2.4: Cooperation above all!
So the developers lived without a care, because users clearly did not have the patience to scroll the registry to significant values of N, given the chronic slowdown in receiving each subsequent "page".
Until developers from another department came along and wanted to use such a convenient method for iterative search: take a chunk of some sample, filter it by additional conditions, draw the result, then take the next chunk (which in our case is achieved by increasing N), and so on until the screen is filled.
All in all, in the specimen we caught, N reached values of almost 17K, and at least 4K such requests were executed "in a chain" per day. The last of them were boldly scanning 1GB of memory per iteration...
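A rough feel for the cost of such a chain can be had with a back-of-the-envelope query. The assumptions are loud ones: N grows in steps of 10 (the real step is not stated), and each request fetches at least N + 10 rows before discarding the first N.

```sql
-- One row per request in the chain; n is the OFFSET used by that request.
SELECT
  count(*)    AS requests_in_chain
, sum(n + 10) AS rows_fetched_in_total
FROM
  generate_series(0, 17000, 10) AS n;
```

Under these assumptions, reaching N = 17K takes 1701 requests that together fetch roughly 14.5 million rows, so the work grows quadratically with the depth of the scroll rather than linearly.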
Summing up
Source: habr.com