Getting the most out of indexes in PostgreSQL

In the Postgres world, indexes are essential for navigating a table's data storage (known as the "heap") efficiently. Postgres does not maintain any clustering of the heap, and its MVCC architecture means that several versions of the same tuple can exist at once. Being able to create and maintain effective indexes that support your applications is therefore an essential skill.

Here are some tips for optimizing and improving your use of indexes.

Note: the queries below are run against an unmodified pagila sample database.

Use Covering Indexes

Let's look at a query that fetches the email addresses of inactive customers. The customer table has an active column, and the query is straightforward:

pagila=# EXPLAIN SELECT email FROM customer WHERE active=0;
                        QUERY PLAN
-----------------------------------------------------------
 Seq Scan on customer  (cost=0.00..16.49 rows=15 width=32)
   Filter: (active = 0)
(2 rows)

The query triggers a full sequential scan of the customer table. Let's create an index on the active column:

pagila=# CREATE INDEX idx_cust1 ON customer(active);
CREATE INDEX
pagila=# EXPLAIN SELECT email FROM customer WHERE active=0;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Index Scan using idx_cust1 on customer  (cost=0.28..12.29 rows=15 width=32)
   Index Cond: (active = 0)
(2 rows)

That helped: the sequential scan has turned into an "Index Scan". This means Postgres scans the index "idx_cust1" and then goes on to look up the table heap to read the values of the other columns (here, the email column) that the query needs.

Covering indexes were introduced in PostgreSQL 11. They let you include one or more additional columns in the index itself; their values are stored in the index's data storage.

If we take advantage of this feature and include the email value in the index, Postgres no longer needs to look in the table heap for the value of email. Let's see whether this works:

pagila=# CREATE INDEX idx_cust2 ON customer(active) INCLUDE (email);
CREATE INDEX
pagila=# EXPLAIN SELECT email FROM customer WHERE active=0;
                                    QUERY PLAN
----------------------------------------------------------------------------------
 Index Only Scan using idx_cust2 on customer  (cost=0.28..12.29 rows=15 width=32)
   Index Cond: (active = 0)
(2 rows)

The "Index Only Scan" tells us that the query now needs only the index, which avoids the disk I/O of reading the table heap (for pages that the visibility map marks as all-visible).
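Whether an index-only scan really skips the heap depends on the visibility map, so it can be worth checking an actual run. The following is a minimal sketch, assuming the idx_cust2 index created above exists and that you are allowed to vacuum the table; the "Heap Fetches" line in the output reports how many rows still needed a heap visit.

```sql
-- Refresh the visibility map so that pages can be marked all-visible.
VACUUM customer;

-- ANALYZE actually executes the query. Look for "Heap Fetches: N" under
-- the Index Only Scan node; N = 0 means the heap was not read at all.
EXPLAIN (ANALYZE, BUFFERS)
SELECT email FROM customer WHERE active = 0;
```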

As of PostgreSQL 11, covering indexes are available only for B-Tree indexes (later releases extended INCLUDE to GiST). Also, a covering index is naturally more expensive to maintain than a regular one.
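One rough way to see part of that extra cost is to compare the on-disk size of the two indexes. This sketch assumes both idx_cust1 and idx_cust2 from the examples above exist; the sizes you see depend on your data and are not reproduced here.

```sql
-- Compare the size of the plain index with the covering index.
SELECT c.relname AS index_name,
       pg_size_pretty(pg_relation_size(c.oid)) AS index_size
FROM   pg_catalog.pg_class c
WHERE  c.relname IN ('idx_cust1', 'idx_cust2')
ORDER  BY c.relname;
```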

Use Partial Indexes

Partial indexes index only a subset of the rows in a table. This keeps the indexes smaller and makes them faster to scan.

Let's say we need the list of email addresses of our customers in California. The query would look like this:

SELECT c.email FROM customer c
JOIN address a ON c.address_id = a.address_id
WHERE a.district = 'California';

This has a query plan that scans both of the joined tables:

pagila=# EXPLAIN SELECT c.email FROM customer c
pagila-# JOIN address a ON c.address_id = a.address_id
pagila-# WHERE a.district = 'California';
                              QUERY PLAN
----------------------------------------------------------------------
 Hash Join  (cost=15.65..32.22 rows=9 width=32)
   Hash Cond: (c.address_id = a.address_id)
   ->  Seq Scan on customer c  (cost=0.00..14.99 rows=599 width=34)
   ->  Hash  (cost=15.54..15.54 rows=9 width=4)
         ->  Seq Scan on address a  (cost=0.00..15.54 rows=9 width=4)
               Filter: (district = 'California'::text)
(6 rows)

Here is what a regular index gives us:

pagila=# CREATE INDEX idx_address1 ON address(district);
CREATE INDEX
pagila=# EXPLAIN SELECT c.email FROM customer c
pagila-# JOIN address a ON c.address_id = a.address_id
pagila-# WHERE a.district = 'California';
                                      QUERY PLAN
---------------------------------------------------------------------------------------
 Hash Join  (cost=12.98..29.55 rows=9 width=32)
   Hash Cond: (c.address_id = a.address_id)
   ->  Seq Scan on customer c  (cost=0.00..14.99 rows=599 width=34)
   ->  Hash  (cost=12.87..12.87 rows=9 width=4)
         ->  Bitmap Heap Scan on address a  (cost=4.34..12.87 rows=9 width=4)
               Recheck Cond: (district = 'California'::text)
               ->  Bitmap Index Scan on idx_address1  (cost=0.00..4.34 rows=9 width=0)
                     Index Cond: (district = 'California'::text)
(8 rows)

The sequential scan of address has been replaced by a bitmap index scan over idx_address1, followed by a scan of the address heap.

Since this is a frequent query that needs to be optimized, we can use a partial index that indexes only the address rows whose district is 'California':

pagila=# CREATE INDEX idx_address2 ON address(address_id) WHERE district='California';
CREATE INDEX
pagila=# EXPLAIN SELECT c.email FROM customer c
pagila-# JOIN address a ON c.address_id = a.address_id
pagila-# WHERE a.district = 'California';
                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Hash Join  (cost=12.38..28.96 rows=9 width=32)
   Hash Cond: (c.address_id = a.address_id)
   ->  Seq Scan on customer c  (cost=0.00..14.99 rows=599 width=34)
   ->  Hash  (cost=12.27..12.27 rows=9 width=4)
         ->  Index Only Scan using idx_address2 on address a  (cost=0.14..12.27 rows=9 width=4)
(5 rows)

The query now reads only idx_address2 and does not touch the address table.
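Keep in mind that the planner considers a partial index only when it can prove that the query's WHERE clause implies the index predicate. The sketch below contrasts the two cases, assuming idx_address2 from above exists; 'Texas' is just an illustrative value, and the resulting plans are not reproduced here.

```sql
-- The predicate matches the index definition, so idx_address2 is a candidate.
EXPLAIN SELECT a.address_id FROM address a WHERE a.district = 'California';

-- A different district: idx_address2 cannot be used, because it only
-- contains rows with district = 'California'.
EXPLAIN SELECT a.address_id FROM address a WHERE a.district = 'Texas';
```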

Use Multi-Value Indexes

Some columns that need indexing may not have a scalar data type. Column types like jsonb, arrays and tsvector hold composite or multiple values. If you need to index such columns, you usually also need to search through the individual values inside them.

Let's try to find the titles of all films that include behind-the-scenes footage. The film table has a text array column called special_features. If a film has this "special feature", the column contains an element with the text Behind The Scenes. To find all such films, we need to select every row that has "Behind The Scenes" among any of the values of the special_features array:

SELECT title FROM film WHERE special_features @> '{"Behind The Scenes"}';

The containment operator @> checks whether the right-hand side is a subset of the left-hand side.

The query plan:

pagila=# EXPLAIN SELECT title FROM film
pagila-# WHERE special_features @> '{"Behind The Scenes"}';
                           QUERY PLAN
-----------------------------------------------------------------
 Seq Scan on film  (cost=0.00..67.50 rows=5 width=15)
   Filter: (special_features @> '{"Behind The Scenes"}'::text[])
(2 rows)

This calls for a full heap scan, at a cost of 67.50.

Let's see whether a regular B-Tree index helps:

pagila=# CREATE INDEX idx_film1 ON film(special_features);
CREATE INDEX
pagila=# EXPLAIN SELECT title FROM film
pagila-# WHERE special_features @> '{"Behind The Scenes"}';
                           QUERY PLAN
-----------------------------------------------------------------
 Seq Scan on film  (cost=0.00..67.50 rows=5 width=15)
   Filter: (special_features @> '{"Behind The Scenes"}'::text[])
(2 rows)

The index is still not considered. A B-Tree index knows nothing about the individual elements inside the indexed value.

What we need is a GIN index.

pagila=# CREATE INDEX idx_film2 ON film USING GIN(special_features);
CREATE INDEX
pagila=# EXPLAIN SELECT title FROM film
pagila-# WHERE special_features @> '{"Behind The Scenes"}';
                                QUERY PLAN
---------------------------------------------------------------------------
 Bitmap Heap Scan on film  (cost=8.04..23.58 rows=5 width=15)
   Recheck Cond: (special_features @> '{"Behind The Scenes"}'::text[])
   ->  Bitmap Index Scan on idx_film2  (cost=0.00..8.04 rows=5 width=0)
         Index Cond: (special_features @> '{"Behind The Scenes"}'::text[])
(4 rows)

A GIN index supports matching an individual value against the indexed composite values, which brings the estimated cost of the query plan down to less than half of the original (23.58 versus 67.50).
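The default GIN operator class for arrays is not limited to @>; it also supports the overlap (&&) and contained-by (<@) operators. Here is a short sketch of both, assuming idx_film2 from above exists; the feature names used ("Trailers", "Deleted Scenes", "Commentaries") are assumed to occur in the pagila data.

```sql
-- Films that have at least one of the listed features (overlap).
SELECT title FROM film
WHERE  special_features && '{"Trailers","Deleted Scenes"}';

-- Films whose features are all drawn from the listed set (contained by).
SELECT title FROM film
WHERE  special_features <@ '{"Trailers","Commentaries"}';
```

Prefixing either query with EXPLAIN shows whether the planner chooses the GIN index for it.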

Get Rid of Duplicate Indexes

Indexes accumulate over time, and sometimes a new index ends up with the same definition as an existing one. You can use the catalog view pg_indexes to get human-readable SQL definitions of indexes. It also makes identical definitions easy to find:

 SELECT array_agg(indexname) AS indexes, replace(indexdef, indexname, '') AS defn
    FROM pg_indexes
GROUP BY defn
  HAVING count(*) > 1;

And here's the result when run on the stock pagila database:

pagila=#   SELECT array_agg(indexname) AS indexes, replace(indexdef, indexname, '') AS defn
pagila-#     FROM pg_indexes
pagila-# GROUP BY defn
pagila-#   HAVING count(*) > 1;
                                indexes                                 |                                defn
------------------------------------------------------------------------+------------------------------------------------------------------
 {payment_p2017_01_customer_id_idx,idx_fk_payment_p2017_01_customer_id} | CREATE INDEX  ON public.payment_p2017_01 USING btree (customer_id
 {payment_p2017_02_customer_id_idx,idx_fk_payment_p2017_02_customer_id} | CREATE INDEX  ON public.payment_p2017_02 USING btree (customer_id
 {payment_p2017_03_customer_id_idx,idx_fk_payment_p2017_03_customer_id} | CREATE INDEX  ON public.payment_p2017_03 USING btree (customer_id
 {idx_fk_payment_p2017_04_customer_id,payment_p2017_04_customer_id_idx} | CREATE INDEX  ON public.payment_p2017_04 USING btree (customer_id
 {payment_p2017_05_customer_id_idx,idx_fk_payment_p2017_05_customer_id} | CREATE INDEX  ON public.payment_p2017_05 USING btree (customer_id
 {idx_fk_payment_p2017_06_customer_id,payment_p2017_06_customer_id_idx} | CREATE INDEX  ON public.payment_p2017_06 USING btree (customer_id
(6 rows)

Superset Indexes

You can end up with multiple indexes where one of them indexes a superset of the columns indexed by another. This may or may not be desirable: the superset index can enable index-only scans, which is good, but it may take up too much space, or the query it was meant to optimize may no longer be in use.

If you need to automate the detection of such indexes, the pg_index table in the pg_catalog schema is a good place to start.
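As a starting point, the query below sketches one possible way to list candidate pairs from pg_index. It is an assumption-laden simplification, not a definitive check: it compares only the sets of column numbers in indkey (which also contains INCLUDE columns), ignores column order, operator classes, expressions and partial-index predicates, and limits itself to user tables via pg_stat_user_tables. Treat every row it returns as something to review by hand.

```sql
WITH idx AS (
    SELECT indrelid,
           indexrelid,
           -- indkey is an int2vector; convert it to a regular array
           string_to_array(indkey::text, ' ')::int2[] AS cols
    FROM   pg_catalog.pg_index
    WHERE  indrelid IN (SELECT relid FROM pg_catalog.pg_stat_user_tables)
)
SELECT a.indrelid::regclass   AS table_name,
       a.indexrelid::regclass AS narrower_index,
       b.indexrelid::regclass AS wider_index
FROM   idx a
JOIN   idx b
  ON   a.indrelid = b.indrelid
 AND   a.indexrelid <> b.indexrelid
WHERE  b.cols @> a.cols                              -- b covers all of a's columns
  AND  cardinality(a.cols) < cardinality(b.cols);    -- and has more of them
```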

Unused Indexes

As the applications that use a database evolve, so do the queries they run. Indexes that were added earlier may no longer be used by any query. Every time an index is scanned, the statistics collector records it, and the system catalog view pg_stat_user_indexes exposes the result as idx_scan, a cumulative counter. Tracking this value over a period of time (say, a month) gives a good idea of which indexes are unused and could be dropped.

Here is a query that returns the current scan counts of all indexes in the 'public' schema:

SELECT relname, indexrelname, idx_scan
FROM   pg_catalog.pg_stat_user_indexes
WHERE  schemaname = 'public';
with output like this:
pagila=# SELECT relname, indexrelname, idx_scan
pagila-# FROM   pg_catalog.pg_stat_user_indexes
pagila-# WHERE  schemaname = 'public'
pagila-# LIMIT  10;
    relname    |    indexrelname    | idx_scan
---------------+--------------------+----------
 customer      | customer_pkey      |    32093
 actor         | actor_pkey         |     5462
 address       | address_pkey       |      660
 category      | category_pkey      |     1000
 city          | city_pkey          |      609
 country       | country_pkey       |      604
 film_actor    | film_actor_pkey    |        0
 film_category | film_category_pkey |        0
 film          | film_pkey          |    11043
 inventory     | inventory_pkey     |    16048
(10 rows)
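To narrow this down to drop candidates, one option is to list only the indexes that have never been scanned, largest first. This is a sketch under a few assumptions: unique indexes are excluded because they enforce constraints even when never scanned, and a zero counter only means "not scanned since the statistics were last reset", so it should be read together with the observation period mentioned above.

```sql
SELECT s.relname,
       s.indexrelname,
       pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size
FROM   pg_catalog.pg_stat_user_indexes s
JOIN   pg_catalog.pg_index i ON i.indexrelid = s.indexrelid
WHERE  s.schemaname = 'public'
  AND  s.idx_scan = 0          -- never scanned since the last stats reset
  AND  NOT i.indisunique       -- keep indexes that back unique constraints
ORDER  BY pg_relation_size(s.indexrelid) DESC;
```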

Rebuild Indexes with Less Locking

Indexes often need to be rebuilt, for example when they become bloated; a rebuild can speed up scans. Indexes can also become corrupted. Changing an index's parameters may also require rebuilding it.

Enable parallel index creation

Since PostgreSQL 11, B-Tree index builds can run in parallel: several parallel workers can be used to speed up index creation. However, make sure that these configuration options are set appropriately:

SET max_parallel_workers = 32;
SET max_parallel_maintenance_workers = 16;

The default values (8 and 2, respectively) are too low for large index builds. Ideally, these numbers should grow with the number of CPU cores. See the documentation for details.
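A session-level sketch of how these settings might be checked and applied before a build follows. The values and the index name idx_rental_return_date are illustrative assumptions, not recommendations; on a table as small as those in pagila the planner may well decide not to use any parallel workers.

```sql
-- Inspect the current limits.
SHOW max_parallel_workers;
SHOW max_parallel_maintenance_workers;

-- Raise the per-build worker limit and the memory budget for this session only.
SET max_parallel_maintenance_workers = 4;
SET maintenance_work_mem = '512MB';

-- Hypothetical index used only to illustrate the build.
CREATE INDEX idx_rental_return_date ON rental(return_date);
```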

Background index creation

You can build an index in the background by using the CONCURRENTLY option of the CREATE INDEX command:

pagila=# CREATE INDEX CONCURRENTLY idx_address1 ON address(district);
CREATE INDEX

This way of building an index differs from the regular one in that it does not take a lock on the table that blocks writes. On the other hand, it takes longer and consumes more resources.
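Combining this with the rebuild scenario, one common pattern is to build a replacement index concurrently and then swap it in. The sketch below assumes the idx_address1 index from earlier; idx_address1_new is a hypothetical name. Note that the CONCURRENTLY variants cannot run inside a transaction block, and that PostgreSQL 12 and later also offer REINDEX with a CONCURRENTLY option.

```sql
-- 1. Build a replacement next to the existing index; writes are not blocked.
CREATE INDEX CONCURRENTLY idx_address1_new ON address(district);

-- 2. A failed concurrent build leaves an invalid index behind, so check
--    that nothing is listed here before continuing.
SELECT indexrelid::regclass AS invalid_index
FROM   pg_catalog.pg_index
WHERE  NOT indisvalid;

-- 3. Swap: drop the old index, then give the new one its name.
DROP INDEX CONCURRENTLY idx_address1;
ALTER INDEX idx_address1_new RENAME TO idx_address1;
```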

Postgres offers a lot of flexibility in creating indexes and in solving particular cases, as well as ways to manage the database when your application grows explosively. We hope these tips help you make your queries fast and get your database ready to scale.

Source: www.hab.com
