Kutumia huduma zote za faharisi katika PostgreSQL

Kutumia huduma zote za faharisi katika PostgreSQL
Katika ulimwengu wa Postgres, faharasa ni muhimu kwa usogezaji bora wa hifadhi ya hifadhidata (inayoitwa "lundo"). Postgres haitumii nguzo kwa ajili yake, na usanifu wa MVCC hukufanya uishie na matoleo mengi ya nakala sawa. Kwa hiyo, ni muhimu sana kuwa na uwezo wa kuunda na kudumisha fahirisi za ufanisi ili kusaidia programu.

Hapa kuna vidokezo vya kuboresha na kuboresha matumizi ya faharisi.

Kumbuka: hoja zilizoonyeshwa hapa chini zinafanya kazi kwenye ambayo haijabadilishwa pagila sampuli ya hifadhidata.

Kwa kutumia Fahirisi za Kufunika

Hebu tuangalie ombi la kutoa anwani za barua pepe kwa watumiaji wasiofanya kazi. Jedwali customer kuna safu active, na swali ni rahisi:

pagila=# EXPLAIN SELECT email FROM customer WHERE active=0;
                        QUERY PLAN
-----------------------------------------------------------
 Seq Scan on customer  (cost=0.00..16.49 rows=15 width=32)
   Filter: (active = 0)
(2 rows)

Hoja inaomba mlolongo kamili wa uchanganuzi wa jedwali customer. Hebu tuunde fahirisi kwenye safu active:

pagila=# CREATE INDEX idx_cust1 ON customer(active);
CREATE INDEX
pagila=# EXPLAIN SELECT email FROM customer WHERE active=0;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Index Scan using idx_cust1 on customer  (cost=0.28..12.29 rows=15 width=32)
   Index Cond: (active = 0)
(2 rows)

Ilisaidia, skanisho iliyofuata ikageuka kuwa "index scan". Hii inamaanisha kuwa Postgres itachanganua faharisi "idx_cust1", na kisha endelea kutafuta lundo la jedwali ili kusoma maadili ya safu zingine (katika kesi hii, safu wima email) ambayo swali linahitaji.

Faharisi za kufunika huletwa katika PostgreSQL 11. Wanakuruhusu kujumuisha safu wima moja au zaidi kwenye faharisi yenyewe - maadili yao yamehifadhiwa kwenye duka la data la faharisi.

Ikiwa tungechukua fursa ya kipengele hiki na kuongeza thamani ya barua pepe ndani ya faharasa, basi Postgres haitahitaji kutafuta lundo la jedwali kwa thamani hiyo. email. Wacha tuone ikiwa hii itafanya kazi:

pagila=# CREATE INDEX idx_cust2 ON customer(active) INCLUDE (email);
CREATE INDEX
pagila=# EXPLAIN SELECT email FROM customer WHERE active=0;
                                    QUERY PLAN
----------------------------------------------------------------------------------
 Index Only Scan using idx_cust2 on customer  (cost=0.28..12.29 rows=15 width=32)
   Index Cond: (active = 0)
(2 rows)

Β«Index Only Scan' inatuambia kuwa swala sasa linahitaji tu faharasa, ambayo husaidia kuzuia diski zote za I/O kusoma lundo la jedwali.

Faharasa za kufunika zinapatikana kwa miti B pekee kwa sasa. Hata hivyo, katika kesi hii, jitihada za matengenezo zitakuwa za juu.

Kwa kutumia Fahirisi za Sehemu

Faharasa kiasi zinaonyesha kikundi kidogo tu cha safu mlalo katika jedwali. Hii huhifadhi saizi ya faharasa na kufanya uchanganuzi haraka.

Hebu tuseme tunataka kupata orodha ya anwani za barua pepe za wateja wetu huko California. Ombi litakuwa kama hii:

SELECT c.email FROM customer c
JOIN address a ON c.address_id = a.address_id
WHERE a.district = 'California';
which has a query plan that involves scanning both the tables that are joined:
pagila=# EXPLAIN SELECT c.email FROM customer c
pagila-# JOIN address a ON c.address_id = a.address_id
pagila-# WHERE a.district = 'California';
                              QUERY PLAN
----------------------------------------------------------------------
 Hash Join  (cost=15.65..32.22 rows=9 width=32)
   Hash Cond: (c.address_id = a.address_id)
   ->  Seq Scan on customer c  (cost=0.00..14.99 rows=599 width=34)
   ->  Hash  (cost=15.54..15.54 rows=9 width=4)
         ->  Seq Scan on address a  (cost=0.00..15.54 rows=9 width=4)
               Filter: (district = 'California'::text)
(6 rows)

Ni faharisi gani za kawaida zitatupa:

pagila=# CREATE INDEX idx_address1 ON address(district);
CREATE INDEX
pagila=# EXPLAIN SELECT c.email FROM customer c
pagila-# JOIN address a ON c.address_id = a.address_id
pagila-# WHERE a.district = 'California';
                                      QUERY PLAN
---------------------------------------------------------------------------------------
 Hash Join  (cost=12.98..29.55 rows=9 width=32)
   Hash Cond: (c.address_id = a.address_id)
   ->  Seq Scan on customer c  (cost=0.00..14.99 rows=599 width=34)
   ->  Hash  (cost=12.87..12.87 rows=9 width=4)
         ->  Bitmap Heap Scan on address a  (cost=4.34..12.87 rows=9 width=4)
               Recheck Cond: (district = 'California'::text)
               ->  Bitmap Index Scan on idx_address1  (cost=0.00..4.34 rows=9 width=0)
                     Index Cond: (district = 'California'::text)
(8 rows)

Scan address imebadilishwa na tambazo la faharasa idx_address1na kisha kukagua lundo address.

Kwa kuwa hili ni swali la mara kwa mara na linahitaji kuboreshwa, tunaweza kutumia faharasa ya sehemu, ambayo huashiria safu zile tu zenye anwani ambazo wilaya β€˜California’:

pagila=# CREATE INDEX idx_address2 ON address(address_id) WHERE district='California';
CREATE INDEX
pagila=# EXPLAIN SELECT c.email FROM customer c
pagila-# JOIN address a ON c.address_id = a.address_id
pagila-# WHERE a.district = 'California';
                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Hash Join  (cost=12.38..28.96 rows=9 width=32)
   Hash Cond: (c.address_id = a.address_id)
   ->  Seq Scan on customer c  (cost=0.00..14.99 rows=599 width=34)
   ->  Hash  (cost=12.27..12.27 rows=9 width=4)
         ->  Index Only Scan using idx_address2 on address a  (cost=0.14..12.27 rows=9 width=4)
(5 rows)

Sasa swala linasoma tu idx_address2 na haigusi meza address.

Kwa kutumia Fahirisi za Thamani nyingi

Baadhi ya safu wima zitakazoorodheshwa huenda zisiwe na aina ya data ya kadiri. Aina za safu kama jsonb, arrays ΠΈ tsvector vyenye thamani nyingi au nyingi. Ikiwa unahitaji kuorodhesha safu wima kama hizo, kawaida lazima utafute maadili yote ya kibinafsi kwenye safu wima hizo.

Hebu tujaribu kutafuta vichwa vya filamu zote zilizo na kupunguzwa kutoka kwa uchezaji ambao haukufanikiwa. Jedwali film kuna safu ya maandishi inayoitwa special_features. Ikiwa filamu ina "sifa maalum", basi safu wima ina kipengele kama safu ya maandishi Behind The Scenes. Ili kutafuta filamu kama hizo, tunahitaji kuchagua safu mlalo zote na "Nyuma ya Pazia" wakati yoyote maadili ya safu special_features:

SELECT title FROM film WHERE special_features @> '{"Behind The Scenes"}';

Opereta wa nesting @> huangalia ikiwa upande wa kulia ni sehemu ndogo ya upande wa kushoto.

Omba mpango:

pagila=# EXPLAIN SELECT title FROM film
pagila-# WHERE special_features @> '{"Behind The Scenes"}';
                           QUERY PLAN
-----------------------------------------------------------------
 Seq Scan on film  (cost=0.00..67.50 rows=5 width=15)
   Filter: (special_features @> '{"Behind The Scenes"}'::text[])
(2 rows)

Ambayo inaomba uchunguzi kamili wa lundo kwa gharama ya 67.

Wacha tuone ikiwa faharasa ya kawaida ya miti B inatusaidia:

pagila=# CREATE INDEX idx_film1 ON film(special_features);
CREATE INDEX
pagila=# EXPLAIN SELECT title FROM film
pagila-# WHERE special_features @> '{"Behind The Scenes"}';
                           QUERY PLAN
-----------------------------------------------------------------
 Seq Scan on film  (cost=0.00..67.50 rows=5 width=15)
   Filter: (special_features @> '{"Behind The Scenes"}'::text[])
(2 rows)

Index haikuzingatiwa hata. Faharasa ya miti B haifahamu kuwepo kwa vipengele mahususi katika thamani zilizoorodheshwa.

Tunahitaji index ya GIN.

pagila=# CREATE INDEX idx_film2 ON film USING GIN(special_features);
CREATE INDEX
pagila=# EXPLAIN SELECT title FROM film
pagila-# WHERE special_features @> '{"Behind The Scenes"}';
                                QUERY PLAN
---------------------------------------------------------------------------
 Bitmap Heap Scan on film  (cost=8.04..23.58 rows=5 width=15)
   Recheck Cond: (special_features @> '{"Behind The Scenes"}'::text[])
   ->  Bitmap Index Scan on idx_film2  (cost=0.00..8.04 rows=5 width=0)
         Index Cond: (special_features @> '{"Behind The Scenes"}'::text[])
(4 rows)

Faharasa ya GIN inasaidia kuchora thamani moja dhidi ya thamani za mchanganyiko zilizoorodheshwa, na kusababisha gharama ya mpango wa hoja ambayo ni zaidi ya nusu.

Kuondoa faharasa rudufu

Fahirisi hujilimbikiza kwa muda, na wakati mwingine faharasa mpya inaweza kuwa na ufafanuzi sawa na mojawapo ya zile zilizopita. Unaweza kutumia mwonekano wa katalogi kupata faharasa za SQL zinazoweza kusomeka na binadamu. pg_indexes. Unaweza pia kupata ufafanuzi sawa kwa urahisi:

 SELECT array_agg(indexname) AS indexes, replace(indexdef, indexname, '') AS defn
    FROM pg_indexes
GROUP BY defn
  HAVING count(*) > 1;
And here’s the result when run on the stock pagila database:
pagila=#   SELECT array_agg(indexname) AS indexes, replace(indexdef, indexname, '') AS defn
pagila-#     FROM pg_indexes
pagila-# GROUP BY defn
pagila-#   HAVING count(*) > 1;
                                indexes                                 |                                defn
------------------------------------------------------------------------+------------------------------------------------------------------
 {payment_p2017_01_customer_id_idx,idx_fk_payment_p2017_01_customer_id} | CREATE INDEX  ON public.payment_p2017_01 USING btree (customer_id
 {payment_p2017_02_customer_id_idx,idx_fk_payment_p2017_02_customer_id} | CREATE INDEX  ON public.payment_p2017_02 USING btree (customer_id
 {payment_p2017_03_customer_id_idx,idx_fk_payment_p2017_03_customer_id} | CREATE INDEX  ON public.payment_p2017_03 USING btree (customer_id
 {idx_fk_payment_p2017_04_customer_id,payment_p2017_04_customer_id_idx} | CREATE INDEX  ON public.payment_p2017_04 USING btree (customer_id
 {payment_p2017_05_customer_id_idx,idx_fk_payment_p2017_05_customer_id} | CREATE INDEX  ON public.payment_p2017_05 USING btree (customer_id
 {idx_fk_payment_p2017_06_customer_id,payment_p2017_06_customer_id_idx} | CREATE INDEX  ON public.payment_p2017_06 USING btree (customer_id
(6 rows)

Superset Indexes

Inaweza kutokea kwamba utaishia na faharasa nyingi, moja ambayo inaashiria safu kubwa ya safu ambazo huashiria faharisi zingine. Hii inaweza au isipendekeβ€”seti kuu inaweza kusababisha uchanganuzi wa faharasa pekee, ambayo ni nzuri, lakini inaweza kuchukua nafasi nyingi sana, au hoja ambayo kifaa kikuu kilikusudiwa kuboresha haitumiki tena.

Ikiwa unahitaji kubinafsisha ufafanuzi wa faharisi kama hizo, unaweza kuanza na pg_index kutoka kwa meza pg_catalog.

Fahirisi zisizotumika

Kadiri programu zinazotumia hifadhidata zinavyobadilika, ndivyo na maswali wanayotumia. Faharasa zilizoongezwa mapema haziwezi kutumiwa tena na swali lolote. Kila wakati faharasa inapochanganuliwa, inawekwa alama na msimamizi wa takwimu, na katika mwonekano wa katalogi ya mfumo pg_stat_user_indexes unaweza kuona thamani idx_scan, ambayo ni kaunta limbikizo. Kufuatilia thamani hii kwa muda fulani (sema kwa mwezi) kutatoa wazo nzuri la ni faharasa zipi hazitumiki na zinaweza kuondolewa.

Hapa kuna swali la kupata hesabu za sasa za skanisho za faharisi zote kwenye schema β€˜public’:

SELECT relname, indexrelname, idx_scan
FROM   pg_catalog.pg_stat_user_indexes
WHERE  schemaname = 'public';
with output like this:
pagila=# SELECT relname, indexrelname, idx_scan
pagila-# FROM   pg_catalog.pg_stat_user_indexes
pagila-# WHERE  schemaname = 'public'
pagila-# LIMIT  10;
    relname    |    indexrelname    | idx_scan
---------------+--------------------+----------
 customer      | customer_pkey      |    32093
 actor         | actor_pkey         |     5462
 address       | address_pkey       |      660
 category      | category_pkey      |     1000
 city          | city_pkey          |      609
 country       | country_pkey       |      604
 film_actor    | film_actor_pkey    |        0
 film_category | film_category_pkey |        0
 film          | film_pkey          |    11043
 inventory     | inventory_pkey     |    16048
(10 rows)

Kuunda upya faharasa na kufuli chache

Fahirisi mara nyingi zinahitaji kujengwa upya, kwa mfano zinapovimba, na kujenga upya kunaweza kuharakisha utambazaji. Pia faharisi zinaweza kuharibika. Kubadilisha vigezo vya faharisi kunaweza pia kuhitaji kuijenga upya.

Washa uundaji wa faharasa sambamba

Katika PostgreSQL 11, kuunda faharisi ya B-Tree ni sawa. Ili kuharakisha mchakato wa uumbaji, wafanyakazi kadhaa wa sambamba wanaweza kutumika. Walakini, hakikisha chaguzi hizi za usanidi zimewekwa kwa usahihi:

SET max_parallel_workers = 32;
SET max_parallel_maintenance_workers = 16;

Thamani chaguo-msingi ni ndogo sana. Kwa kweli, nambari hizi zinapaswa kuongezeka pamoja na idadi ya cores za processor. Soma zaidi katika nyaraka.

Uundaji wa faharasa ya usuli

Unaweza kuunda faharisi chinichini kwa kutumia chaguo CONCURRENTLY timu CREATE INDEX:

pagila=# CREATE INDEX CONCURRENTLY idx_address1 ON address(district);
CREATE INDEX

Utaratibu huu wa uundaji wa index hutofautiana na ule wa kawaida kwa kuwa hauhitaji kufuli kwenye meza, na kwa hivyo hauzuii shughuli za uandishi. Kwa upande mwingine, inachukua muda zaidi na hutumia rasilimali zaidi.

Postgres hutoa unyumbulifu mwingi wa kuunda faharasa na njia za kushughulikia kesi zozote maalum, na pia njia za kudhibiti hifadhidata ikiwa programu yako itakua kwa kasi. Tunatumahi vidokezo hivi vitakusaidia kupata maswali yako haraka na hifadhidata yako tayari kuongezwa.

Chanzo: mapenzi.com

Kuongeza maoni