Katika ulimwengu wa Postgres, faharasa ni muhimu kwa usogezaji bora wa hifadhi ya hifadhidata (inayoitwa "lundo"). Postgres haitumii nguzo kwa ajili yake, na usanifu wa MVCC hukufanya uishie na matoleo mengi ya nakala sawa. Kwa hiyo, ni muhimu sana kuwa na uwezo wa kuunda na kudumisha fahirisi za ufanisi ili kusaidia programu.
Hapa kuna vidokezo vya kuboresha na kuboresha matumizi ya faharisi.
Kumbuka: hoja zilizoonyeshwa hapa chini zinafanya kazi kwenye ambayo haijabadilishwa
Kwa kutumia Fahirisi za Kufunika
Hebu tuangalie ombi la kutoa anwani za barua pepe kwa watumiaji wasiofanya kazi. Jedwali customer
kuna safu active
, na swali ni rahisi:
pagila=# EXPLAIN SELECT email FROM customer WHERE active=0;
QUERY PLAN
-----------------------------------------------------------
Seq Scan on customer (cost=0.00..16.49 rows=15 width=32)
Filter: (active = 0)
(2 rows)
Hoja inaomba mlolongo kamili wa uchanganuzi wa jedwali customer
. Hebu tuunde fahirisi kwenye safu active
:
pagila=# CREATE INDEX idx_cust1 ON customer(active);
CREATE INDEX
pagila=# EXPLAIN SELECT email FROM customer WHERE active=0;
QUERY PLAN
-----------------------------------------------------------------------------
Index Scan using idx_cust1 on customer (cost=0.28..12.29 rows=15 width=32)
Index Cond: (active = 0)
(2 rows)
Ilisaidia, skanisho iliyofuata ikageuka kuwa "index scan
". Hii inamaanisha kuwa Postgres itachanganua faharisi "idx_cust1
", na kisha endelea kutafuta lundo la jedwali ili kusoma maadili ya safu zingine (katika kesi hii, safu wima email
) ambayo swali linahitaji.
Faharisi za kufunika huletwa katika PostgreSQL 11. Wanakuruhusu kujumuisha safu wima moja au zaidi kwenye faharisi yenyewe - maadili yao yamehifadhiwa kwenye duka la data la faharisi.
Ikiwa tungechukua fursa ya kipengele hiki na kuongeza thamani ya barua pepe ndani ya faharasa, basi Postgres haitahitaji kutafuta lundo la jedwali kwa thamani hiyo. email
. Wacha tuone ikiwa hii itafanya kazi:
pagila=# CREATE INDEX idx_cust2 ON customer(active) INCLUDE (email);
CREATE INDEX
pagila=# EXPLAIN SELECT email FROM customer WHERE active=0;
QUERY PLAN
----------------------------------------------------------------------------------
Index Only Scan using idx_cust2 on customer (cost=0.28..12.29 rows=15 width=32)
Index Cond: (active = 0)
(2 rows)
Β«Index Only Scan
' inatuambia kuwa swala sasa linahitaji tu faharasa, ambayo husaidia kuzuia diski zote za I/O kusoma lundo la jedwali.
Faharasa za kufunika zinapatikana kwa miti B pekee kwa sasa. Hata hivyo, katika kesi hii, jitihada za matengenezo zitakuwa za juu.
Kwa kutumia Fahirisi za Sehemu
Faharasa kiasi zinaonyesha kikundi kidogo tu cha safu mlalo katika jedwali. Hii huhifadhi saizi ya faharasa na kufanya uchanganuzi haraka.
Hebu tuseme tunataka kupata orodha ya anwani za barua pepe za wateja wetu huko California. Ombi litakuwa kama hii:
SELECT c.email FROM customer c
JOIN address a ON c.address_id = a.address_id
WHERE a.district = 'California';
which has a query plan that involves scanning both the tables that are joined:
pagila=# EXPLAIN SELECT c.email FROM customer c
pagila-# JOIN address a ON c.address_id = a.address_id
pagila-# WHERE a.district = 'California';
QUERY PLAN
----------------------------------------------------------------------
Hash Join (cost=15.65..32.22 rows=9 width=32)
Hash Cond: (c.address_id = a.address_id)
-> Seq Scan on customer c (cost=0.00..14.99 rows=599 width=34)
-> Hash (cost=15.54..15.54 rows=9 width=4)
-> Seq Scan on address a (cost=0.00..15.54 rows=9 width=4)
Filter: (district = 'California'::text)
(6 rows)
Ni faharisi gani za kawaida zitatupa:
pagila=# CREATE INDEX idx_address1 ON address(district);
CREATE INDEX
pagila=# EXPLAIN SELECT c.email FROM customer c
pagila-# JOIN address a ON c.address_id = a.address_id
pagila-# WHERE a.district = 'California';
QUERY PLAN
---------------------------------------------------------------------------------------
Hash Join (cost=12.98..29.55 rows=9 width=32)
Hash Cond: (c.address_id = a.address_id)
-> Seq Scan on customer c (cost=0.00..14.99 rows=599 width=34)
-> Hash (cost=12.87..12.87 rows=9 width=4)
-> Bitmap Heap Scan on address a (cost=4.34..12.87 rows=9 width=4)
Recheck Cond: (district = 'California'::text)
-> Bitmap Index Scan on idx_address1 (cost=0.00..4.34 rows=9 width=0)
Index Cond: (district = 'California'::text)
(8 rows)
Scan address
imebadilishwa na tambazo la faharasa idx_address1
na kisha kukagua lundo address
.
Kwa kuwa hili ni swali la mara kwa mara na linahitaji kuboreshwa, tunaweza kutumia faharasa ya sehemu, ambayo huashiria safu zile tu zenye anwani ambazo wilaya βCaliforniaβ
:
pagila=# CREATE INDEX idx_address2 ON address(address_id) WHERE district='California';
CREATE INDEX
pagila=# EXPLAIN SELECT c.email FROM customer c
pagila-# JOIN address a ON c.address_id = a.address_id
pagila-# WHERE a.district = 'California';
QUERY PLAN
------------------------------------------------------------------------------------------------
Hash Join (cost=12.38..28.96 rows=9 width=32)
Hash Cond: (c.address_id = a.address_id)
-> Seq Scan on customer c (cost=0.00..14.99 rows=599 width=34)
-> Hash (cost=12.27..12.27 rows=9 width=4)
-> Index Only Scan using idx_address2 on address a (cost=0.14..12.27 rows=9 width=4)
(5 rows)
Sasa swala linasoma tu idx_address2
na haigusi meza address
.
Kwa kutumia Fahirisi za Thamani nyingi
Baadhi ya safu wima zitakazoorodheshwa huenda zisiwe na aina ya data ya kadiri. Aina za safu kama jsonb
, arrays
ΠΈ tsvector
vyenye thamani nyingi au nyingi. Ikiwa unahitaji kuorodhesha safu wima kama hizo, kawaida lazima utafute maadili yote ya kibinafsi kwenye safu wima hizo.
Hebu tujaribu kutafuta vichwa vya filamu zote zilizo na kupunguzwa kutoka kwa uchezaji ambao haukufanikiwa. Jedwali film
kuna safu ya maandishi inayoitwa special_features
. Ikiwa filamu ina "sifa maalum", basi safu wima ina kipengele kama safu ya maandishi Behind The Scenes
. Ili kutafuta filamu kama hizo, tunahitaji kuchagua safu mlalo zote na "Nyuma ya Pazia" wakati yoyote maadili ya safu special_features
:
SELECT title FROM film WHERE special_features @> '{"Behind The Scenes"}';
Opereta wa nesting @>
huangalia ikiwa upande wa kulia ni sehemu ndogo ya upande wa kushoto.
Omba mpango:
pagila=# EXPLAIN SELECT title FROM film
pagila-# WHERE special_features @> '{"Behind The Scenes"}';
QUERY PLAN
-----------------------------------------------------------------
Seq Scan on film (cost=0.00..67.50 rows=5 width=15)
Filter: (special_features @> '{"Behind The Scenes"}'::text[])
(2 rows)
Ambayo inaomba uchunguzi kamili wa lundo kwa gharama ya 67.
Wacha tuone ikiwa faharasa ya kawaida ya miti B inatusaidia:
pagila=# CREATE INDEX idx_film1 ON film(special_features);
CREATE INDEX
pagila=# EXPLAIN SELECT title FROM film
pagila-# WHERE special_features @> '{"Behind The Scenes"}';
QUERY PLAN
-----------------------------------------------------------------
Seq Scan on film (cost=0.00..67.50 rows=5 width=15)
Filter: (special_features @> '{"Behind The Scenes"}'::text[])
(2 rows)
Index haikuzingatiwa hata. Faharasa ya miti B haifahamu kuwepo kwa vipengele mahususi katika thamani zilizoorodheshwa.
Tunahitaji index ya GIN.
pagila=# CREATE INDEX idx_film2 ON film USING GIN(special_features);
CREATE INDEX
pagila=# EXPLAIN SELECT title FROM film
pagila-# WHERE special_features @> '{"Behind The Scenes"}';
QUERY PLAN
---------------------------------------------------------------------------
Bitmap Heap Scan on film (cost=8.04..23.58 rows=5 width=15)
Recheck Cond: (special_features @> '{"Behind The Scenes"}'::text[])
-> Bitmap Index Scan on idx_film2 (cost=0.00..8.04 rows=5 width=0)
Index Cond: (special_features @> '{"Behind The Scenes"}'::text[])
(4 rows)
Faharasa ya GIN inasaidia kuchora thamani moja dhidi ya thamani za mchanganyiko zilizoorodheshwa, na kusababisha gharama ya mpango wa hoja ambayo ni zaidi ya nusu.
Kuondoa faharasa rudufu
Fahirisi hujilimbikiza kwa muda, na wakati mwingine faharasa mpya inaweza kuwa na ufafanuzi sawa na mojawapo ya zile zilizopita. Unaweza kutumia mwonekano wa katalogi kupata faharasa za SQL zinazoweza kusomeka na binadamu. pg_indexes
. Unaweza pia kupata ufafanuzi sawa kwa urahisi:
SELECT array_agg(indexname) AS indexes, replace(indexdef, indexname, '') AS defn
FROM pg_indexes
GROUP BY defn
HAVING count(*) > 1;
And hereβs the result when run on the stock pagila database:
pagila=# SELECT array_agg(indexname) AS indexes, replace(indexdef, indexname, '') AS defn
pagila-# FROM pg_indexes
pagila-# GROUP BY defn
pagila-# HAVING count(*) > 1;
indexes | defn
------------------------------------------------------------------------+------------------------------------------------------------------
{payment_p2017_01_customer_id_idx,idx_fk_payment_p2017_01_customer_id} | CREATE INDEX ON public.payment_p2017_01 USING btree (customer_id
{payment_p2017_02_customer_id_idx,idx_fk_payment_p2017_02_customer_id} | CREATE INDEX ON public.payment_p2017_02 USING btree (customer_id
{payment_p2017_03_customer_id_idx,idx_fk_payment_p2017_03_customer_id} | CREATE INDEX ON public.payment_p2017_03 USING btree (customer_id
{idx_fk_payment_p2017_04_customer_id,payment_p2017_04_customer_id_idx} | CREATE INDEX ON public.payment_p2017_04 USING btree (customer_id
{payment_p2017_05_customer_id_idx,idx_fk_payment_p2017_05_customer_id} | CREATE INDEX ON public.payment_p2017_05 USING btree (customer_id
{idx_fk_payment_p2017_06_customer_id,payment_p2017_06_customer_id_idx} | CREATE INDEX ON public.payment_p2017_06 USING btree (customer_id
(6 rows)
Superset Indexes
Inaweza kutokea kwamba utaishia na faharasa nyingi, moja ambayo inaashiria safu kubwa ya safu ambazo huashiria faharisi zingine. Hii inaweza au isipendekeβseti kuu inaweza kusababisha uchanganuzi wa faharasa pekee, ambayo ni nzuri, lakini inaweza kuchukua nafasi nyingi sana, au hoja ambayo kifaa kikuu kilikusudiwa kuboresha haitumiki tena.
Ikiwa unahitaji kubinafsisha ufafanuzi wa faharisi kama hizo, unaweza kuanza na pg_catalog
.
Fahirisi zisizotumika
Kadiri programu zinazotumia hifadhidata zinavyobadilika, ndivyo na maswali wanayotumia. Faharasa zilizoongezwa mapema haziwezi kutumiwa tena na swali lolote. Kila wakati faharasa inapochanganuliwa, inawekwa alama na msimamizi wa takwimu, na katika mwonekano wa katalogi ya mfumo pg_stat_user_indexes
unaweza kuona thamani idx_scan
, ambayo ni kaunta limbikizo. Kufuatilia thamani hii kwa muda fulani (sema kwa mwezi) kutatoa wazo nzuri la ni faharasa zipi hazitumiki na zinaweza kuondolewa.
Hapa kuna swali la kupata hesabu za sasa za skanisho za faharisi zote kwenye schema βpublicβ
:
SELECT relname, indexrelname, idx_scan
FROM pg_catalog.pg_stat_user_indexes
WHERE schemaname = 'public';
with output like this:
pagila=# SELECT relname, indexrelname, idx_scan
pagila-# FROM pg_catalog.pg_stat_user_indexes
pagila-# WHERE schemaname = 'public'
pagila-# LIMIT 10;
relname | indexrelname | idx_scan
---------------+--------------------+----------
customer | customer_pkey | 32093
actor | actor_pkey | 5462
address | address_pkey | 660
category | category_pkey | 1000
city | city_pkey | 609
country | country_pkey | 604
film_actor | film_actor_pkey | 0
film_category | film_category_pkey | 0
film | film_pkey | 11043
inventory | inventory_pkey | 16048
(10 rows)
Kuunda upya faharasa na kufuli chache
Fahirisi mara nyingi zinahitaji kujengwa upya, kwa mfano zinapovimba, na kujenga upya kunaweza kuharakisha utambazaji. Pia faharisi zinaweza kuharibika. Kubadilisha vigezo vya faharisi kunaweza pia kuhitaji kuijenga upya.
Washa uundaji wa faharasa sambamba
Katika PostgreSQL 11, kuunda faharisi ya B-Tree ni sawa. Ili kuharakisha mchakato wa uumbaji, wafanyakazi kadhaa wa sambamba wanaweza kutumika. Walakini, hakikisha chaguzi hizi za usanidi zimewekwa kwa usahihi:
SET max_parallel_workers = 32;
SET max_parallel_maintenance_workers = 16;
Thamani chaguo-msingi ni ndogo sana. Kwa kweli, nambari hizi zinapaswa kuongezeka pamoja na idadi ya cores za processor. Soma zaidi katika
Uundaji wa faharasa ya usuli
Unaweza kuunda faharisi chinichini kwa kutumia chaguo CONCURRENTLY
timu CREATE INDEX
:
pagila=# CREATE INDEX CONCURRENTLY idx_address1 ON address(district);
CREATE INDEX
Utaratibu huu wa uundaji wa index hutofautiana na ule wa kawaida kwa kuwa hauhitaji kufuli kwenye meza, na kwa hivyo hauzuii shughuli za uandishi. Kwa upande mwingine, inachukua muda zaidi na hutumia rasilimali zaidi.
Postgres hutoa unyumbulifu mwingi wa kuunda faharasa na njia za kushughulikia kesi zozote maalum, na pia njia za kudhibiti hifadhidata ikiwa programu yako itakua kwa kasi. Tunatumahi vidokezo hivi vitakusaidia kupata maswali yako haraka na hifadhidata yako tayari kuongezwa.
Chanzo: mapenzi.com