PostgreSQL Antipatterns: Hitting a Heavy JOIN with a Dictionary
We continue the series of articles devoted to little-known ways of improving the performance of "seemingly simple" PostgreSQL queries.
But a query often turns out to be much more efficient without a JOIN than with one. So today we will try to get rid of a resource-intensive JOIN with the help of a dictionary.
Starting with PostgreSQL 12, some of the situations described below may reproduce slightly differently, because CTEs are no longer materialized by default. The old behavior can be restored by specifying the MATERIALIZED keyword.
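As a minimal sketch of that keyword: the CTE body below is a placeholder series, not one of the queries from this article, and the syntax is accepted only by PostgreSQL 12 or later.

```sql
-- PostgreSQL 12+: ask for the CTE to be computed once as a separate result,
-- which is how every CTE behaved in version 11 and earlier.
WITH T AS MATERIALIZED (
  SELECT
    i
  FROM
    generate_series(1, 100) i
)
SELECT
  count(*)
FROM
  T
WHERE
  i % 2 = 0;
```

The same keyword can be added to the WITH T AS ( ... ) and dict AS ( ... ) clauses in the queries below when running them on version 12 or later.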
Many "facts" over a limited dictionary
Let's take a very real application task: we need to display a list of incoming messages or active tasks together with their senders:
25.01 | Ivanov I.I. | Prepare a description of the new algorithm.
22.01 | Ivanov I.I. | Write an article for Habr: life without JOIN.
20.01 | Petrov P.P. | Help optimize a query.
18.01 | Ivanov I.I. | Write an article for Habr: JOIN with data distribution in mind.
16.01 | Petrov P.P. | Help optimize a query.
In an abstract world, task authors would be evenly distributed among all the employees of our organization, but in reality tasks come, as a rule, from a fairly limited number of people: "from management" up the hierarchy, or "from peers" in adjacent departments (analysts, designers, marketing, ...).
Let's assume that in our organization of 1000 people only 20 authors (usually even fewer) set tasks for any particular assignee, and let's use this domain knowledge to speed up the "traditional" query.
Generator script
-- employees
CREATE TABLE person AS
SELECT
id
, repeat(chr(ascii('a') + (id % 26)), (id % 32) + 1) "name"
, '2000-01-01'::date - (random() * 1e4)::integer birth_date
FROM
generate_series(1, 1000) id;
ALTER TABLE person ADD PRIMARY KEY(id);
-- tasks with the distribution described above
CREATE TABLE task AS
WITH aid AS (
SELECT
id
, array_agg((random() * 999)::integer + 1) aids
FROM
generate_series(1, 1000) id
, generate_series(1, 20)
GROUP BY
1
)
SELECT
*
FROM
(
SELECT
id
, '2020-01-01'::date - (random() * 1e3)::integer task_date
, (random() * 999)::integer + 1 owner_id
FROM
generate_series(1, 100000) id
) T
, LATERAL(
SELECT
aids[(random() * (array_length(aids, 1) - 1))::integer + 1] author_id
FROM
aid
WHERE
id = T.owner_id
LIMIT 1
) a;
ALTER TABLE task ADD PRIMARY KEY(id);
CREATE INDEX ON task(owner_id, task_date);
CREATE INDEX ON task(author_id);
Let's display the last 100 tasks for a specific assignee:
SELECT
task.*
, person.name
FROM
task
LEFT JOIN
person
ON person.id = task.author_id
WHERE
owner_id = 777
ORDER BY
task_date DESC
LIMIT 100;
It turns out that 1/3 of the total time and 3/4 of the data page reads were spent only on looking up the author 100 times, once for each output task. But we know that among this hundred there are only 20 distinct authors. Can we use this knowledge?
hstore dictionary
Let's use the hstore type to build a key-value "dictionary":
CREATE EXTENSION hstore;
We just need to put each author's ID and name into the dictionary, so that we can later extract the name by this key:
-- build the target selection
WITH T AS (
SELECT
*
FROM
task
WHERE
owner_id = 777
ORDER BY
task_date DESC
LIMIT 100
)
-- build a dictionary for the unique values
, dict AS (
SELECT
hstore( -- hstore(keys::text[], values::text[])
array_agg(id)::text[]
, array_agg(name)::text[]
)
FROM
person
WHERE
id = ANY(ARRAY(
SELECT DISTINCT
author_id
FROM
T
))
)
-- fetch the associated dictionary values
SELECT
*
, (TABLE dict) -> author_id::text -- hstore -> key
FROM
T;
Fetching the person data took 2 times less time and read 7 times less data! Besides the "dictionary" itself, what also helped us achieve these results was retrieving the records from the table in bulk, in a single pass, using = ANY(ARRAY(...)).
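The dictionary mechanics can be tried in isolation. The following is only a sketch: the two inline rows and the list of author_id values are made-up stand-ins for the person table and the T selection, and it assumes the hstore extension can be installed in the current database.

```sql
CREATE EXTENSION IF NOT EXISTS hstore;

WITH dict AS (
  SELECT
    hstore(                      -- hstore(keys::text[], values::text[])
      array_agg(id)::text[]
    , array_agg(name)::text[]
    )
  FROM
    (VALUES (1, 'alice'), (2, 'bob')) p(id, name)  -- hypothetical data
)
SELECT
  author_id
, (TABLE dict) -> author_id::text AS author_name   -- hstore -> key
FROM
  unnest(ARRAY[2, 1, 2]) author_id;                -- hypothetical lookups
```

Each author_id is resolved through the single hstore value built once in dict, so a repeated key (2 here) costs no extra access to the source rows.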
Table records: serialization and deserialization
But what if we need to store in the dictionary not just one text field, but a whole record? In that case, PostgreSQL's ability to treat a table record as a single value will help us:
...
, dict AS (
SELECT
hstore(
array_agg(id)::text[]
, array_agg(p)::text[] -- magic #1
)
FROM
person p
WHERE
...
)
SELECT
*
, (((TABLE dict) -> author_id::text)::person).* -- magic #2
FROM
T;
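The round trip behind the two "magic" casts can be checked on its own. This is a sketch under assumptions: demo_person is a hypothetical temporary table standing in for person, with one made-up row.

```sql
-- demo_person is a hypothetical stand-in for the article's person table
CREATE TEMP TABLE demo_person(id integer, name text, birth_date date);
INSERT INTO demo_person VALUES (1, 'alice', DATE '1990-05-17');

SELECT
  p::text AS serialized          -- the whole row rendered as one text value
, ((p::text)::demo_person).*     -- the text cast back to the row type and expanded
FROM
  demo_person p;
```

The serialized column holds the row in PostgreSQL's composite-value text form, and the second expression restores the typed columns id, name and birth_date from that text.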
Let's break down what happens in the dictionary query above:
We took p as an alias for the full record of the person table and aggregated those records into an array.
This array of records was cast to an array of text strings (person[]::text[]) so that it could be put into the hstore dictionary as the array of values.
When we fetch the associated record, we pull it out of the dictionary by key as a text string.
We need to turn that text into a value of the table type person (a type of the same name is created automatically for every table).
We "expand" the typed record into columns using (...).*.
json dictionary
But the trick we used above will not work if there is no corresponding table type to "cast" to. Exactly the same situation arises if we try to use a row of a CTE, rather than of a "real" table, as the source for serialization.
...
, p AS ( -- this is now a CTE
SELECT
*
FROM
person
WHERE
...
)
, dict AS (
SELECT
json_object( -- now this is json
array_agg(id)::text[]
, array_agg(row_to_json(p))::text[] -- with json inside for each row
)
FROM
p
)
SELECT
*
FROM
T
, LATERAL(
SELECT
*
FROM
json_to_record(
((TABLE dict) ->> author_id::text)::json -- extracted from the dictionary as json
) AS j(name text, birth_date date) -- filled in the structure we need
) j;
Note that when describing the target structure, we do not have to list all the fields of the source row, only the ones we really need. If we do have a "native" table, it is better to use the json_populate_record function.
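A minimal sketch of json_populate_record follows. It assumes a hypothetical temporary table demo_author in place of a "native" table such as person, and a hand-written json value in place of one extracted from the dictionary.

```sql
-- demo_author is a hypothetical table standing in for a "native" table
CREATE TEMP TABLE demo_author(id integer, name text, birth_date date);

SELECT
  j.*
FROM
  json_populate_record(
    NULL::demo_author   -- only supplies the target row type
  , '{"id": 7, "name": "hhh", "birth_date": "1985-03-02"}'::json
  ) j;
```

Unlike json_to_record, no column definition list is needed here: the column names and types are taken from the row type of the first argument.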
We still access the dictionary only once, but the cost of json [de]serialization is quite high, so it is reasonable to use this approach only in cases where an "honest" CTE Scan performs worse.
Performance testing
So, we have two ways to serialize data into a dictionary: hstore / json_object. In addition, the arrays of keys and values themselves can be generated in two ways, with an inner or an outer conversion to text: array_agg(i::text) / array_agg(i)::text[].
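The two conversion variants produce the same array and differ only in where the cast is applied, as this small sketch on a three-element series shows:

```sql
SELECT
  array_agg(i::text)   AS inner_cast  -- each element is converted before aggregation
, array_agg(i)::text[] AS outer_cast  -- an integer[] is built first, then cast as a whole
FROM
  generate_series(1, 3) i;
```

Both columns contain the text array {1,2,3}, so either form can be passed to hstore(text[], text[]) or json_object(text[], text[]).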
Let's check the efficiency of the different serialization variants on a purely synthetic example, serializing different numbers of keys:
WITH dict AS (
SELECT
hstore(
array_agg(i::text)
, array_agg(i::text)
)
FROM
generate_series(1, ...) i
)
TABLE dict;
Evaluation script: serialization
WITH T AS (
SELECT
*
, (
SELECT
regexp_replace(ea[array_length(ea, 1)], '^Execution Time: (\d+\.\d+) ms$', '\1')::real et
FROM
(
SELECT
array_agg(el) ea
FROM
dblink('port= ' || current_setting('port') || ' dbname=' || current_database(), $$
explain analyze
WITH dict AS (
SELECT
hstore(
array_agg(i::text)
, array_agg(i::text)
)
FROM
generate_series(1, $$ || (1 << v) || $$) i
)
TABLE dict
$$) T(el text)
) T
) et
FROM
generate_series(0, 19) v
, LATERAL generate_series(1, 7) i
ORDER BY
1, 2
)
SELECT
v
, avg(et)::numeric(32,3)
FROM
T
GROUP BY
1
ORDER BY
1;
On PostgreSQL 11, up to a dictionary size of roughly 2^12 keys, serialization to json takes less time. The most efficient combination in that range is json_object with the "inner" type conversion array_agg(i::text).
Now let's try reading the value of each key 8 times. After all, if you never access the dictionary, why do you need it?
Evaluation script: reading from the dictionary
WITH T AS (
SELECT
*
, (
SELECT
regexp_replace(ea[array_length(ea, 1)], '^Execution Time: (\d+\.\d+) ms$', '\1')::real et
FROM
(
SELECT
array_agg(el) ea
FROM
dblink('port= ' || current_setting('port') || ' dbname=' || current_database(), $$
explain analyze
WITH dict AS (
SELECT
json_object(
array_agg(i::text)
, array_agg(i::text)
)
FROM
generate_series(1, $$ || (1 << v) || $$) i
)
SELECT
(TABLE dict) -> (i % ($$ || (1 << v) || $$) + 1)::text
FROM
generate_series(1, $$ || (1 << (v + 3)) || $$) i
$$) T(el text)
) T
) et
FROM
generate_series(0, 19) v
, LATERAL generate_series(1, 7) i
ORDER BY
1, 2
)
SELECT
v
, avg(et)::numeric(32,3)
FROM
T
GROUP BY
1
ORDER BY
1;
And... already at roughly 2^6 keys, reading from a json dictionary starts to lose to reading from hstore by a factor of several; for jsonb the same happens at 2^9.
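The jsonb variant of the test is not listed in this article. A plausible sketch, assuming it simply swaps json_object for jsonb_object, builds and reads the dictionary like this (the 8-key series is an arbitrary illustration):

```sql
WITH dict AS (
  SELECT
    jsonb_object(          -- jsonb_object(keys text[], values text[])
      array_agg(i::text)
    , array_agg(i::text)
    )
  FROM
    generate_series(1, 8) i
)
SELECT
  (TABLE dict) ->  '5'::text AS as_jsonb  -- value returned as jsonb
, (TABLE dict) ->> '5'::text AS as_text;  -- value returned as text
```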
Final conclusions:
if you need to JOIN against records that repeat many times, it is better to build a "dictionary" from the table
if your dictionary is expected to be small and you will not read from it much, you can use json[b]
in all other cases hstore + array_agg(i::text) will be more efficient