Nā Antipatterns PostgreSQL: e paʻi i ka JOIN kaumaha me kahi puke wehewehe
Ke hoʻomau nei mākou i ke ʻano o nā ʻatikala i hoʻolaʻa ʻia i ke aʻo ʻana i nā ala ʻike liʻiliʻi e hoʻomaikaʻi ai i ka hana o nā nīnau PostgreSQL "me he mea maʻalahi."
Mai noʻonoʻo ʻaʻole wau makemake nui iā JOIN ... :)
Akā pinepine me ka ʻole o ia mea, ua ʻoi aku ka maikaʻi o ka noi ma mua o ia. No laila i kēia lā e hoʻāʻo mākou e hoʻopau i nā kumu waiwai-intensive JOIN - ka hoʻohana ʻana i ka puke wehewehe ʻōlelo.
E hoʻomaka ana me PostgreSQL 12, hiki ke hana hou ʻia kekahi o nā kūlana i hōʻike ʻia ma lalo nei ma muli o paʻamau non-materialization CTE. Hiki ke hoʻihoʻi ʻia kēia ʻano ma ke kuhikuhi ʻana i ke kī MATERIALIZED.
Nui nā "ʻoiaʻiʻo" ma kahi huaʻōlelo palena
E lawe mākou i kahi hana noi maoli - pono mākou e hōʻike i kahi papa inoa nā memo e hiki mai ana a i ʻole nā hana ikaika me nā mea hoʻouna:
25.01 | Иванов И.И. | Подготовить описание нового алгоритма.
22.01 | Иванов И.И. | Написать статью на Хабр: жизнь без JOIN.
20.01 | Петров П.П. | Помочь оптимизировать запрос.
18.01 | Иванов И.И. | Написать статью на Хабр: JOIN с учетом распределения данных.
16.01 | Петров П.П. | Помочь оптимизировать запрос.
I loko o ka honua abstract, pono e māhele like ʻia nā mea kākau hana ma waena o nā limahana āpau o kā mākou hui, akā ʻoiaʻiʻo hele mai nā hana, ma ke ʻano he kānāwai, mai ka heluna liʻiliʻi o ka poʻe - "mai ka hoʻokele" a i ka hierarchy a i ʻole "mai nā subcontractors" mai nā keʻena e pili ana (nā loiloi, nā mea hoʻolālā, ke kūʻai aku, ...).
E ʻae kākou i loko o kā mākou hui o 1000 poʻe, he 20 wale nō mea kākau (ʻoi aku ka liʻiliʻi) i hoʻonohonoho i nā hana no kēlā me kēia mea hana a E hoʻohana kākou i kēia ʻike kumuhanae wikiwiki i ka nīnau "kuʻuna".
Mea hana palapala
-- сотрудники
CREATE TABLE person AS
SELECT
id
, repeat(chr(ascii('a') + (id % 26)), (id % 32) + 1) "name"
, '2000-01-01'::date - (random() * 1e4)::integer birth_date
FROM
generate_series(1, 1000) id;
ALTER TABLE person ADD PRIMARY KEY(id);
-- задачи с указанным распределением
CREATE TABLE task AS
WITH aid AS (
SELECT
id
, array_agg((random() * 999)::integer + 1) aids
FROM
generate_series(1, 1000) id
, generate_series(1, 20)
GROUP BY
1
)
SELECT
*
FROM
(
SELECT
id
, '2020-01-01'::date - (random() * 1e3)::integer task_date
, (random() * 999)::integer + 1 owner_id
FROM
generate_series(1, 100000) id
) T
, LATERAL(
SELECT
aids[(random() * (array_length(aids, 1) - 1))::integer + 1] author_id
FROM
aid
WHERE
id = T.owner_id
LIMIT 1
) a;
ALTER TABLE task ADD PRIMARY KEY(id);
CREATE INDEX ON task(owner_id, task_date);
CREATE INDEX ON task(author_id);
E hōʻike i nā hana 100 hope loa no kahi mea hoʻokō kikoʻī:
SELECT
task.*
, person.name
FROM
task
LEFT JOIN
person
ON person.id = task.author_id
WHERE
owner_id = 777
ORDER BY
task_date DESC
LIMIT 100;
Ke hoʻololi nei ia 1/3 ka nui o ka manawa a me 3/4 heluhelu ua hana ʻia nā ʻaoʻao o ka ʻikepili no ka ʻimi ʻana i ka mea kākau 100 mau manawa - no kēlā me kēia hana hoʻopuka. Akā ʻike mākou i waena o kēia mau haneli he 20 wale no okoa - Hiki ke hoʻohana i kēia ʻike?
hstore-huaolelo
E hoʻohana pono kākou ʻano hale kūʻai e hana i ka waiwai kī "papa ʻōlelo":
CREATE EXTENSION hstore
Pono mākou e kau i ka ID o ka mea kākau a me kona inoa i loko o ka puke wehewehe'ōlelo i hiki iā mākou ke unuhi me kēia kī:
-- формируем целевую выборку
WITH T AS (
SELECT
*
FROM
task
WHERE
owner_id = 777
ORDER BY
task_date DESC
LIMIT 100
)
-- формируем словарь для уникальных значений
, dict AS (
SELECT
hstore( -- hstore(keys::text[], values::text[])
array_agg(id)::text[]
, array_agg(name)::text[]
)
FROM
person
WHERE
id = ANY(ARRAY(
SELECT DISTINCT
author_id
FROM
T
))
)
-- получаем связанные значения словаря
SELECT
*
, (TABLE dict) -> author_id::text -- hstore -> key
FROM
T;
Hoʻohana ʻia no ka loaʻa ʻana o ka ʻike e pili ana i nā kānaka 2 manawa liʻiliʻi ka manawa a me 7 manawa liʻiliʻi i ka ʻikepili heluhelu! Ma waho aʻe o ka "hua'ōlelo", ʻo ka mea i kōkua iā mākou e hoʻokō i kēia mau hopena ka lawe ʻana i nā moʻolelo nui mai ka papaʻaina i hoʻokahi pā hoʻohana = ANY(ARRAY(...)).
Nā Palapala Papa: Serialization a Deserialization
Akā, pehea inā pono mākou e mālama ʻaʻole i hoʻokahi kahua kikokikona, akā i kahi komo holoʻokoʻa i loko o ka puke wehewehe? I kēia hihia, hiki iā PostgreSQL ke kōkua iā mākou e mālama i ke komo papa ʻaina ma ke ʻano he waiwai hoʻokahi:
...
, dict AS (
SELECT
hstore(
array_agg(id)::text[]
, array_agg(p)::text[] -- магия #1
)
FROM
person p
WHERE
...
)
SELECT
*
, (((TABLE dict) -> author_id::text)::person).* -- магия #2
FROM
T;
E nānā kākou i ka mea i hana ʻia ma ʻaneʻi:
Lawe mākou p ma ke ʻano he inoa inoa no ka hoʻokomo papa ʻaina kanaka piha a houluulu ae la ia lakou.
keia ua hoʻopaʻa hou ʻia ke ʻano o nā leo i kahi ʻano o nā kaula kikokikona (kanaka []:: kikokikona[]) e waiho i loko o ka puke wehewehe ʻōlelo hstore ma ke ʻano he ʻano waiwai.
Ke loaʻa iā mākou kahi moʻolelo pili, mākou huki ʻia mai ka puke wehewehe ʻōlelo ma ke kī ma ke ano he kaula kikokikona.
Pono mākou i kikokikona lilo i waiwai ʻano pākaukau kanaka (no kēlā me kēia pākaukau i hoʻokumu ʻia ke ʻano o ka inoa like).
"E hoʻonui" i ka moʻolelo i paʻi ʻia i loko o nā kolamu me ka hoʻohana ʻana (...).*.
json puke wehewehe
Akā ʻaʻole e holo ka hana hoʻopunipuni e like me kā mākou i noi ai ma luna inā ʻaʻohe ʻano papaʻaina kūpono e hana i ka "casting". E kū mai ke kūlana like, a inā mākou e ho'āʻo e hoʻohana he lālani CTE, ʻaʻole he papaʻaina "ʻoiaʻiʻo"..
...
, p AS ( -- это уже CTE
SELECT
*
FROM
person
WHERE
...
)
, dict AS (
SELECT
json_object( -- теперь это уже json
array_agg(id)::text[]
, array_agg(row_to_json(p))::text[] -- и внутри json для каждой строки
)
FROM
p
)
SELECT
*
FROM
T
, LATERAL(
SELECT
*
FROM
json_to_record(
((TABLE dict) ->> author_id::text)::json -- извлекли из словаря как json
) AS j(name text, birth_date date) -- заполнили нужную нам структуру
) j;
Pono e hoʻomaopopo ʻia i ka wehewehe ʻana i ka hoʻolālā pahuhopu, ʻaʻole hiki iā mākou ke papa inoa i nā kahua āpau o ke kaula kumu, akā ʻo nā mea wale nō mākou e pono ai. Inā loaʻa iā mākou kahi papaʻaina "ʻōiwi", a laila ʻoi aku ka maikaʻi o ka hoʻohana ʻana i ka hana json_populate_record.
Loaʻa mākou i ka puke wehewehe ʻōlelo i hoʻokahi manawa, akā json-[de] kiʻekiʻe nā kumukūʻai serialization, no laila, kūpono ke hoʻohana ʻana i kēia ʻano hana i kekahi mau hihia ke hōʻike ʻia ka "hoʻopono" CTE Scan iā ia iho.
Ke hoao ana i ka hana
No laila, ua loaʻa iā mākou ʻelua ala e hoʻokaʻawale i ka ʻikepili i loko o kahi puke wehewehe ʻōlelo - hstore/json_object. Eia kekahi, hiki ke hana ʻia nā ʻāpana o nā kī a me nā waiwai iā lākou iho i ʻelua ala, me ka hoʻololi ʻana i loko a i waho paha i ka kikokikona: array_agg(i::text) / array_agg(i)::text[].
E nānā kākou i ka pono o nā ʻano serialization me ka hoʻohana ʻana i kahi hiʻohiʻona synthetic maʻemaʻe - hoʻokaʻawale i nā helu like ʻole o nā kī:
WITH dict AS (
SELECT
hstore(
array_agg(i::text)
, array_agg(i::text)
)
FROM
generate_series(1, ...) i
)
TABLE dict;
Palapala loiloi: serialization
WITH T AS (
SELECT
*
, (
SELECT
regexp_replace(ea[array_length(ea, 1)], '^Execution Time: (d+.d+) ms$', '1')::real et
FROM
(
SELECT
array_agg(el) ea
FROM
dblink('port= ' || current_setting('port') || ' dbname=' || current_database(), $$
explain analyze
WITH dict AS (
SELECT
hstore(
array_agg(i::text)
, array_agg(i::text)
)
FROM
generate_series(1, $$ || (1 << v) || $$) i
)
TABLE dict
$$) T(el text)
) T
) et
FROM
generate_series(0, 19) v
, LATERAL generate_series(1, 7) i
ORDER BY
1, 2
)
SELECT
v
, avg(et)::numeric(32,3)
FROM
T
GROUP BY
1
ORDER BY
1;
Ma PostgreSQL 11, a hiki i kahi puke wehewehe ʻōlelo o 2^12 kī liʻiliʻi ka manawa o ka serialization iā json. I kēia hihia, ʻoi aku ka maikaʻi o ka hui ʻana o ka json_object a me ka hoʻololi ʻano "loko". array_agg(i::text).
I kēia manawa e hoʻāʻo kāua e heluhelu i ka waiwai o kēlā me kēia kī 8 mau manawa - ma hope o nā mea a pau, inā ʻaʻole ʻoe e komo i ka puke wehewehe ʻōlelo, no ke aha e pono ai?
Palapala loiloi: heluhelu mai ka puke wehewehe ʻōlelo
WITH T AS (
SELECT
*
, (
SELECT
regexp_replace(ea[array_length(ea, 1)], '^Execution Time: (d+.d+) ms$', '1')::real et
FROM
(
SELECT
array_agg(el) ea
FROM
dblink('port= ' || current_setting('port') || ' dbname=' || current_database(), $$
explain analyze
WITH dict AS (
SELECT
json_object(
array_agg(i::text)
, array_agg(i::text)
)
FROM
generate_series(1, $$ || (1 << v) || $$) i
)
SELECT
(TABLE dict) -> (i % ($$ || (1 << v) || $$) + 1)::text
FROM
generate_series(1, $$ || (1 << (v + 3)) || $$) i
$$) T(el text)
) T
) et
FROM
generate_series(0, 19) v
, LATERAL generate_series(1, 7) i
ORDER BY
1, 2
)
SELECT
v
, avg(et)::numeric(32,3)
FROM
T
GROUP BY
1
ORDER BY
1;
A... ua kokoke me nā kī 2^6, e hoʻomaka ana ka heluhelu ʻana mai kahi puke wehewehe ʻōlelo json i nā manawa he nui Heluhelu mai ka hstore, no ka jsonb ua like ia ma 2^9.
Nā hopena hope loa:
inā pono ʻoe e hana E HUI me nā moʻolelo hou - ʻoi aku ka maikaʻi o ka hoʻohana ʻana i ka "papapalapala" o ka papaʻaina
inā manaʻo ʻia kāu puke wehewehe liʻiliʻi a ʻaʻole ʻoe e heluhelu nui mai ia mea - hiki iā ʻoe ke hoʻohana i ka json[b]
i nā hihia ʻē aʻe a pau hstore + array_agg(i:: kikokikona) e ʻoi aku ka maikaʻi