ื ื™ืงื•ื™ ืจืฉื•ืžื•ืช ืฉื™ื‘ื•ื˜ ืžื˜ื‘ืœื” ืœืœื PK

ื™ืฉ ืžืฆื‘ื™ื ืฉื‘ื”ื ืœื˜ื‘ืœื” ืœืœื ืžืคืชื— ืจืืฉื™ ืื• ืื™ื ื“ืงืก ื™ื™ื—ื•ื“ื™ ืื—ืจ, ืขืงื‘ ืคื™ืงื•ื—, ื›ืœื•ืœื™ื ืฉื™ื‘ื•ื˜ื™ื ืฉืœืžื™ื ืฉืœ ืจืฉื•ืžื•ืช ืฉื›ื‘ืจ ืงื™ื™ืžื•ืช.

ื ื™ืงื•ื™ ืจืฉื•ืžื•ืช ืฉื™ื‘ื•ื˜ ืžื˜ื‘ืœื” ืœืœื PK

ืœื“ื•ื’ืžื”, ื”ืขืจื›ื™ื ืฉืœ ืžื“ื“ ื›ืจื•ื ื•ืœื•ื’ื™ ื ื›ืชื‘ื™ื ืœืชื•ืš PostgreSQL ื‘ืืžืฆืขื•ืช ื–ืจื COPY, ื•ืื– ื™ืฉ ื›ืฉืœ ืคืชืื•ืžื™, ื•ื—ืœืง ืžื”ื ืชื•ื ื™ื ื”ื–ื”ื™ื ืœื—ืœื•ื˜ื™ืŸ ืžื’ื™ืข ืฉื•ื‘.

ืื™ืš ื ืคื˜ืจื™ื ืžื”ืžืื’ืจ ืžืฉื‘ื˜ื™ื ืžื™ื•ืชืจื™ื?

ื›ืฉืค"ืง ืื™ื ื• ืขื•ื–ืจ

ื”ื“ืจืš ื”ืงืœื” ื‘ื™ื•ืชืจ ื”ื™ื ืœืžื ื•ืข ืžืฆื‘ ื›ื–ื” ืžืœื›ืชื—ื™ืœื”. ืœื“ื•ื’ืžื”, ื’ืœื’ืœ ืืช PRIMARY KEY. ืื‘ืœ ื–ื” ืœื ืชืžื™ื“ ืืคืฉืจื™ ื‘ืœื™ ืœื”ื’ื“ื™ืœ ืืช ื ืคื— ื”ื ืชื•ื ื™ื ื”ืžืื•ื—ืกื ื™ื.

ืœื“ื•ื’ืžื”, ืื ื”ื“ื™ื•ืง ืฉืœ ืžืขืจื›ืช ื”ืžืงื•ืจ ื’ื‘ื•ื” ืžื”ื“ื™ื•ืง ืฉืœ ื”ืฉื“ื” ื‘ืžืกื“ ื”ื ืชื•ื ื™ื:

metric   | ts                  | data
--------------------------------------------------
cpu.busy | 2019-12-20 00:00:00 | {"value" : 12.34}
cpu.busy | 2019-12-20 00:00:01 | {"value" : 10}
cpu.busy | 2019-12-20 00:00:01 | {"value" : 11.2}
cpu.busy | 2019-12-20 00:00:03 | {"value" : 15.7}

ื”ืื ืฉืžืชื ืœื‘? ื”ืกืคื™ืจื” ืœืื—ื•ืจ ื‘ืžืงื•ื 00:00:02 ื ืจืฉืžื” ื‘ืžืกื“ ื”ื ืชื•ื ื™ื ืขื ts ืฉื ื™ื” ืงื•ื“ื ืœื›ืŸ, ืื‘ืœ ื ืฉืืจื” ื“ื™ ืชืงืคื” ืžื ืงื•ื“ืช ืžื‘ื˜ ืฉืœ ื™ื™ืฉื•ื (ืื—ืจื™ ื”ื›ืœ, ืขืจื›ื™ ื”ื ืชื•ื ื™ื ืฉื•ื ื™ื!).

ื›ืžื•ื‘ืŸ ืฉืืชื” ื™ื›ื•ืœ ืœืขืฉื•ืช ืืช ื–ื” PK(ืžื˜ืจื™, ts) - ืื‘ืœ ืื– ื ืงื‘ืœ ื”ืชื ื’ืฉื•ื™ื•ืช ื”ื›ื ืกื” ืขื‘ื•ืจ ื ืชื•ื ื™ื ื—ื•ืงื™ื™ื.

ื™ื›ื•ืœ ืœืขืฉื•ืช PK(metric, ts, data) - ืื‘ืœ ื–ื” ื™ื’ื“ื™ืœ ืžืื•ื“ ืืช ื”ื ืคื— ืฉืœื•, ืฉืœื ื ืฉืชืžืฉ ื‘ื•.

ืœื›ืŸ, ื”ืืคืฉืจื•ืช ื”ื ื›ื•ื ื” ื‘ื™ื•ืชืจ ื”ื™ื ืœืขืฉื•ืช ืื™ื ื“ืงืก ืจื’ื™ืœ ืฉืื™ื ื• ื™ื™ื—ื•ื“ื™ (ืžื“ื“, ts) ื•ืœื”ืชืžื•ื“ื“ ืขื ื‘ืขื™ื•ืช ืœืื—ืจ ืžืขืฉื” ืื ื”ืŸ ืื›ืŸ ืžืชืขื•ืจืจื•ืช.

"ื”ืžืœื—ืžื” ื”ืงืœื•ื ื™ืช ื”ื—ืœื”"

ืงืจืชื” ืื™ื–ื•ืฉื”ื™ ืชืื•ื ื”, ื•ืขื›ืฉื™ื• ืื ื—ื ื• ืฆืจื™ื›ื™ื ืœื”ืฉืžื™ื“ ืืช ืจื™ืฉื•ืžื™ ื”ืฉื™ื‘ื•ื˜ ืžื”ืฉื•ืœื—ืŸ.

ื ื™ืงื•ื™ ืจืฉื•ืžื•ืช ืฉื™ื‘ื•ื˜ ืžื˜ื‘ืœื” ืœืœื PK

ื‘ื•ืื• ื ืขืฆื‘ ืืช ื”ื ืชื•ื ื™ื ื”ืžืงื•ืจื™ื™ื:

CREATE TABLE tbl(k text, v integer);

INSERT INTO tbl
VALUES
  ('a', 1)
, ('a', 3)
, ('b', 2)
, ('b', 2) -- oops!
, ('c', 3)
, ('c', 3) -- oops!!
, ('c', 3) -- oops!!
, ('d', 4)
, ('e', 5)
;

ื›ืืŸ ื”ื™ื“ ืฉืœื ื• ืจืขื“ื” ืฉืœื•ืฉ ืคืขืžื™ื, Ctrl+V ื ืชืงืขื”, ื•ืขื›ืฉื™ื•...

ืจืืฉื™ืช, ื‘ื•ืื• ื ื‘ื™ืŸ ืฉื”ืฉื•ืœื—ืŸ ืฉืœื ื• ื™ื›ื•ืœ ืœื”ื™ื•ืช ื’ื“ื•ืœ ืžืื•ื“, ืื– ืื—ืจื™ ืฉืžืฆืื ื• ืืช ื›ืœ ื”ืฉื™ื‘ื•ื˜ื™ื, ืžื•ืžืœืฅ ืœื ื• ืžืžืฉ "ืœืชืงื•ืข ื‘ืืฆื‘ืข" ื›ื“ื™ ืœืžื—ื•ืง ืจืฉื•ืžื•ืช ืกืคืฆื™ืคื™ื•ืช ืžื‘ืœื™ ืœื—ืคืฉ ืื•ืชืŸ ืžื—ื“ืฉ.

ื•ื™ืฉ ื“ืจืš ื›ื–ื• - ื–ื• ื”ืชื™ื™ื—ืกื•ืช ืขืœ ื™ื“ื™ ctid, ื”ืžื–ื”ื” ื”ืคื™ื–ื™ ืฉืœ ืจืฉื•ืžื” ืกืคืฆื™ืคื™ืช.

ื›ืœื•ืžืจ, ืงื•ื“ื ื›ืœ, ืื ื—ื ื• ืฆืจื™ื›ื™ื ืœืืกื•ืฃ ืืช ctid ืฉืœ ืจืฉื•ืžื•ืช ื‘ื”ืงืฉืจ ืฉืœ ื”ืชื•ื›ืŸ ื”ืžืœื ืฉืœ ืฉื•ืจืช ื”ื˜ื‘ืœื”. ื”ืืคืฉืจื•ืช ื”ืคืฉื•ื˜ื” ื‘ื™ื•ืชืจ ื”ื™ื ืœื”ื˜ื™ืœ ืืช ื›ืœ ื”ืฉื•ืจื” ืœื˜ืงืกื˜:

SELECT
  T::text
, array_agg(ctid) ctids
FROM
  tbl T
GROUP BY
  1;

t     | ctids
---------------------------------
(e,5) | {"(0,9)"}
(d,4) | {"(0,8)"}
(c,3) | {"(0,5)","(0,6)","(0,7)"}
(b,2) | {"(0,3)","(0,4)"}
(a,3) | {"(0,2)"}
(a,1) | {"(0,1)"}

ื”ืื ืืคืฉืจ ืœื ืœืœื”ืง?ื‘ืื•ืคืŸ ืขืงืจื•ื ื™, ื–ื” ืืคืฉืจื™ ื‘ืจื•ื‘ ื”ืžืงืจื™ื. ืขื“ ืฉืชืชื—ื™ืœ ืœื”ืฉืชืžืฉ ื‘ืฉื“ื•ืช ื‘ื˜ื‘ืœื” ื–ื• ืกื•ื’ื™ื ืœืœื ืžืคืขื™ืœ ืฉื•ื•ื™ื•ืŸ:

CREATE TABLE tbl(k text, v integer, x point);
SELECT
  array_agg(ctid) ctids
FROM
  tbl T
GROUP BY
  T;
-- ERROR:  could not identify an equality operator for type tbl

ื›ืŸ, ืื ื—ื ื• ืจื•ืื™ื ืžื™ื“ ืฉืื ื™ืฉ ื™ื•ืชืจ ืžืขืจืš ืื—ื“ ื‘ืžืขืจืš, ื›ืœ ืืœื” ื”ื ืฉื™ื‘ื•ื˜ื™ื. ื‘ื•ืื• ืคืฉื•ื˜ ื ืฉืื™ืจ ืื•ืชื:

SELECT
  unnest(ctids[2:])
FROM
  (
    SELECT
      array_agg(ctid) ctids
    FROM
      tbl T
    GROUP BY
      T::text
  ) T;

unnest
------
(0,6)
(0,7)
(0,4)

ืœืžื™ ืฉืื•ื”ื‘ ืœื›ืชื•ื‘ ืงืฆืจ ื™ื•ืชืจืืชื” ื™ื›ื•ืœ ื’ื ืœื›ืชื•ื‘ ืืช ื–ื” ื›ืš:

SELECT
  unnest((array_agg(ctid))[2:])
FROM
  tbl T
GROUP BY
  T::text;

ืžื›ื™ื•ื•ืŸ ืฉื”ืขืจืš ืฉืœ ื”ืžื—ืจื•ื–ืช ื‘ืกื™ื“ืจื” ืขืฆืžื” ืœื ืžืขื ื™ื™ืŸ ืื•ืชื ื•, ืคืฉื•ื˜ ื–ืจืงื ื• ืื•ืชื• ืžื”ืขืžื•ื“ื•ืช ื”ืžื•ื—ื–ืจื•ืช ืฉืœ ื”ืžืฉื ื”.

ื ื•ืชืจ ืจืง ืขื•ื“ ืžืขื˜ ืœืขืฉื•ืช - ืœื’ืจื•ื ืœ-DELETE ืœื”ืฉืชืžืฉ ื‘ืกื˜ ืฉืงื™ื‘ืœื ื•:

DELETE FROM
  tbl
WHERE
  ctid = ANY(ARRAY(
    SELECT
      unnest(ctids[2:])
    FROM
      (
        SELECT
          array_agg(ctid) ctids
        FROM
          tbl T
        GROUP BY
          T::text
      ) T
  )::tid[]);

ื‘ื•ืื• ื ื‘ื“ื•ืง ืืช ืขืฆืžื ื•:

ื ื™ืงื•ื™ ืจืฉื•ืžื•ืช ืฉื™ื‘ื•ื˜ ืžื˜ื‘ืœื” ืœืœื PK
[ื”ืกืชื›ืœ ื‘-explain.tensor.ru]

ื›ืŸ, ื”ื›ืœ ื ื›ื•ืŸ: 3 ื”ืจืฉื•ืžื•ืช ืฉืœื ื• ื ื‘ื—ืจื• ืœ-Seq Scan ื”ื™ื—ื™ื“ื” ืฉืœ ื›ืœ ื”ื˜ื‘ืœื”, ื•ื”ืฆื•ืžืช Delete ืฉื™ืžืฉ ืœื—ื™ืคื•ืฉ ื ืชื•ื ื™ื ืžืขื‘ืจ ื‘ื•ื“ื“ ืขื Tid Scan:

->  Tid Scan on tbl (actual time=0.050..0.051 rows=3 loops=1)
      TID Cond: (ctid = ANY ($0))

ืื ื ื™ืงื™ืช ื”ืจื‘ื” ืจืฉื•ืžื•ืช, ืืœ ืชืฉื›ื— ืœื”ืคืขื™ืœ ืืช VACUUM ANALYZE.

ื‘ื•ืื• ื ื‘ื“ื•ืง ืื ื™ืฉ ื˜ื‘ืœื” ื’ื“ื•ืœื” ื™ื•ืชืจ ื•ืขื ืžืกืคืจ ื’ื“ื•ืœ ื™ื•ืชืจ ืฉืœ ื›ืคื™ืœื•ื™ื•ืช:

TRUNCATE TABLE tbl;

INSERT INTO tbl
SELECT
  chr(ascii('a'::text) + (random() * 26)::integer) k -- a..z
, (random() * 100)::integer v -- 0..99
FROM
  generate_series(1, 10000) i;

ื ื™ืงื•ื™ ืจืฉื•ืžื•ืช ืฉื™ื‘ื•ื˜ ืžื˜ื‘ืœื” ืœืœื PK
[ื”ืกืชื›ืœ ื‘-explain.tensor.ru]

ืื–, ื”ืฉื™ื˜ื” ืขื•ื‘ื“ืช ื‘ื”ืฆืœื—ื”, ืื‘ืœ ื™ืฉ ืœื”ืฉืชืžืฉ ื‘ื” ื‘ื–ื”ื™ืจื•ืช ืžืกื•ื™ืžืช. ื›ื™ ืขืœ ื›ืœ ืจืฉื•ืžื” ืฉื ืžื—ืงืช, ื™ืฉ ื“ืฃ ื ืชื•ื ื™ื ืื—ื“ ืฉื ืงืจื ื‘-Tid Scan, ื•ืื—ื“ ื‘-Delete.

ืžืงื•ืจ: www.habr.com

ื”ื•ืกืคืช ืชื’ื•ื‘ื”