Share bayanan clone daga tebur ba tare da PK ba

Akwai yanayi lokacin zuwa tebur ba tare da maɓalli na farko ba ko wasu fihirisar keɓantacce, saboda sa ido, an haɗa cikakkun abubuwan da aka riga aka ambata.

Share bayanan clone daga tebur ba tare da PK ba

Misali, ana rubuta ƙimar ma'auni na lokaci-lokaci a cikin PostgreSQL ta amfani da rafin COPY, sannan akwai gazawar kwatsam, kuma wani ɓangare na bayanan iri ɗaya ya sake zuwa.

Yadda za a kawar da bayanan bayanan da ba dole ba clones?

Lokacin da PK ba mai taimako ba ne

Hanya mafi sauki ita ce hana irin wannan yanayin faruwa tun da farko. Misali, mirgine maɓalli na PRIMARY. Amma wannan ba koyaushe yana yiwuwa ba tare da ƙara yawan bayanan da aka adana ba.

Misali, idan daidaiton tsarin tushen ya fi ingancin filin da ke cikin rumbun adana bayanai:

metric   | ts                  | data
--------------------------------------------------
cpu.busy | 2019-12-20 00:00:00 | {"value" : 12.34}
cpu.busy | 2019-12-20 00:00:01 | {"value" : 10}
cpu.busy | 2019-12-20 00:00:01 | {"value" : 11.2}
cpu.busy | 2019-12-20 00:00:03 | {"value" : 15.7}

Kun lura? The Countdown maimakon 00:00:02 an yi rikodin a cikin database tare da ts da biyu a baya, amma ya kasance mai inganci daga ra'ayi na aikace-aikacen (bayan haka, ƙimar bayanan sun bambanta!).

Tabbas zaka iya PK (metric, ts) - amma za mu sami rikice-rikicen shigarwa don ingantaccen bayanai.

Iya yi PK (metric, ts, bayanai) - amma wannan zai kara girma sosai, wanda ba za mu yi amfani da shi ba.

Sabili da haka, zaɓi mafi daidai shine yin ƙididdiga na yau da kullun marasa ƙima (metric, ts) da kuma magance matsalolin bayan gaskiyar idan sun taso.

"Yakin clonic ya fara"

Wani irin haɗari ya faru, kuma yanzu dole ne mu lalata bayanan clone daga tebur.

Share bayanan clone daga tebur ba tare da PK ba

Bari mu gwada ainihin bayanan:

CREATE TABLE tbl(k text, v integer);

INSERT INTO tbl
VALUES
  ('a', 1)
, ('a', 3)
, ('b', 2)
, ('b', 2) -- oops!
, ('c', 3)
, ('c', 3) -- oops!!
, ('c', 3) -- oops!!
, ('d', 4)
, ('e', 5)
;

Anan hannunmu yayi rawar jiki sau uku, Ctrl+V ya makale, kuma yanzu...

Da farko, bari mu fahimci cewa tebur ɗinmu na iya zama babba, don haka bayan mun sami duk clones, yana da kyau mu fara “ƙulla yatsa” don sharewa. takamaiman bayanai ba tare da sake bincika su ba.

Kuma akwai irin wannan hanya - wannan magana ta ctid, mai gano zahiri na takamaiman rikodin.

Wato, da farko, muna buƙatar tattara ctid na records a cikin mahallin cikakken abun ciki na jere na tebur. Zaɓin mafi sauƙi shine a jefa layin gaba ɗaya cikin rubutu:

SELECT
  T::text
, array_agg(ctid) ctids
FROM
  tbl T
GROUP BY
  1;

t     | ctids
---------------------------------
(e,5) | {"(0,9)"}
(d,4) | {"(0,8)"}
(c,3) | {"(0,5)","(0,6)","(0,7)"}
(b,2) | {"(0,3)","(0,4)"}
(a,3) | {"(0,2)"}
(a,1) | {"(0,1)"}

Shin zai yiwu ba a jefa ba?A ka'ida, yana yiwuwa a mafi yawan lokuta. Har sai kun fara amfani da filayen cikin wannan tebur iri ba tare da ma'aikacin daidaito ba:

CREATE TABLE tbl(k text, v integer, x point);
SELECT
  array_agg(ctid) ctids
FROM
  tbl T
GROUP BY
  T;
-- ERROR:  could not identify an equality operator for type tbl

Ee, nan da nan mun ga cewa idan akwai shigarwa fiye da ɗaya a cikin tsararru, waɗannan duka clones ne. Mu bar su kawai:

SELECT
  unnest(ctids[2:])
FROM
  (
    SELECT
      array_agg(ctid) ctids
    FROM
      tbl T
    GROUP BY
      T::text
  ) T;

unnest
------
(0,6)
(0,7)
(0,4)

Ga masu son rubuta gajartaHakanan zaka iya rubuta shi kamar haka:

SELECT
  unnest((array_agg(ctid))[2:])
FROM
  tbl T
GROUP BY
  T::text;

Tun da darajar kirtani da kanta ba ta da ban sha'awa a gare mu, kawai mun jefa shi daga cikin ginshiƙan da aka dawo da su.

Akwai saura kaɗan a yi - yi DELETE amfani da saitin da muka samu:

DELETE FROM
  tbl
WHERE
  ctid = ANY(ARRAY(
    SELECT
      unnest(ctids[2:])
    FROM
      (
        SELECT
          array_agg(ctid) ctids
        FROM
          tbl T
        GROUP BY
          T::text
      ) T
  )::tid[]);

Mu duba kanmu:

Share bayanan clone daga tebur ba tare da PK ba
[duba bayanin.tensor.ru]

Ee, duk abin da yake daidai: an zaɓi bayanan mu 3 don Seq Scan kawai na duka tebur, kuma an yi amfani da kumburin Share don bincika bayanai. wucewa guda tare da Tid Scan:

->  Tid Scan on tbl (actual time=0.050..0.051 rows=3 loops=1)
      TID Cond: (ctid = ANY ($0))

Idan kun share bayanai da yawa, kar a manta da gudanar da VACUUM ANALYZE.

Bari mu bincika tebur mafi girma kuma tare da adadi mai yawa na kwafi:

TRUNCATE TABLE tbl;

INSERT INTO tbl
SELECT
  chr(ascii('a'::text) + (random() * 26)::integer) k -- a..z
, (random() * 100)::integer v -- 0..99
FROM
  generate_series(1, 10000) i;

Share bayanan clone daga tebur ba tare da PK ba
[duba bayanin.tensor.ru]

Don haka, hanyar tana aiki cikin nasara, amma dole ne a yi amfani da ita tare da taka tsantsan. Domin duk wani faifan da aka goge, akwai shafin bayanai guda daya da ake karantawa a cikin Tid Scan, daya kuma a cikin Delete.

source: www.habr.com

Add a comment