Clearing clone records from a table without a PK

Sometimes a table without a primary key or any other unique index ends up, through an oversight, with complete duplicates of records that already existed.

For example, the values of a time-series metric are written to PostgreSQL through a COPY stream, then an unexpected failure occurs, and part of the data arrives a second time, completely identical to what is already stored.

How do we rid the database of clones we do not need?

When a PK does not help

The easiest way is to keep such a situation from happening in the first place, for example by creating a PRIMARY KEY. But that is not always possible without increasing the volume of stored data.

For example, when the precision of the source system is higher than the precision of the field in the database:

metric   | ts                  | data
--------------------------------------------------
cpu.busy | 2019-12-20 00:00:00 | {"value" : 12.34}
cpu.busy | 2019-12-20 00:00:01 | {"value" : 10}
cpu.busy | 2019-12-20 00:00:01 | {"value" : 11.2}
cpu.busy | 2019-12-20 00:00:03 | {"value" : 15.7}

Did you notice? The reading that belongs to 00:00:02 was written to the database with a ts one second earlier, yet it is still perfectly valid from the application's point of view (after all, the data values are different!).

Of course, you can create PK(metric, ts) - but then inserts of valid data will run into conflicts.

You can create PK(metric, ts, data) - but that will greatly increase the size of the key, and we will never actually use it.

So the most sensible option is to create a regular, non-unique index on (metric, ts) and deal with problems after the fact, if they arise.
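As a rough illustration of that choice, the schema could look something like the sketch below. The table name metric_values and the column types are assumptions made for this example; the article only shows the three columns metric, ts and data.

```sql
-- Hypothetical table name and column types, picked to match the sample rows above.
CREATE TABLE metric_values(
  metric text
, ts     timestamp
, data   json
);

-- A regular (non-unique) index: it speeds up lookups by (metric, ts)
-- without rejecting two valid readings that landed on the same second.
CREATE INDEX ON metric_values(metric, ts);
```

With this layout nothing stops exact duplicates from being inserted, which is precisely the situation the rest of the article cleans up.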

"The clone war has begun"

Some kind of accident has happened, and now we have to destroy the clone records in the table.

Let's model the original data:

CREATE TABLE tbl(k text, v integer);

INSERT INTO tbl
VALUES
  ('a', 1)
, ('a', 3)
, ('b', 2)
, ('b', 2) -- oops!
, ('c', 3)
, ('c', 3) -- oops!!
, ('c', 3) -- oops!!
, ('d', 4)
, ('e', 5)
;

Here our hand slipped three times, Ctrl+V got stuck, and now...

First, keep in mind that the table may be very large, so once we have found all the clones, we would like to literally "point a finger" at the specific records to delete, without searching for them a second time.

And there is such a way - addressing rows by ctid, the physical identifier of a specific record.

So, first of all, we need to collect the ctids of the records, grouped by the full content of the table row. The simplest option is to cast the whole row to text:

SELECT
  T::text
, array_agg(ctid) ctids
FROM
  tbl T
GROUP BY
  1;

t     | ctids
---------------------------------
(e,5) | {"(0,9)"}
(d,4) | {"(0,8)"}
(c,3) | {"(0,5)","(0,6)","(0,7)"}
(b,2) | {"(0,3)","(0,4)"}
(a,3) | {"(0,2)"}
(a,1) | {"(0,1)"}

Is it possible to skip the cast? In principle, yes, in most cases. Until you start using columns in this table whose types have no equality operator:

CREATE TABLE tbl(k text, v integer, x point);
SELECT
  array_agg(ctid) ctids
FROM
  tbl T
GROUP BY
  T;
-- ERROR:  could not identify an equality operator for type tbl

Right, we can see immediately that if an array contains more than one entry, everything after the first one is a clone. Let's keep only those:

SELECT
  unnest(ctids[2:])
FROM
  (
    SELECT
      array_agg(ctid) ctids
    FROM
      tbl T
    GROUP BY
      T::text
  ) T;

unnest
------
(0,6)
(0,7)
(0,4)

For those who like to write more compactly, the same thing can be written like this:

SELECT
  unnest((array_agg(ctid))[2:])
FROM
  tbl T
GROUP BY
  T::text;

Since the value of the serialized row itself is of no interest to us, we simply dropped it from the columns returned by the subquery.

Only a little is left to do - make DELETE use the set we obtained:

DELETE FROM
  tbl
WHERE
  ctid = ANY(ARRAY(
    SELECT
      unnest(ctids[2:])
    FROM
      (
        SELECT
          array_agg(ctid) ctids
        FROM
          tbl T
        GROUP BY
          T::text
      ) T
  )::tid[]);

Let's check ourselves:

[query plan, see explain.tensor.ru]

Yes, everything is right: our 3 records were selected in a single Seq Scan of the whole table, and the Delete node located the rows in a single pass with a Tid Scan:

->  Tid Scan on tbl (actual time=0.050..0.051 rows=3 loops=1)
      TID Cond: (ctid = ANY ($0))
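Besides reading the plan, the result can be double-checked with a query along the following lines. This is only a sketch that reuses the same T::text grouping as above; the column alias copies is introduced here purely for illustration.

```sql
-- List every row content that still occurs more than once.
SELECT
  T::text
, count(*) AS copies
FROM
  tbl T
GROUP BY
  1
HAVING
  count(*) > 1;
-- An empty result means every remaining row is unique.
```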

If you have cleaned out a lot of records, do not forget to run VACUUM ANALYZE.
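A minimal sketch of that step for the sample table is shown below. The follow-up query against the pg_stat_user_tables statistics view is an optional extra, not part of the original walkthrough, and its counters are updated asynchronously, so the numbers may lag slightly.

```sql
-- Reclaim the space of the deleted row versions and refresh planner statistics.
VACUUM ANALYZE tbl;

-- Optional: look at the estimated live and dead tuple counts for the table.
SELECT
  n_live_tup
, n_dead_tup
FROM
  pg_stat_user_tables
WHERE
  relname = 'tbl';
```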

Let's try it on a larger table with a larger number of duplicates:

TRUNCATE TABLE tbl;

INSERT INTO tbl
SELECT
  chr(ascii('a'::text) + floor(random() * 26)::integer) k -- a..z
, floor(random() * 100)::integer v -- 0..99
FROM
  generate_series(1, 10000) i;

[query plan, see explain.tensor.ru]

So the method works, but it should be used with some caution: for every deleted record, one data page is read in the Tid Scan and one more in the Delete.
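One way to see these page reads on your own data is to run the statement under EXPLAIN with the BUFFERS option. The sketch below assumes the same tbl table and wraps the statement in a transaction that is rolled back, because EXPLAIN ANALYZE really executes the DELETE.

```sql
BEGIN;

-- ANALYZE runs the statement for real; BUFFERS adds per-node page counts.
EXPLAIN (ANALYZE, BUFFERS)
DELETE FROM
  tbl
WHERE
  ctid = ANY(ARRAY(
    SELECT
      unnest((array_agg(ctid))[2:])
    FROM
      tbl T
    GROUP BY
      T::text
  )::tid[]);

-- Undo the deletion: this run was only meant to measure the cost.
ROLLBACK;
```

The "Buffers:" lines on the Tid Scan and Delete nodes show how many pages each of them touched.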

Source: www.habr.com
