Replacing EAV with JSONB in PostgreSQL

TL;DR: JSONB can greatly simplify database schema design without sacrificing query performance.

Introduction

Let us take what is probably one of the oldest use cases in the world of relational databases: we have an entity, and we need to store certain properties (attributes) of that entity. But not all entities have the same set of properties, and more properties may be added in the future.

The simplest way to solve this problem is to create a column in the table for every property, and fill in only the ones that are relevant for a given entity. Great! Problem solved... until your table contains millions of records and you need to add a new property.

Consider the EAV (Entity-Attribute-Value) pattern, which comes up quite often. One table contains the entities (records), another table contains the property names (attributes), and a third table links the entities to their attributes and holds the value of each attribute for the given entity. This lets you have different sets of properties for different entities, and add properties on the fly without changing the database structure.

However, I would not be writing this post if the EAV approach had no downsides. For example, selecting one or more entities based on the value of 1 attribute requires 2 joins in the query: the first with the attribute table, the second with the value table. If you filter on 2 attributes, you need 4 joins! In addition, all attribute values are usually stored as strings, which forces type casts both in the result and in the WHERE clause. If you write a lot of queries, this is very wasteful in terms of resources.
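To make the type-casting point concrete, here is a rough sketch of an EAV query that filters on a numeric attribute. It assumes the three EAV tables defined later in this article and an attribute named width; the threshold 3 is arbitrary. Because every value is stored as TEXT, both the result and the comparison need a cast.

```sql
-- Sketch only: table and column names follow the EAV schema used in this article.
SELECT e.name,
       eav.value::numeric AS width   -- cast needed in the result
FROM entity e
  INNER JOIN entity_attribute_value eav ON e.id = eav.entity_id
  INNER JOIN entity_attribute ea ON eav.entity_attribute_id = ea.id
WHERE ea.name = 'width'
  -- Cast needed in the WHERE clause too. The CASE guard makes sure that
  -- values of other attributes (for example 'red') are never cast to numeric.
  AND CASE WHEN ea.name = 'width' THEN eav.value::numeric END > 3;
```

The CASE guard is a defensive choice: without it, PostgreSQL is free to evaluate the cast on rows belonging to other attributes, where it can fail on non-numeric text.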

Despite these obvious shortcomings, EAV has long been used to solve this kind of problem. It was a necessary evil, and there simply was no better alternative.
But then a new "technology" appeared in PostgreSQL...

Starting with PostgreSQL 9.4, the JSONB data type was added for storing JSON data in a binary format. Although storing JSON in this format usually takes slightly more space and time than plain-text JSON, running operations on it is much faster. JSONB also supports indexing, which makes queries faster still.

The JSONB data type lets us replace the cumbersome EAV pattern by adding just one JSONB column to our entity table, which greatly simplifies the database design. But many argue that this must come at the cost of performance... That is why I wrote this article.

Setting up the test database

For this comparison, I created a database on a fresh PostgreSQL 9.5 installation on an $80 DigitalOcean Ubuntu 14.04 server. After tuning some settings in postgresql.conf, I ran this script using psql. The following tables were created to represent the data in EAV form:

CREATE TABLE entity ( 
  id           SERIAL PRIMARY KEY, 
  name         TEXT, 
  description  TEXT
);
CREATE TABLE entity_attribute (
  id          SERIAL PRIMARY KEY, 
  name        TEXT
);
CREATE TABLE entity_attribute_value (
  id                  SERIAL PRIMARY KEY, 
  entity_id           INT    REFERENCES entity(id), 
  entity_attribute_id INT    REFERENCES entity_attribute(id), 
  value               TEXT
);

Below is the table that stores the same data, but with the attributes in a JSONB column named properties.

CREATE TABLE entity_jsonb (
  id          SERIAL PRIMARY KEY, 
  name        TEXT, 
  description TEXT,
  properties  JSONB
);

Looks a lot simpler, doesn't it? Next, 10 million records were added to the entity tables (entity and entity_jsonb), and the same data was loaded for both approaches: the EAV schema and the schema with the JSONB column, entity_jsonb.properties. This way, the property set covers several different data types. Sample data:

{
  "id": 1,
  "name": "Entity1",
  "description": "Test entity no. 1",
  "properties": {
    "color": "red",
    "length": 120,
    "width": 3.1882420,
    "hassomething": true,
    "country": "Belgium"
  }
}
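As an illustration of how one such entity maps onto each schema, the sketch below inserts a few of the sample properties into both layouts. The author's actual load script is not reproduced here, so these statements are an assumption about what the data looks like, not the script itself. The EAV part assumes attribute and entity names are unique, which the schema does not enforce.

```sql
-- JSONB layout: one row, properties keep their JSON types.
INSERT INTO entity_jsonb (name, description, properties)
VALUES ('Entity1', 'Test entity no. 1',
        '{"color": "red", "width": 3.1882420, "hassomething": true, "country": "Belgium"}');

-- EAV layout: one entity row, one row per attribute name,
-- and one value row per (entity, attribute) pair, all stored as TEXT.
INSERT INTO entity (name, description)
VALUES ('Entity1', 'Test entity no. 1');

INSERT INTO entity_attribute (name)
VALUES ('color'), ('width'), ('hassomething'), ('country');

INSERT INTO entity_attribute_value (entity_id, entity_attribute_id, value)
SELECT e.id, a.id, v.value
FROM (VALUES ('color', 'red'),
             ('width', '3.1882420'),
             ('hassomething', 'true'),
             ('country', 'Belgium')) AS v(name, value)
  INNER JOIN entity_attribute a ON a.name = v.name
  INNER JOIN entity e ON e.name = 'Entity1';
```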

So now we have the same data stored in both forms. Let's start comparing the two implementations in practice!

Simplifying your design

It was stated earlier that the database design becomes simpler: one table with a JSONB column for the properties, instead of three tables for EAV. But how does this show up in the queries? Updating a single property of an entity looks like this:

-- EAV
UPDATE entity_attribute_value 
SET value = 'blue' 
WHERE entity_attribute_id = 1 
  AND entity_id = 120;

-- JSONB
UPDATE entity_jsonb 
SET properties = jsonb_set(properties, '{"color"}', '"blue"') 
WHERE id = 120;

As you can see, the latter query does not look any simpler. To update a property value inside a JSONB object we have to use the jsonb_set() function, and we have to pass the new value as a JSONB value (hence the double quotes inside the single quotes). However, we do not need to know any identifiers in advance. In the EAV example, we need to know both the entity_id and the entity_attribute_id to perform the update. If you want to update a property in the JSONB column based on the property name, it is all done in one simple line.
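To illustrate the point about identifiers, here is a sketch of what the EAV update might look like when only the attribute name is known, followed by an alternative JSONB form. Both are illustrations rather than queries from the benchmark. The subquery assumes attribute names are unique; the || operator, like jsonb_set(), is available from PostgreSQL 9.5.

```sql
-- EAV: resolve the attribute id by name inside the UPDATE.
-- Fails if more than one attribute is named 'color'.
UPDATE entity_attribute_value
SET value = 'blue'
WHERE entity_id = 120
  AND entity_attribute_id = (SELECT id
                             FROM entity_attribute
                             WHERE name = 'color');

-- JSONB alternative: merge a one-key object into the existing properties.
-- An existing "color" key is overwritten, other keys are left untouched.
UPDATE entity_jsonb
SET properties = properties || '{"color": "blue"}'::jsonb
WHERE id = 120;
```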

Now let's select the entity we just updated, based on its new color:

-- EAV
SELECT e.name 
FROM entity e 
  INNER JOIN entity_attribute_value eav ON e.id = eav.entity_id
  INNER JOIN entity_attribute ea ON eav.entity_attribute_id = ea.id
WHERE ea.name = 'color' AND eav.value = 'blue';

-- JSONB
SELECT name 
FROM entity_jsonb 
WHERE properties ->> 'color' = 'blue';

I think we can agree that the second one is shorter (no joins!) and therefore more readable. JSONB wins here! We use the JSON ->> operator to get the color as a text value from the JSONB object. There is also a second way to achieve the same result in the JSONB model, using the @> containment operator:

-- JSONB 
SELECT name 
FROM entity_jsonb 
WHERE properties @> '{"color": "blue"}';

This is a bit more complex: we check whether the JSON object in the properties column contains the object on the right-hand side of the @> operator. Less readable, more performant (see below).
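The containment check can be tried on literals without touching any table. The statements below are a small sketch of how @> behaves; the literal values are made up for illustration.

```sql
-- The left object contains the right one: true.
SELECT '{"color": "blue", "country": "Belgium"}'::jsonb @> '{"color": "blue"}'::jsonb;

-- Same key, different value: false.
SELECT '{"color": "blue", "country": "Belgium"}'::jsonb @> '{"color": "red"}'::jsonb;

-- Types matter: the number 3 does not match the string "3", so this is false.
SELECT '{"width": 3}'::jsonb @> '{"width": "3"}'::jsonb;

-- Several properties can be matched in a single containment test.
SELECT name
FROM entity_jsonb
WHERE properties @> '{"color": "blue", "country": "Belgium"}';
```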

The simplification that JSONB brings is even greater when you need to select several properties at once. This is where the JSONB approach really shines: we simply select the properties as extra columns in our result set, with no joins needed:

-- JSONB 
SELECT name
  , properties ->> 'color'
  , properties ->> 'country'
FROM entity_jsonb 
WHERE id = 120;

With EAV you would need 2 joins for every property you want to query. In my opinion, the queries above show a great simplification in database design. For more examples of how to write JSONB queries, see also this post.
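For comparison, a sketch of the EAV counterpart of the two-property query above might look like this. It is an illustration, not a query from the benchmark; the aliases v1, a1, v2 and a2 are arbitrary.

```sql
-- EAV: two joins per property, so four joins for color and country.
-- With INNER JOINs the entity is dropped from the result if it lacks
-- either attribute; LEFT JOINs would be needed to return NULLs instead.
SELECT e.name,
       v1.value AS color,
       v2.value AS country
FROM entity e
  INNER JOIN entity_attribute_value v1 ON v1.entity_id = e.id
  INNER JOIN entity_attribute a1 ON a1.id = v1.entity_attribute_id
                                AND a1.name = 'color'
  INNER JOIN entity_attribute_value v2 ON v2.entity_id = e.id
  INNER JOIN entity_attribute a2 ON a2.id = v2.entity_attribute_id
                                AND a2.name = 'country'
WHERE e.id = 120;
```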
Now it is time to talk about performance.

Performance

To compare performance, I used EXPLAIN ANALYZE on the queries above to measure execution time. Each query was run at least three times, because the query planner takes longer the first time. At first I ran the queries without any indexes. Obviously, this favors JSONB, since the joins required for EAV cannot use indexes (the foreign key columns are not indexed). After that, I created an index on the 2 foreign key columns of the EAV value table, as well as a GIN index on the JSONB column.
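The index definitions are not shown in the article, so the following is only a sketch of what that setup might look like. The index names are made up, and whether the author used two single-column indexes or one composite index is an assumption.

```sql
-- Hypothetical index names; one index per foreign key column of the value table.
CREATE INDEX eav_entity_id_idx
  ON entity_attribute_value (entity_id);
CREATE INDEX eav_entity_attribute_id_idx
  ON entity_attribute_value (entity_attribute_id);

-- GIN index on the JSONB column (default operator class).
CREATE INDEX entity_jsonb_properties_idx
  ON entity_jsonb USING GIN (properties);

-- Timing a SELECT: the plan and the actual execution time are printed.
EXPLAIN ANALYZE
SELECT name
FROM entity_jsonb
WHERE properties ->> 'color' = 'blue';

-- EXPLAIN ANALYZE really executes the statement, so an UPDATE can be
-- wrapped in a transaction and rolled back to keep the data unchanged.
BEGIN;
EXPLAIN ANALYZE
UPDATE entity_jsonb
SET properties = jsonb_set(properties, '{"color"}', '"blue"')
WHERE id = 120;
ROLLBACK;
```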

Updating the data produced the following execution times (in ms). Note that the scale is logarithmic:

[Chart: execution times for updating data, in ms, logarithmic scale; EAV vs JSONB, without and with indexes]

We see that JSONB is much (> 50000x) faster than EAV when no indexes are used, for the reason explained above. Once the foreign key columns are indexed, the difference almost disappears, but JSONB is still 1.3 times faster than EAV. Note that the index on the JSONB column has no effect here, since we do not use the properties column in the WHERE criteria.

For selecting data based on a property value, we get the following results (linear scale):

[Chart: execution times for selecting data by property value, in ms, linear scale; EAV vs JSONB, without and with indexes]

You can see that JSONB is again faster than EAV without indexes, but with indexes EAV is faster than JSONB. Then I noticed that the timings of the JSONB queries were the same with and without the index, which told me that the GIN index was not being used. Apparently, a GIN index on a populated JSONB column is not used for comparisons with ->>; it takes effect when you use the containment operator @>. I used that operator in a new test and it had a huge effect on the timing: only 0.153 ms! That is 15000 times faster than EAV and 25000 times faster than the ->> operator.

I think that is fast enough!
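As a sketch of the difference between the two operators, the first query below is the form that a GIN index on properties can serve. The second part shows a different technique that was not part of this benchmark: a B-tree expression index, which is one way to let the ->> form use an index as well. The index name is made up, and no timings are claimed for it.

```sql
-- Containment form: can be answered using the GIN index on properties.
EXPLAIN ANALYZE
SELECT name
FROM entity_jsonb
WHERE properties @> '{"color": "blue"}';

-- Not tested in this article: a B-tree index on the extracted text value.
-- It only helps queries that use exactly this expression.
CREATE INDEX entity_jsonb_color_idx
  ON entity_jsonb ((properties ->> 'color'));

EXPLAIN ANALYZE
SELECT name
FROM entity_jsonb
WHERE properties ->> 'color' = 'blue';
```

The trade-off is that an expression index has to be created per property, while a single GIN index covers containment queries on any key.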

Database table size

Let's compare the table sizes for the two approaches. In psql we can show the size of all tables and indexes using the \dti+ command.
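Outside psql, roughly the same information can be obtained with PostgreSQL's size functions. The query below is a sketch that assumes the four table names used in this article.

```sql
-- Table size and total index size per table, in human-readable form.
SELECT c.relname,
       pg_size_pretty(pg_table_size(c.oid))   AS table_size,
       pg_size_pretty(pg_indexes_size(c.oid)) AS index_size
FROM pg_class c
WHERE c.relkind = 'r'   -- ordinary tables only
  AND c.relname IN ('entity', 'entity_attribute',
                    'entity_attribute_value', 'entity_jsonb')
ORDER BY c.relname;
```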

[Screenshot: psql \dti+ output listing the sizes of the tables and indexes]

For the EAV model, the tables add up to about 3068 MB and the indexes to 3427 MB, for a total of about 6.34 GB. The JSONB approach uses 1817 MB for the table and 318 MB for the indexes, which is 2.08 GB. That is about 3 times less! This surprised me a little, because we store the property names in every JSONB object.

Still, the numbers speak for themselves: in EAV we store 2 integer foreign keys per attribute value, which adds 8 bytes of extra data. In addition, EAV stores all property values as text, while JSONB uses numeric and boolean values internally where possible, resulting in a smaller footprint.
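The typed storage can be observed with jsonb_typeof(). The sketch below runs on a literal built from a few of the sample properties, so it does not depend on the test tables.

```sql
-- Lists each key with the JSON type JSONB keeps for its value:
-- color is a string, width a number, hassomething a boolean.
SELECT key,
       jsonb_typeof(value) AS json_type
FROM jsonb_each(
  '{"color": "red", "width": 3.1882420, "hassomething": true}'::jsonb
);
```

In the EAV schema the same three values would all sit in the TEXT column value, with nothing recording their original type.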

Conclusion

Overall, I think storing entity properties in JSONB format can make designing and maintaining your database much easier. If you run a lot of queries, having everything stored in the same table as the entity works really well. The fact that it simplifies interacting with the data is already a plus, and on top of that the resulting database is 3 times smaller.

Also, based on the tests performed, we can conclude that the performance penalty is insignificant. In some cases JSONB is even faster than EAV, which makes it better still. However, this benchmark of course does not cover every aspect (for example entities with a very large number of properties, or a large increase in the number of properties of existing data), so if you have any suggestions on how to improve it, please feel free to leave them in the comments!

source: www.habr.com
