Cassandra. How not to die if you only know Oracle

Hello, Habr.

My name is Misha Butrimov, and I would like to tell you a little about Cassandra. My story will be useful to those who have never worked with NoSQL databases: Cassandra has many implementation details and pitfalls that you need to know about. And if you have never seen anything other than Oracle or some other relational database, these things will save your life.

What is good about Cassandra? It is a NoSQL database designed without a single point of failure, and it scales well. If you need to add a couple of terabytes to a database, you simply add nodes to the ring. Expand it into another data center? Add nodes to the cluster. Increase the number of requests per second it handles? Add nodes to the cluster. It works in the opposite direction too.


What else is it good at? Handling a lot of requests. But how many is a lot? 10, 20, 30, 40 thousand requests per second is not a lot. 100 thousand write requests per second is not a lot either. There are companies that say they sustain 2 million requests per second. We will probably have to take their word for it.

And fundamentally, Cassandra differs greatly from relational databases: it is not like them at all. This is very important to remember.

Not everything that looks the same works the same

One day a colleague came to me and asked: "Here is the CQL query language for Cassandra. It has a select statement, it has where, it has and. I type the letters and it does not work. Why?" Treating Cassandra like a relational database is the perfect way to commit a violent suicide. And I am not promoting it, it is prohibited in Russia. You will simply design something wrong.

For example, a customer comes to us and says: "Let's build a database for TV series, or a database for a recipe directory. We will have dishes there, or a list of series and the actors in them." We say happily: "Let's go!" Just send a couple of bytes, create a couple of tables and you are done; everything will work quickly and reliably. And everything is fine until the customers come back and say that housewives are also solving the inverse problem: they have a list of ingredients, and they want to know which dish they can cook from it. You are dead.

This is because Cassandra is a hybrid database: it is a key-value store and, at the same time, it stores data in wide columns. In Java or Kotlin, this could be described as follows:

Map<RowKey, SortedMap<ColumnKey, ColumnValue>>

That is, a map that contains a sorted map inside it. The first key of this map is the Row key, or Partition key. The second key, which is the key of the inner, already sorted map, is the Clustering key.
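The same shape can be sketched in Python. This is only an in-memory illustration of the "map of sorted maps" idea, not how Cassandra stores data on disk; the names `table`, `put` and `scan_partition` and the sample rows are made up for the example.

```python
# Hypothetical model: partition key -> {clustering key -> value}.
table = {}

def put(partition_key, clustering_key, value):
    """Write a value; writing the same pair of keys again overwrites it."""
    table.setdefault(partition_key, {})[clustering_key] = value

def scan_partition(partition_key):
    """Return the rows of one partition, ordered by clustering key."""
    return sorted(table.get(partition_key, {}).items())

put(1990, (50000.0, "u2"), "Bob")
put(1990, (30000.0, "u1"), "Alice")
put(1985, (70000.0, "u3"), "Carol")

# The outer lookup is a plain hash lookup; the inner rows come back sorted.
names_1990 = [name for _, name in scan_partition(1990)]
print(names_1990)
```

The outer dictionary plays the role of the partition key lookup, and the sorted inner items play the role of the clustering key order.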

To illustrate how the database is distributed, let's draw three nodes. Now we need to understand how to spread the data across the nodes. If we cram everything into one of them (by the way, there can be a thousand, two thousand, five thousand, as many as you like), that is not really distribution. So we need a mathematical function that returns a number. Just a number, a long int that falls into some range. Then one node is responsible for one range, the second node for the second range, and the nth node for the nth.


This number is obtained with a hash function applied to what we call the Partition key. This is the column named in the Primary key directive, and it is the column that becomes the first and most basic key of the map. It determines which node receives which data. A table is created in Cassandra with almost the same syntax as in SQL:

CREATE TABLE users (
	user_id uuid,
	name text,
	year int,
	salary float,
	PRIMARY KEY(user_id)
);

The Primary key in this case consists of a single column, which is also the Partition key.

Where will our users end up? Some will go to one node, some to another, and some to the third. The result is an ordinary hash table, also known as a Map, also known as a Dictionary in Python, or a simple key-value structure from which we can read all the values, and read and write by key.
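The idea of "hash the partition key, then find the node that owns that range" can be sketched as follows. This is a simplified illustration: the node names are hypothetical, each node owns one equal slice of the range, and MD5 is used only as a stand-in hash (Cassandra's default partitioner is Murmur3-based).

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical node names
RING_SIZE = 2 ** 64                     # size of the token range in this sketch

def token(partition_key: str) -> int:
    """Map a partition key to a number in [0, RING_SIZE)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def node_for(partition_key: str) -> str:
    """Pick the node whose slice of the token range contains the key's token."""
    slice_size = RING_SIZE // len(NODES)
    index = min(token(partition_key) // slice_size, len(NODES) - 1)
    return NODES[index]

for user_id in ["u1", "u2", "u3", "u4"]:
    print(user_id, "->", node_for(user_id))
```

The same key always hashes to the same token, so reads and writes by key go straight to the node that owns it.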


Select: when allow filtering turns into a full scan, or what not to do

Let's write a select statement: select * from users where user_id = ... It turns out just like in Oracle: we write select, specify the conditions, and everything works; the user is found. But if you try to select, for example, users with a certain year of birth, Cassandra complains that it cannot execute the query. It knows nothing about how the data is distributed by year of birth, because only one column is declared as the key. Then it says: "All right, I can still execute this query. Add allow filtering." We add the directive and everything works. And at this moment something terrible happens.

When we run the query on test data, everything is fine. But when we run it in production, where we have, say, 4 million records, things are not so good. Allow filtering is a directive that lets Cassandra collect all the data of this table from all nodes and all data centers (if the cluster has several), and only then filter it. This is the equivalent of a Full Scan, and hardly anyone is happy with that.
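The difference between a lookup by key and a filtered query can be sketched with a toy model. The `nodes` dictionary, the sample users and the function names are assumptions made for this illustration; the point is only how many rows each approach has to read.

```python
# Hypothetical cluster: three nodes, each holding its own slice of the users table.
nodes = {
    "node-a": {"u1": {"name": "Alice", "year": 1990}},
    "node-b": {"u2": {"name": "Bob", "year": 1985}},
    "node-c": {"u3": {"name": "Carol", "year": 1990}},
}

def get_by_id(owner, user_id):
    """Lookup by partition key: one node, one hash lookup."""
    return nodes[owner].get(user_id)

def filter_by_year(year):
    """What filtering amounts to in this model: read every row on every node."""
    rows_read = 0
    matches = []
    for rows in nodes.values():
        for user in rows.values():
            rows_read += 1
            if user["year"] == year:
                matches.append(user["name"])
    return matches, rows_read

matches, rows_read = filter_by_year(1990)
print(matches, "found after reading", rows_read, "rows")
```

In this model the cost of `filter_by_year` grows with the size of the whole table, while `get_by_id` stays a single lookup regardless of how many rows exist.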

If we only needed to look up users by ID, this would be enough. But sometimes we need to write other queries and impose other restrictions on the selection. So let's remember: all of this is a map that has a partition key, but inside it there is a sorted map.

And that inner map has a key too, which we call the Clustering Key. This key, in turn, consists of columns that we choose, and it tells Cassandra how its data is physically sorted and how it will lie on each node. That is, for a given Partition key, the Clustering key says exactly how to push the data into this tree and which place it will occupy there.

It really is a tree: a comparator is simply called there, to which we pass a certain set of columns in the form of an object, and that set is likewise specified as a list of columns.

CREATE TABLE users_by_year_salary_id (
	user_id uuid,
	name text,
	year int,
	salary float,
	PRIMARY KEY((year), salary, user_id)
);

Pay attention to the Primary key directive: its first argument (in our case, year) is always the Partition key. It can consist of one column or several, it does not matter. If there are several columns, they must be wrapped in an extra pair of parentheses so that the language preprocessor understands that this is the Partition key and that all the columns after it are the Clustering key. The clustering columns are passed to the comparator in the order in which they are listed. That is, the first column is the most significant, the second is less significant, and so on. It is similar to how we write comparison for data classes: we list the fields and define for them which value is greater and which is smaller. In Cassandra these columns are, roughly speaking, the fields of a data class to which the comparator written for it is applied.
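The data-class analogy can be sketched directly in Python. This is only an illustration of field-by-field comparison; the class name `ClusteringKey` is made up, and `user_id` is a plain string here rather than a uuid to keep the example short.

```python
from dataclasses import dataclass

@dataclass(order=True, frozen=True)
class ClusteringKey:
    # Fields are compared in declaration order, like clustering columns.
    salary: float  # compared first (most significant)
    user_id: str   # compared only when salaries are equal

keys = [
    ClusteringKey(50000.0, "u2"),
    ClusteringKey(30000.0, "u9"),
    ClusteringKey(50000.0, "u1"),
]

# Sorting uses salary first and falls back to user_id for ties.
ordered = sorted(keys)
for key in ordered:
    print(key.salary, key.user_id)
```

Swapping the two field declarations would change the resulting order, which mirrors why the order of columns in the Primary key directive matters.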

We set the sorting and impose restrictions

Remember that the sort order (descending, ascending, whatever) is set at the moment the key is created and cannot be changed afterwards. It physically determines how the data is sorted and how it is stored. If you need to change the Clustering key or the sort order, you have to create a new table and move the data into it. It cannot be done on an existing table.


We filled our table with users and saw that they were laid out around the ring, first by year of birth, and then, within each node, by salary and user ID. Now we can select by imposing restrictions.

Once again our query has a working where and and, we get our users, and everything is fine again. But if we try to use only a part of the Clustering key, and a less significant part at that, Cassandra immediately complains that it cannot find the place in our map where this object lies, an object whose other fields are null for the comparator and which has only this one field set. It would have to pull all the data from that node again and filter it. And this is the equivalent of a Full Scan within a node, which is bad.
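Why the leading clustering column is cheap to query and a trailing one is not can be sketched with a sorted list and binary search. The rows and function names are made up for illustration; this models a single partition (one year) sorted by (salary, user_id).

```python
from bisect import bisect_left, bisect_right

# One hypothetical partition, rows sorted by (salary, user_id).
partition = sorted([
    (30000.0, "u1"), (45000.0, "u7"), (50000.0, "u2"),
    (50000.0, "u5"), (80000.0, "u3"),
])
salaries = [salary for salary, _ in partition]  # the leading column, already sorted

def by_salary(salary):
    """Leading clustering column: binary search finds one contiguous slice."""
    lo = bisect_left(salaries, salary)
    hi = bisect_right(salaries, salary)
    return partition[lo:hi]

def by_user_id(user_id):
    """Trailing column only: sort order does not help, every row is checked."""
    return [row for row in partition if row[1] == user_id]

print(by_salary(50000.0))
print(by_user_id("u7"))
```

Both functions return correct results, but only `by_salary` can exploit the sort order; `by_user_id` has to walk the whole partition.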

In any unclear situation, create a new table

If we want to be able to look up users by ID, or by age, or by salary, what should we do? Nothing special. Just use two tables. If you need to reach users in three different ways, there will be three tables. Gone are the days when we saved space on the hard drive. Disk is the cheapest resource. It costs far less than response time, which can be fatal for the user. It is much more pleasant for a user to get something in a second than in 10 minutes.
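The "one table per way of querying" approach can be sketched as writing each user into several structures at once. The three dictionaries and the `save_user` helper are hypothetical names for this illustration; each "table" keeps its own copy of the row, which is the price paid in disk space.

```python
# Hypothetical "tables", one per access pattern.
users_by_id = {}      # user_id -> row
users_by_year = {}    # year -> {(salary, user_id) -> row}
users_by_salary = {}  # salary -> {user_id -> row}

def save_user(user_id, name, year, salary):
    """Write the same user into every table, each with its own copy of the row."""
    row = {"user_id": user_id, "name": name, "year": year, "salary": salary}
    users_by_id[user_id] = dict(row)
    users_by_year.setdefault(year, {})[(salary, user_id)] = dict(row)
    users_by_salary.setdefault(salary, {})[user_id] = dict(row)

save_user("u1", "Alice", 1990, 30000.0)
save_user("u2", "Bob", 1990, 50000.0)

# Each question is answered by a direct lookup in the table built for it.
print(users_by_id["u2"]["name"])
print([row["name"] for _, row in sorted(users_by_year[1990].items())])
print(list(users_by_salary[30000.0]))
```

Every write becomes several writes, but every read stays a lookup by key instead of a scan.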

We trade redundant space and denormalized data for the ability to scale well and work reliably. In fact, a cluster of three data centers with five nodes each, at an acceptable level of data safety (when nothing is lost), can survive the complete death of one data center, plus two more nodes in each of the remaining two. Only after that do the problems begin. This is quite good redundancy, and it is worth a couple of extra SSD drives and processors. So, to use Cassandra, which is not SQL and has no relations or foreign keys, you need to know a few simple rules.

We design everything starting from the query. The main thing is not the data but how the application will work with it. If it needs to fetch different data in different ways, or the same data in different ways, we have to lay it out in a way that is convenient for the application. Otherwise we will fall into a Full Scan, and Cassandra will give us no advantage at all.

Denormalizing data is the norm. We forget about normal forms; we no longer have a relational database. If we write something down 100 times, it will be stored 100 times. That is still cheaper than downtime.

We choose partition keys so that the data is distributed evenly. We do not want the hashes of our keys to fall into one narrow range. That is, the year of birth in the example above is a bad choice. More precisely, it is fine if our users are spread evenly by year of birth, and bad if we are talking about 5th grade students: the partitioning there will not be very good.

Sorting is chosen once, when the Clustering Key is created. If it needs to change, we will have to reload our table under a different key.

And most importantly: if we need to fetch the same data in 100 different ways, then we will have 100 different tables.
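The rule about evenly distributed partition keys can be checked with a small counting sketch. The data set is invented for the illustration (1000 students born in just two hypothetical years), and `partition_sizes` is a made-up helper that counts how many rows land in each partition for a given choice of key.

```python
from collections import Counter

def partition_sizes(rows, key):
    """Count how many rows fall into each partition for the chosen key."""
    return Counter(key(row) for row in rows)

# Hypothetical data: 1000 fifth graders, all born in one of two years.
students = [{"user_id": f"u{i}", "year": 2013 + (i % 2)} for i in range(1000)]

by_year = partition_sizes(students, key=lambda row: row["year"])
by_id = partition_sizes(students, key=lambda row: row["user_id"])

print("by year:", len(by_year), "partitions, largest holds", max(by_year.values()))
print("by id:  ", len(by_id), "partitions, largest holds", max(by_id.values()))
```

With year as the key, all the data lands in two large partitions no matter how many nodes the cluster has; with the user ID as the key, the rows can spread across the whole ring.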

Source: www.habr.com
