Cassandra. How not to die if you only know Oracle

Hello, Habr.

My name is Misha Butrimov, and I'd like to tell you a little about Cassandra. This story will be useful to anyone who has never dealt with NoSQL databases: Cassandra has plenty of implementation quirks and pitfalls you need to know about. And if you have seen nothing but Oracle or some other relational database, these things will save your life.

What is so good about Cassandra? It is a NoSQL database designed with no single point of failure, and it scales well. Need to add a couple of terabytes to some database? Just add nodes to the ring. Need to stretch it to another data center? Add nodes to the cluster. Need to handle more RPS? Add nodes to the cluster. It works in the opposite direction too.

What else is it good at? Handling a lot of requests. But how many is a lot? 10, 20, 30, 40 thousand requests per second is not a lot. 100 thousand write requests per second is not a lot either. There are companies that claim to sustain 2 million requests per second. We probably have to take their word for it.

And fundamentally, Cassandra has one big difference from relational databases: it is nothing like them. This is very important to remember.

Not everything that looks the same works the same

A colleague once came to me and asked: "Here is CQL, the Cassandra query language. It has a select statement, it has where, it has and. I write a query and it doesn't work. Why?" Treating Cassandra like a relational database is a perfect way to do yourself in violently. And I'm not promoting that, it's banned in Russia. You will simply design something wrong.

For example, a customer comes to us and says: "Let's build a database of TV series, or a recipe catalog. We'll have dishes in there, or a list of TV series and the actors in them." We happily say: "Let's go!" Just send a couple of bytes, a few characters, and you're done; everything will work very quickly and reliably. And all goes well until the customers come back and say that housewives are also solving the inverse problem: they have a list of products, and they want to know which dish they can cook. You're dead.

That is because Cassandra is a hybrid database: it is a key-value store and a wide-column store at the same time. In Java or Kotlin terms, it could be described like this:

Map<RowKey, SortedMap<ColumnKey, ColumnValue>>

That is, a map that contains a sorted map. The first key of this map is the Row key, also called the Partition key. The second key, which is the key of the already sorted inner map, is the Clustering key.
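The nested-map picture above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the `table` dict, the `put` helper, and the sample rows are invented for illustration and have nothing to do with how Cassandra actually stores data on disk.

```python
from bisect import bisect_left

# Toy model of Map<RowKey, SortedMap<ColumnKey, ColumnValue>>.
# Outer dict: partition key -> list of (clustering_key, value), kept sorted.
table = {}

def put(partition_key, clustering_key, value):
    """Insert or overwrite a value, keeping the inner 'map' sorted by clustering key."""
    rows = table.setdefault(partition_key, [])
    keys = [key for key, _ in rows]
    i = bisect_left(keys, clustering_key)
    if i < len(rows) and rows[i][0] == clustering_key:
        rows[i] = (clustering_key, value)      # same key: overwrite
    else:
        rows.insert(i, (clustering_key, value))

# Hypothetical data: partition key = year, clustering key = (salary, user_id).
put(1990, (50000.0, "u2"), "Bob")
put(1990, (30000.0, "u1"), "Alice")
put(1985, (70000.0, "u3"), "Carol")

# Rows inside one partition come back in clustering-key order,
# regardless of the order in which they were written.
print(table[1990])
```

The outer lookup is a plain hash lookup, while everything inside one partition is ordered, which is what the rest of the article relies on.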

To illustrate how the database is distributed, let's draw three nodes. Now we need to understand how to spread the data across the nodes. Because if we cram everything into one of them (by the way, there can be a thousand, two thousand, five thousand nodes, as many as you like), that is not really distribution. So we need a mathematical function that returns a number. Just a number, a long int that falls into some range. One node will be responsible for one range, the second for the second, the nth for the nth.

This number is obtained with a hash function applied to what we call the Partition key. This is the column named in the Primary key directive, and it is the column that becomes the first and most basic key of the map. It determines which node gets which data. A table is created in Cassandra with almost the same syntax as in SQL:

CREATE TABLE users (
	user_id uuid,
	name text,
	year int,
	salary float,
	PRIMARY KEY(user_id)
)

The Primary key in this case consists of a single column, and that column is also the partition key.

How will our users behave? Some will go to one node, some to another, and some to a third. The result is an ordinary hash table, also known as a map, also known as a dictionary in Python: a simple key-value structure from which we can read all the values, and read and write by key.
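Here is a minimal sketch of that placement rule: hash the partition key into a number, then see whose range the number falls into. The node names are hypothetical, and MD5 is used only as a convenient stand-in hash; it is not Cassandra's actual partitioner, and real clusters do not split the ring into three equal slices like this.

```python
import hashlib
import uuid

NODES = ["node-1", "node-2", "node-3"]   # hypothetical node names
RING_SIZE = 2 ** 64                      # tokens live in [0, 2**64) in this sketch

def token(partition_key: str) -> int:
    """Turn a partition key into 'just a number' (stand-in hash, not Cassandra's)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def node_for(partition_key: str) -> str:
    """Each node owns one contiguous range of tokens."""
    range_width = RING_SIZE // len(NODES)
    index = min(token(partition_key) // range_width, len(NODES) - 1)
    return NODES[index]

# Random user ids scatter across the nodes; the same id always lands on the same node.
for _ in range(5):
    user_id = str(uuid.uuid4())
    print(user_id, "->", node_for(user_id))
```

Because the node is a pure function of the key, a read by `user_id` can go straight to the owner without asking anyone else.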

Select: when allow filtering turns into a full scan, or what not to do

Let's write a select statement: select * from users where user_id = ... It turns out just like in Oracle: we write select, specify the conditions, and everything works, we get our users. But if you select, say, users with a certain year of birth, Cassandra complains that it cannot execute the query. It knows nothing about how the data is distributed with respect to year of birth: only one column is declared as the key. Then it says: "All right, I can still execute this query. Add allow filtering." We add the directive and everything works. And at that moment something terrible happens.

On test data everything is fine. But when you run the query in production, where we have, say, 4 million records, things are not so good. Allow filtering is a directive that lets Cassandra collect all the data of this table from all nodes and all data centers (if the cluster spans several), and only then filter it. This is the analogue of a Full Scan, and hardly anyone is happy with it.
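The difference in cost can be shown with a toy cluster. This is a sketch under obvious assumptions: the three dicts stand in for nodes, the rows are made up, and `owner_index` is handed in directly where a real cluster would derive it from the hash of the key.

```python
# Three hypothetical nodes, each holding part of the users table keyed by user_id.
nodes = [
    {"u1": {"name": "Alice", "year": 1990}},
    {"u2": {"name": "Bob", "year": 1985}},
    {"u3": {"name": "Carol", "year": 1990}},
]

def find_by_id(user_id, owner_index):
    """Lookup by partition key: ask one node, read one row."""
    row = nodes[owner_index].get(user_id)
    rows_read = 1
    return row, rows_read

def find_by_year(year):
    """What filtering on a non-key column amounts to: read everything, then filter."""
    rows_read = 0
    matches = []
    for node in nodes:                       # every node in the cluster
        for user_id, row in node.items():    # every row on that node
            rows_read += 1
            if row["year"] == year:
                matches.append(user_id)
    return matches, rows_read

print(find_by_id("u2", owner_index=1))
print(find_by_year(1990))
```

With three rows nobody notices. The work done by `find_by_year` grows with the size of the whole table, while `find_by_id` stays at one row no matter how large the table gets.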

If we only ever needed users by ID, this would suit us. But sometimes we need to write other queries and put other restrictions on the selection. So let's recall: this is all a map with a partition key, but inside it there is a sorted map.

And that inner map also has a key, which we call the Clustering key. This key, in turn, consists of the columns we choose, and with its help Cassandra knows how the data is physically sorted and where it will lie on each node. That is, for a given Partition key, the Clustering key tells it exactly how to push the data into this tree and which place it will take there.

It really is a tree. A comparator is simply invoked there, and we pass it a certain set of columns as an object; that set is declared as a list of columns.

CREATE TABLE users_by_year_salary_id (
	user_id uuid,
	name text,
	year int,
	salary float,
	PRIMARY KEY((year), salary, user_id)
)

Pay attention to the Primary key directive: its first argument (in our case, year) is always the Partition key. It can consist of one or more columns, it doesn't matter. If there are several columns, they must be wrapped in an extra pair of parentheses so that the language preprocessor understands that this is the Partition key, and that all the columns after it are the Clustering key. Those are passed to the comparator in the order in which they are listed. That is, the first column is the most significant, the second less significant, and so on. It is like writing a comparison for the fields of a data class: we list the fields and state which ones rank higher and which lower. In Cassandra these are, relatively speaking, the fields of a data class to which the comparison written for it will be applied.
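The "first column is the most significant" rule is the same lexicographic ordering that Python applies to tuples, so it can be illustrated without any database at all. The rows below are invented sample data for one partition, written as (salary, user_id) pairs.

```python
# Hypothetical rows of one partition, as (salary, user_id) clustering keys.
rows = [
    (50000.0, "b"),
    (30000.0, "z"),
    (50000.0, "a"),
]

# Tuples compare field by field, left to right:
# salary decides first, user_id only breaks ties between equal salaries.
ordered = sorted(rows)
for salary, user_id in ordered:
    print(salary, user_id)
```

Swapping the two columns in the key would give a completely different physical order, which is why the order of columns in the Primary key directive matters.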

Setting the sort order and applying restrictions

Remember that the sort order (descending, ascending, whatever) is set at the moment the key is created and cannot be changed later. It physically determines how the data will be sorted and how it will be stored. If you need to change the Clustering key or the sort order, you will have to create a new table and copy the data into it. It cannot be done with the existing one.

We filled our table with users and saw that they landed on the ring first by year of birth, and then, inside each node, by salary and user ID. Now we can select by applying restrictions.

Our working where and and are back, we get our users, and everything is fine again. But if we try to use only part of the Clustering key, and a less significant part at that, Cassandra will immediately complain that it cannot find the place in our map where this object lies, the one that has null in some of the comparator's fields and a set value in another. It would have to pull all the data from this node and filter it. And this is the analogue of a Full Scan within a node, which is bad.
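Why the leading part of the Clustering key works and a trailing part does not can be seen on a sorted list. The sketch below assumes one hypothetical partition (year = 1990) with rows sorted by (salary, user_id); the helper names are made up for the example.

```python
from bisect import bisect_left, bisect_right

# One partition, rows physically sorted by (salary, user_id).
partition = [
    (30000.0, "u7"),
    (50000.0, "u1"),
    (50000.0, "u4"),
    (90000.0, "u2"),
]
salaries = [salary for salary, _ in partition]   # sorted, because the rows are

def by_salary(salary):
    """Leading clustering column known: binary search finds one contiguous slice."""
    lo = bisect_left(salaries, salary)
    hi = bisect_right(salaries, salary)
    return partition[lo:hi]

def by_user_id(user_id):
    """Only the trailing column known: the sort order is no help, so scan every row."""
    return [row for row in partition if row[1] == user_id]

print(by_salary(50000.0))
print(by_user_id("u2"))
```

Rows with the same salary sit next to each other, so `by_salary` can jump to them. Rows with the same `user_id` could be anywhere in the partition, so the only option is to look at all of them.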

In any unclear situation, create a new table

If we want to be able to look up users by ID, or by age, or by salary, what should we do? Nothing special. Just use two tables. If you need to reach users in three different ways, there will be three tables. Gone are the days when we saved space on the hard drive. It is the cheapest resource there is. It costs far less than response time, which can be critical for the user. It is much nicer for the user to get something in a second than in 10 seconds.
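"One table per way of reading" means the application writes the same row several times. Here is a rough sketch of that pattern with two in-memory "tables"; the names `users_by_id`, `users_by_year`, and `save_user` are illustrative assumptions, not part of any Cassandra driver.

```python
users_by_id = {}      # partition key: user_id
users_by_year = {}    # partition key: year, clustering key: (salary, user_id)

def save_user(user_id, name, year, salary):
    """Write the same user into every table that serves a query."""
    row = {"user_id": user_id, "name": name, "year": year, "salary": salary}
    users_by_id[user_id] = row
    users_by_year.setdefault(year, {})[(salary, user_id)] = row

def get_user(user_id):
    return users_by_id.get(user_id)

def users_born_in(year):
    """Rows of one partition, returned in clustering-key order."""
    partition = users_by_year.get(year, {})
    return [partition[key] for key in sorted(partition)]

save_user("u1", "Alice", 1990, 50000.0)
save_user("u2", "Bob", 1990, 30000.0)
save_user("u3", "Carol", 1985, 70000.0)

print(get_user("u3")["name"])
print([user["name"] for user in users_born_in(1990)])
```

Each read touches exactly one table and one partition. The price is the duplicated data and the extra write, which is the trade the article argues for.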

We trade extra space and denormalized data for the ability to scale well and work reliably. After all, a cluster made up of three data centers, each with five nodes, with an acceptable level of data safety (when nothing is lost), can survive the complete death of one data center. And of two more nodes in each of the two remaining ones. Only after that do the problems begin. That is pretty good redundancy, and it is worth a couple of extra SSD drives and processors. So, to use Cassandra, which is not SQL at all and has no relations or foreign keys, you need to know a few simple rules.

We design everything starting from the query. The main thing is not the data but how the application is going to work with it. If it needs to get different data in different ways, or the same data in different ways, we must lay it out in the way that is convenient for the application. Otherwise we will end up in a Full Scan, and Cassandra will give us no advantage.

Denormalizing data is the norm. We forget about normal forms; we no longer have a relational database. If we store something 100 times, it will lie there 100 times. That is still cheaper than stalling.

We choose partition keys so that they are evenly distributed. We don't want the hashes of our keys to fall into one narrow range. That is, year of birth in the example above is a bad choice. More precisely, it is fine if our users are spread normally across years of birth, and bad if we are talking about fifth-grade students: the partitioning there will not be very good.
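The fifth-graders problem can be made visible by counting where rows land. This is a sketch with invented data: 1000 hypothetical students born in one of two years, three nodes, and a simple hash-modulo placement that stands in for the real token ring.

```python
import hashlib
from collections import Counter

def node_index(key, node_count=3):
    """Simplified placement: hash of the partition key modulo the number of nodes."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count

# Hypothetical fifth graders: everyone was born in 2013 or 2014.
students = [(f"student-{i}", 2013 + i % 2) for i in range(1000)]

rows_per_node_by_year = Counter(node_index(year) for _, year in students)
rows_per_node_by_id = Counter(node_index(student_id) for student_id, _ in students)

print("partitioned by year:", dict(rows_per_node_by_year))
print("partitioned by id:  ", dict(rows_per_node_by_id))
```

With year as the partition key there are only two distinct keys, so at most two nodes ever receive data, and each of them holds a partition of 500 rows. With the student id as the key, the same 1000 rows spread over all the nodes.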

The sort order is chosen once, when the Clustering key is created. If it needs to change, we will have to reload our table with a different key.

And the most important thing: if we need to retrieve the same data in 100 different ways, we will have 100 different tables.

Source: www.habr.com
