NewSQL = NoSQL+ACID

Until recently, Odnoklassniki stored about 50 TB of data processed in real time in SQL Server. For such a volume, it is almost impossible to provide fast and reliable access that also survives a data center failure using an SQL DBMS. Usually in such cases one of the NoSQL stores is used, but not everything can be moved to NoSQL: some entities require ACID transaction guarantees.

This led us to a NewSQL store, that is, a DBMS that provides the fault tolerance, scalability, and performance of NoSQL systems while keeping the ACID guarantees familiar from classical systems. There are few working industrial systems of this new class, so we implemented such a system ourselves and put it into production.

How it works and what came of it: read on below the cut.

Today, Odnoklassniki's monthly audience is more than 70 million unique visitors. We are among the five largest social networks in the world and among the twenty sites on which users spend the most time. The OK infrastructure handles very high loads: more than a million HTTP requests per second at the front ends. Parts of the server fleet of more than 8,000 machines are located close to each other, in four Moscow data centers, which keeps network latency between them below 1 ms.

We have used Cassandra since 2010, starting with version 0.6. Today a dozen clusters are in operation. The fastest cluster handles more than 4 million operations per second, and the largest stores 260 TB.

However, all of these are ordinary NoSQL clusters used to store loosely coordinated data. We wanted to replace the main consistent store, Microsoft SQL Server, which had been in use since Odnoklassniki was founded. The store consisted of more than 300 SQL Server Standard Edition machines holding 50 TB of data: business entities. This data is modified within ACID transactions and requires high consistency.

To distribute data across SQL Server nodes, we used both vertical and horizontal partitioning (sharding). Historically we used a simple sharding scheme: each entity was associated with a token, a function of the entity ID. Entities with the same token were placed on the same SQL server. The master-detail relationship was implemented so that the tokens of the parent and child records always matched and lived on the same server. In a social network almost all records are generated on behalf of a user, which means that all of a user's data within one functional subsystem is stored on one server. That is, a business transaction almost always involved tables from a single SQL server, which made it possible to ensure data consistency using local ACID transactions, without resorting to slow and unreliable distributed ACID transactions.
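The token scheme described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the modulo token function, the token count, and the shard names are hypothetical and are not taken from the actual Odnoklassniki code.

```python
# Hypothetical illustration of token-based sharding. The token function
# (a simple modulo) and the shard names are assumptions, not the real scheme.
NUM_TOKENS = 256
SHARDS = ["sql-01", "sql-02", "sql-03", "sql-04"]


def token_for(owner_id: int) -> int:
    """The token is a function of the entity ID (here: the owning user's ID)."""
    return owner_id % NUM_TOKENS


def shard_for(token: int) -> str:
    """Entities with the same token always land on the same SQL server."""
    return SHARDS[token % len(SHARDS)]


# An album and its photos derive their token from the same owner ID, so the
# parent and child records end up on the same server and can be changed in
# one local ACID transaction.
album = {"id": 9001, "owner": 42}
photo = {"id": 777123, "album": 9001, "owner": 42}
album_shard = shard_for(token_for(album["owner"]))
photo_shard = shard_for(token_for(photo["owner"]))
print(album_shard, photo_shard)  # sql-03 sql-03
```

Because both records map to the same shard, no distributed transaction is needed to update them together.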

Because of sharding, and to speed up SQL:

  • We do not use foreign key constraints, since with sharding the referenced entity ID may live on another server.
  • We do not use stored procedures or triggers because of the extra load on the DBMS CPU.
  • We do not use JOINs because of all of the above and the many random disk reads they cause.
  • Outside of transactions, we use the Read Uncommitted isolation level to reduce deadlocks.
  • We run only short transactions (on average under 100 ms).
  • We do not use multi-row UPDATE and DELETE because of the large number of deadlocks; we update only one record at a time.
  • We always query by index only; for us, a query with a full table scan plan means an overloaded database and leads to failure.

These measures let us squeeze almost maximum performance out of the SQL servers. However, the problems kept growing. Let's look at them.

Problems with SQL

  • Since we used hand-written sharding, new shards were added manually by administrators. During all that time, the data replicas being scaled out did not serve requests.
  • As the number of records in a table grows, inserts and modifications slow down; adding indexes to an existing table slows them down several times over; creating and rebuilding indexes requires downtime.
  • Having a small amount of Windows for SQL Server in production makes the infrastructure harder to manage.

But the main problem is

Fault tolerance

A classical SQL server has poor fault tolerance. Suppose you have just one database server, and it fails once every three years. During that time the site is down for 20 minutes, which is acceptable. If you have 64 servers, the site is down once every three weeks. And if you have 200 servers, the site is down every week. That is a problem.
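The arithmetic behind these estimates can be checked with a few lines. The sketch assumes that servers fail independently and that each fails once every three years on average; both are simplifications made for illustration.

```python
# Back-of-the-envelope check of the figures above, assuming independent
# failures and one failure per server every three years.
WEEKS_PER_YEAR = 52
MTBF_WEEKS = 3 * WEEKS_PER_YEAR  # one server: one failure per 156 weeks


def weeks_between_failures(servers: int) -> float:
    """Expected time between failures somewhere in a fleet of `servers`."""
    return MTBF_WEEKS / servers


for n in (1, 64, 200):
    print(f"{n:>3} servers: a failure every {weeks_between_failures(n):.1f} weeks")
```

With 64 servers this gives a failure roughly every two to three weeks, and with 200 servers about one per week, which matches the orders of magnitude quoted above.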

What can be done to improve the fault tolerance of an SQL server? Wikipedia suggests building a highly available cluster, in which there is a backup for any component that fails.

This requires a fleet of expensive equipment: extensive redundancy, optical fiber, shared storage. And even then failover does not work reliably: about 10% of switchovers ended with the backup node failing right behind the main one, like cars of a train.

But the main drawback of such a highly available cluster is zero availability if the data center that hosts it fails. Odnoklassniki has four data centers, and we need to keep working when any one of them fails completely.

For this we could use the Multi-Master replication built into SQL Server. This solution is much more expensive because of software costs and suffers from well-known replication problems: unpredictable transaction delays with synchronous replication, and delays in applying replicated changes (and, as a result, lost modifications) with asynchronous replication. The implied manual conflict resolution makes this option completely unsuitable for us.

All these problems called for a radical solution, and we started analyzing them in detail. Here we first need to look at what SQL Server mainly does: transactions.

A simple transaction

Let's consider the simplest transaction from an applied SQL programmer's point of view: adding a photo to an album. Albums and photos are stored in different tables. An album has a counter of public photos. Such a transaction breaks down into the following steps:

  1. Lock the album by key.
  2. Create a record in the photo table.
  3. If the photo has public status, increment the album's public photo counter, update the record, and commit the transaction.

Or in pseudocode:

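// Open a transaction scoped to the album (pseudocode)
TX.start("Albums", id);
Album album = albums.lock(id);     // 1. lock the album by key
Photo photo = photos.create(…);    // 2. create the photo record

if (photo.status == PUBLIC) {
    album.incPublicPhotosCount();  // 3. count public photos only
}
album.update();

TX.commit();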

We can see that the most common scenario for a business transaction is to read data from the database into the application server's memory, change something, and save the new values back to the database. Usually such a transaction updates several entities in several tables.

While a transaction is running, the same data may be modified concurrently by another system. For example, Antispam may decide that a user is somehow suspicious and that all of the user's photos should therefore no longer be public: they must be sent for moderation, which means changing photo.status to some other value and decrementing the corresponding counters. Obviously, if this operation happens without guarantees of atomic application and isolation from competing modifications, as in ACID, the result will not be what is required: either the photo counter will show the wrong value, or not all photos will be sent for moderation.

A lot of similar code, manipulating different business entities within a single transaction, has been written over the whole lifetime of Odnoklassniki. From our experience of migrating to NoSQL with eventual consistency, we know that the biggest challenge (and time investment) is developing code to maintain data consistency. Therefore, we considered the main requirement for the new store to be support for real ACID transactions for the application logic.

Other, no less important, requirements were:

  • If a data center fails, both reads and writes to the new store must remain available.
  • Maintain the current speed of development. That is, when working with the new store the amount of code should stay about the same; there should be no need to bolt anything onto the store, develop conflict-resolution algorithms, maintain secondary indexes, and so on.
  • The new store must be fast, both when reading data and when processing transactions, which effectively ruled out academically rigorous, universal, but slow solutions such as two-phase commit.
  • Automatic scaling on the fly.
  • Use of ordinary cheap servers, with no need to buy exotic hardware.
  • The ability for the company's own developers to evolve the store. In other words, priority was given to in-house or open-source solutions, preferably in Java.

Decisions, decisions

Analyzing possible solutions, we arrived at two possible architectures.

The first is to take any SQL server and implement the required fault tolerance, scaling mechanism, failover cluster, conflict resolution, and distributed, reliable, fast ACID transactions. We judged this option to be highly non-trivial and labor-intensive.

The second option is to take a ready-made NoSQL store that already implements scaling, a failover cluster, and conflict resolution, and to implement transactions and SQL ourselves. At first glance, even the task of implementing SQL, not to mention ACID transactions, looks like a job that would take years. But then we realized that the set of SQL features we use in practice is as far from ANSI SQL as Cassandra CQL is from ANSI SQL. Looking at CQL more closely, we realized it was quite close to what we needed.

Cassandra and CQL

So, what is interesting about Cassandra, and what capabilities does it have?

First, you can create tables that support various data types, and you can SELECT or UPDATE by primary key.

CREATE TABLE photos (id bigint PRIMARY KEY, owner bigint,…);
SELECT * FROM photos WHERE id=?;
UPDATE photos SET … WHERE id=?;

To ensure that replicas are consistent, Cassandra uses a quorum approach. In the simplest case, this means that when three replicas of a row are placed on different nodes of the cluster, a write is considered successful if a majority of the nodes (that is, two out of three) have confirmed it. The row data is considered consistent if, on read, a majority of the nodes were polled and confirmed it. Thus, with three replicas, complete and immediate data consistency is guaranteed when one node fails. This approach let us implement an even more reliable scheme: always send requests to all three replicas and wait for a response from the two fastest. The late response of the third replica is discarded in this case. A node that is late to respond may have serious problems: slowdowns, garbage collection in the JVM, direct memory reclaim in the Linux kernel, hardware failure, network disconnection. However, none of this affects the client's operations or data in any way.

The approach in which we contact three nodes and accept answers from two is called speculation: the request to the extra replica is sent before anything has even "failed".
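A small model may help to see why speculation hides a slow or dead replica. This is an illustrative simulation rather than Cassandra client code: the function names are hypothetical and the latencies are made-up numbers.

```python
# Illustrative model of quorum reads/writes with speculation. Latencies are
# made-up numbers in milliseconds; None means the replica never answered.
from typing import Optional, Sequence


def quorum_size(replication_factor: int) -> int:
    """Majority of replicas: 2 of 3, 3 of 5, and so on."""
    return replication_factor // 2 + 1


def speculative_latency(latencies: Sequence[Optional[float]]) -> Optional[float]:
    """Send the request to every replica at once and return the time at which
    a quorum of answers has arrived, or None if a quorum is unreachable."""
    needed = quorum_size(len(latencies))
    answered = sorted(t for t in latencies if t is not None)
    if len(answered) < needed:
        return None
    return answered[needed - 1]  # the slowest answer we actually wait for


# One replica is stuck in a long GC pause: the client does not notice.
print(speculative_latency([1.2, 950.0, 1.5]))   # 1.5
# One replica is down entirely: still a quorum of two.
print(speculative_latency([1.2, None, 1.5]))    # 1.5
# Two replicas down: no quorum, the operation fails.
print(speculative_latency([1.2, None, None]))   # None
```

The client-visible latency is set by the second-fastest replica, so a single straggler never shows up in response times.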

Another advantage of Cassandra is Batchlog, a mechanism which guarantees that a batch of changes you make is either applied completely or not applied at all. This gives us the A in ACID, atomicity, out of the box.

The closest thing to transactions in Cassandra is the so-called "lightweight transactions". But they are far from "real" ACID transactions: in fact, this is the ability to do CAS on the data of a single record only, using consensus over the heavyweight Paxos protocol. Therefore, such transactions are slow.

What we were missing in Cassandra

So, we had to implement real ACID transactions in Cassandra. With them we could easily implement two other convenient features of a classical DBMS: consistent fast indexes, which would let us select data not only by primary key, and the usual generator of monotonic auto-increment IDs.

C*One

Thus a new DBMS, C*One, was born, consisting of three types of server nodes:

  • Storage nodes - (almost) standard Cassandra servers responsible for storing data on local disks. As the load and data volume grow, their number can easily be scaled to tens and hundreds.
  • Transaction coordinators - ensure that transactions are executed.
  • Clients - application servers that implement business operations and initiate transactions. There can be thousands of such clients.

Servers of all types are part of a common cluster, use Cassandra's internal messaging protocol to communicate with each other, and use gossip to exchange cluster information. With Heartbeat, servers learn about each other's failures and maintain a single data schema: tables, their structure and replication, the partitioning scheme, the cluster topology, and so on.

Clients

Instead of standard drivers, the Fat Client mode is used. Such a node does not store data but can act as a coordinator of request execution; that is, the client itself coordinates its own requests: it queries the storage replicas and resolves conflicts. This is not only more reliable and faster than a standard driver, which has to talk to a remote coordinator, but also lets you control how requests are sent. Outside of a transaction opened on the client, requests are sent to the storage nodes. If the client has opened a transaction, all requests within the transaction are sent to the transaction coordinator.

C*One transaction coordinator

The coordinator is something we implemented for C*One from scratch. It is responsible for managing transactions, locks, and the order in which transactions are applied.

For each transaction it serves, the coordinator generates a timestamp, and each subsequent transaction gets a timestamp greater than that of the previous one. Since Cassandra's conflict resolution is based on timestamps (of two conflicting writes, the one with the later timestamp is considered current), a conflict is always resolved in favor of the later transaction. In this way we implemented a Lamport clock, a cheap way to resolve conflicts in a distributed system.
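The idea can be sketched as a generator of strictly increasing timestamps combined with a last-write-wins rule. The class and function names below are hypothetical and the code is single-threaded for brevity; it is not the C*One implementation.

```python
import time


class TimestampGenerator:
    """Hands out strictly increasing timestamps (microseconds), even if the
    wall clock stalls or steps backwards. Hypothetical helper, not C*One code."""

    def __init__(self) -> None:
        self._last = 0

    def next(self) -> int:
        now = time.time_ns() // 1_000
        self._last = max(now, self._last + 1)
        return self._last


def resolve(write_a: tuple, write_b: tuple) -> tuple:
    """Last-write-wins: of two (timestamp, value) writes keep the later one."""
    return write_a if write_a[0] >= write_b[0] else write_b


clock = TimestampGenerator()
tx1 = (clock.next(), "status=PUBLIC")
tx2 = (clock.next(), "status=MODERATION")
print(resolve(tx1, tx2)[1])  # status=MODERATION
print(resolve(tx2, tx1)[1])  # status=MODERATION
```

Whatever order the two writes reach a replica in, the later transaction wins, so replicas converge without any extra coordination.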

Locks

To ensure isolation, we decided to use the simplest approach: pessimistic locks on the record's primary key. In other words, within a transaction a record must first be locked, and only then read, modified, and saved. Only after a successful commit can the record be unlocked so that competing transactions can use it.

Implementing such locking is simple in a non-distributed environment. In a distributed system there are two main options: either implement distributed locking across the cluster, or distribute transactions so that transactions involving the same record are always served by the same coordinator.

Since in our case the data was already distributed into groups of local transactions in SQL, we decided to assign local transaction groups to coordinators: one coordinator handles all transactions with tokens from 0 to 9, a second one those with tokens from 10 to 19, and so on. As a result, each coordinator instance becomes the master of a transaction group.

The locks can then be implemented as a trivial HashMap in the coordinator's memory.
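A minimal sketch of both ideas, the token-range assignment and the in-memory lock table, might look like this. The names are hypothetical, and the sketch simply refuses a contended lock, whereas a real coordinator would more likely make the second transaction wait.

```python
from typing import Dict, Hashable

GROUP_WIDTH = 10  # tokens 0-9 -> group 0, tokens 10-19 -> group 1, ...


def group_for(token: int) -> int:
    """Transaction group (and thus coordinator) responsible for a token."""
    return token // GROUP_WIDTH


class LockTable:
    """Pessimistic per-record locks kept in the coordinator's memory."""

    def __init__(self) -> None:
        self._owners: Dict[Hashable, int] = {}  # primary key -> transaction id

    def try_lock(self, key: Hashable, tx_id: int) -> bool:
        """Lock `key` for `tx_id`; False if another transaction holds it."""
        owner = self._owners.setdefault(key, tx_id)
        return owner == tx_id

    def unlock_all(self, tx_id: int) -> None:
        """Release every lock held by `tx_id` (after commit or rollback)."""
        for key in [k for k, owner in self._owners.items() if owner == tx_id]:
            del self._owners[key]


locks = LockTable()
print(group_for(7), group_for(13))              # 0 1
print(locks.try_lock(("albums", 42), tx_id=1))  # True
print(locks.try_lock(("albums", 42), tx_id=2))  # False: held by tx 1
locks.unlock_all(tx_id=1)
print(locks.try_lock(("albums", 42), tx_id=2))  # True
```

Since every transaction for a given token goes to the same coordinator, a plain dictionary is enough and no cluster-wide lock service is required.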

Coordinator failures

Since one coordinator exclusively serves a group of transactions, it is very important to detect its failure quickly, so that a retry of the transaction still fits within the timeout. To make this fast and reliable, we used a fully connected quorum heartbeat protocol.

Each data center hosts at least two coordinator nodes. Periodically, each coordinator sends a heartbeat message to the other coordinators, informing them about its own functioning and about which heartbeat messages it last received from which coordinators in the cluster.

Receiving the same kind of information from the others as part of their heartbeat messages, each coordinator decides for itself which cluster nodes are working and which are not, guided by the quorum principle: if node X has received information from a majority of the nodes in the cluster that messages from node Y are arriving normally, then Y is working. Conversely, as soon as a majority reports missing messages from node Y, Y is considered failed. Curiously, if the quorum informs node X that it no longer receives messages from it, node X will consider itself failed.

Heartbeat messages are sent at a high frequency, about 20 times per second, with a period of 50 ms. In Java, it is hard to guarantee that an application responds within 50 ms because of comparably long pauses caused by the garbage collector. We were able to achieve this response time using the G1 garbage collector, which lets you specify a target for the duration of GC pauses. However, sometimes, quite rarely, collector pauses exceed 50 ms, which can lead to a false failure detection. To prevent this, the coordinator does not report the failure of a remote node when the first heartbeat message from it is lost, but only when several are lost in a row.
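The two rules above, tolerating a few missed heartbeats and deciding by quorum, can be sketched as follows. The threshold of three misses is an assumed value (the text only says "several"), and the class and function names are hypothetical.

```python
from typing import Dict, Set

MISSES_BEFORE_SUSPECT = 3  # assumed value; the text only says "several"


class PeerView:
    """What one coordinator observes about its peers, period by period."""

    def __init__(self, peers: Set[str]) -> None:
        self.missed: Dict[str, int] = {p: 0 for p in peers}

    def end_of_period(self, heard_from: Set[str]) -> None:
        """Call once per heartbeat period with the peers heard from."""
        for peer in self.missed:
            self.missed[peer] = 0 if peer in heard_from else self.missed[peer] + 1

    def hears(self) -> Set[str]:
        """Peers not yet suspected: fewer than the threshold missed in a row."""
        return {p for p, n in self.missed.items() if n < MISSES_BEFORE_SUSPECT}


def is_alive(node: str, reports: Dict[str, Set[str]], cluster_size: int) -> bool:
    """Quorum verdict: `node` is alive if a majority of the cluster hears it.
    `reports` maps each reporting node to the set of peers it hears."""
    votes = sum(1 for reporter, heard in reports.items()
                if reporter != node and node in heard)
    return votes >= cluster_size // 2 + 1


view = PeerView({"B", "C"})
view.end_of_period({"B"})        # C missed once: could be just a GC pause
print(sorted(view.hears()))      # ['B', 'C']
view.end_of_period({"B"})
view.end_of_period({"B"})        # C missed three times in a row
print(sorted(view.hears()))      # ['B']

reports = {"A": {"B"}, "B": {"A"}, "C": {"A", "B"}}
print(is_alive("B", reports, cluster_size=3))  # True: A and C hear B
print(is_alive("C", reports, cluster_size=3))  # False: nobody hears C
```

In the last case node C still hears everyone else, yet it learns from the quorum that nobody hears it and so must treat itself as failed.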

But it is not enough to quickly understand which node has stopped working. Something has to be done about it.

Reservation

The classical scheme, when a master fails, is to start a new election using one of the fashionable universal algorithms. However, such algorithms have well-known problems with convergence time and with the duration of the election process itself. We managed to avoid such extra delays by using a coordinator replacement scheme in a fully connected network:

Suppose we want to execute a transaction in group 50. Let's determine the replacement scheme in advance, that is, which nodes will execute group 50 transactions if the main coordinator fails. Our goal is to keep the system working when a data center fails. Let's define that the first reserve will be a node from another data center, and the second reserve a node from a third one. This scheme is chosen once and does not change until the cluster topology changes, that is, until new nodes join it (which happens very rarely). The procedure for choosing a new active master when the old one fails is always the same: the first reserve becomes the active master, and if it has stopped working, the second reserve becomes active.

This scheme is more reliable than a universal algorithm, since to activate a new master it is enough to detect the failure of the old one.
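The fixed replacement order can be expressed in a few lines. The node names and the table below are hypothetical examples; only the rule "take the first candidate that has not failed" comes from the description above.

```python
from typing import Dict, List, Optional, Set

# Assumed example: the replacement order for group 50 spans three data centers.
SUCCESSION: Dict[int, List[str]] = {
    50: ["dc1-coord-a", "dc2-coord-a", "dc3-coord-a"],  # master, reserve 1, reserve 2
}


def active_master(group: int, failed: Set[str]) -> Optional[str]:
    """First node in the fixed succession order that is not known to be failed."""
    for node in SUCCESSION[group]:
        if node not in failed:
            return node
    return None  # every candidate is down


print(active_master(50, failed=set()))                           # dc1-coord-a
print(active_master(50, failed={"dc1-coord-a"}))                 # dc2-coord-a
print(active_master(50, failed={"dc1-coord-a", "dc2-coord-a"}))  # dc3-coord-a
```

Every node evaluates the same deterministic rule over the same failure information, so no election round is needed.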

But how will clients know which master is active now? It is impossible to send information to thousands of clients within 50 ms. A client may send a request to open a transaction without yet knowing that this master no longer works, and the request will time out. To prevent this, clients speculatively send the request to open a transaction to the group master and both of its reserves at once, but only the one that is the active master at that moment answers the request. The client performs all subsequent communication within the transaction only with the active master.

Reserve masters place the requests they receive for transactions that are not theirs into a queue of unborn transactions, where they are kept for some time. If the active master dies, the new master processes the open-transaction requests from its queue and responds to the client. If the client has already opened a transaction with the old master, the second response is ignored (and, obviously, such a transaction will not complete and will be retried by the client).
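The interplay between the speculative open request and the queue of unborn transactions can be modelled in memory. Everything here is a simplified assumption: the class is hypothetical, there is no network, and parked requests never expire, although the text says they are kept only for some time.

```python
from collections import deque
from typing import Deque, List, Optional


class Coordinator:
    def __init__(self, name: str, active: bool = False) -> None:
        self.name = name
        self.active = active
        self.unborn: Deque[str] = deque()  # open requests parked on a reserve

    def open_tx(self, request_id: str) -> Optional[str]:
        """The active master answers at once; a reserve parks the request."""
        if self.active:
            return f"{request_id} opened by {self.name}"
        self.unborn.append(request_id)
        return None

    def promote(self) -> List[str]:
        """Become the active master and answer every parked request."""
        self.active = True
        replies = [f"{rid} opened by {self.name}" for rid in self.unborn]
        self.unborn.clear()
        return replies


def speculative_open(request_id: str, nodes: List[Coordinator]) -> List[str]:
    """Send the open request to the master and all reserves at once."""
    replies = [node.open_tx(request_id) for node in nodes]
    return [r for r in replies if r is not None]


master = Coordinator("m", active=True)
reserve1, reserve2 = Coordinator("r1"), Coordinator("r2")
print(speculative_open("tx-1", [master, reserve1, reserve2]))  # ['tx-1 opened by m']

# The master dies before answering tx-2; only the reserves receive it.
print(speculative_open("tx-2", [reserve1, reserve2]))          # []
late_replies = reserve1.promote()
print(late_replies)
```

After promotion the first reserve answers both parked requests. The client simply ignores the reply for tx-1, which it had already opened with the old master, and proceeds with tx-2.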

How a transaction works

Suppose a client sends the coordinator a request to open a transaction for such-and-such an entity with such-and-such a primary key. The coordinator locks this entity and places it in the lock table in memory. If necessary, the coordinator reads this entity from storage and keeps the data it gets in the transaction state in the coordinator's memory.

When a client wants to change data in a transaction, it sends the coordinator a request to modify the entity, and the coordinator places the new data in the transaction state table in memory. That completes the write: nothing is written to storage.

When a client requests its own modified data as part of an active transaction, the coordinator acts as follows:

  • if the ID is already in the transaction, the data is taken from memory;
  • if the ID is not in memory, the missing data is read from the storage nodes, merged with what is already in memory, and the result is returned to the client.

Thus, a client can read its own changes, but other clients do not see these changes, because they are stored only in the coordinator's memory; they are not yet on the Cassandra nodes.

When the client sends a commit, the coordinator takes the state it held in memory and sends it as a logged batch to the Cassandra storage nodes. The storage nodes do everything necessary to ensure that this batch is applied atomically (in full) and return a response to the coordinator, which releases the locks and confirms the success of the transaction to the client.

And to roll back, the coordinator only needs to free the memory occupied by the transaction state.
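The transaction life cycle described in this section can be sketched with an in-memory model. This is an illustration under stated assumptions: a plain dictionary stands in for the Cassandra nodes, locking is left out, and the commit loop only imitates the logged batch, which is what actually provides atomicity in the real system. All names are hypothetical.

```python
from typing import Any, Dict, Optional

storage: Dict[Any, Dict[str, Any]] = {}  # stand-in for the Cassandra nodes


class Transaction:
    """Coordinator-side state of one transaction (locking omitted here)."""

    def __init__(self) -> None:
        self.pending: Dict[Any, Dict[str, Any]] = {}  # key -> changed columns

    def write(self, key: Any, **columns: Any) -> None:
        """Buffer the change in memory; storage is not touched."""
        self.pending.setdefault(key, {}).update(columns)

    def read(self, key: Any) -> Optional[Dict[str, Any]]:
        """Stored row overlaid with this transaction's own pending changes."""
        row = {**storage.get(key, {}), **self.pending.get(key, {})}
        return row or None

    def commit(self) -> None:
        """Apply every buffered change as one batch, then forget the state."""
        for key, columns in self.pending.items():
            storage.setdefault(key, {}).update(columns)
        self.pending.clear()

    def rollback(self) -> None:
        """Nothing was written to storage, so dropping the buffer is enough."""
        self.pending.clear()


storage[("albums", 42)] = {"public_photos": 3}
tx = Transaction()
tx.write(("albums", 42), public_photos=4)
print(tx.read(("albums", 42)))   # {'public_photos': 4}  (own change visible)
print(storage[("albums", 42)])   # {'public_photos': 3}  (others see old data)
tx.commit()
print(storage[("albums", 42)])   # {'public_photos': 4}
```

Keeping uncommitted changes only in the coordinator's memory is what makes rollback essentially free in this design.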

As a result of the above, we implemented the ACID principles:

  • Atomicity. This is the guarantee that no transaction is recorded in the system partially: either all of its sub-operations are performed, or none are. We adhere to this principle through logged batches in Cassandra.
  • Consistency. Every successful transaction, by definition, records only valid results. If, after opening a transaction and performing part of the operations, the result turns out to be invalid, a rollback is performed.
  • Isolation. When a transaction executes, concurrent transactions must not affect its result. Competing transactions are isolated using pessimistic locks on the coordinator. For reads outside a transaction, isolation is observed at the Read Committed level.
  • Durability. Regardless of problems at lower levels (a system blackout, a hardware failure), changes made by a successfully completed transaction must remain saved after operation resumes.

Reading by index

Let's take a simple table:

CREATE TABLE photos (
id bigint primary key,
owner bigint,
modified timestamp,
…)

It has an ID (primary key), an owner, and a modification date. We need to make a very simple query: select the data for an owner with a modification date "within the last day".

SELECT * FROM photos
WHERE owner=?
AND modified>?

For such a query to run fast, in a classical SQL DBMS you need to build an index on the columns (owner, modified). We can do this quite simply, since we now have ACID guarantees!

Indexes in C*One

There is a source table with photos in which the record ID is the primary key.

For an index, C*One creates a new table that is a copy of the original. Its key matches the index expression and also includes the primary key of the record from the source table:

Now the query for "owner within the last day" can be rewritten as a select from another table:

SELECT * FROM i1_test
WHERE owner=?
AND modified>?

Consistency of the data in the source table photos and the index table i1 is maintained automatically by the coordinator. Based on the data schema alone, when a change arrives, the coordinator generates and stores the change not only in the main table but also in the copies. No additional actions are performed on the index table, no logs are read, and no locks are used. That is, adding indexes consumes almost no resources and has practically no effect on the speed of applying modifications.
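A toy model shows how an index table keyed by (owner, modified, id) answers the query without touching the base table. It is a sketch under stated assumptions: in-memory structures replace Cassandra tables, the index stores only keys rather than a full copy of the row, and the way the stale entry is removed on update is a guess at what the coordinator does with the old row it already holds.

```python
import bisect
from typing import Dict, List, Tuple

photos: Dict[int, dict] = {}           # base table: id -> row
i1: List[Tuple[int, int, int]] = []    # index table, sorted by (owner, modified, id)


def commit_photo(photo_id: int, owner: int, modified: int) -> None:
    """Write the base row and its index entry together (one batch in C*One)."""
    old = photos.get(photo_id)
    if old is not None:  # drop the stale index entry for the previous version
        i1.remove((old["owner"], old["modified"], photo_id))
    photos[photo_id] = {"owner": owner, "modified": modified}
    bisect.insort(i1, (owner, modified, photo_id))


def modified_since(owner: int, since: int) -> List[int]:
    """SELECT ... WHERE owner=? AND modified>? answered from the index only."""
    start = bisect.bisect_right(i1, (owner, since, float("inf")))
    end = bisect.bisect_left(i1, (owner + 1,))
    return [photo_id for _, _, photo_id in i1[start:end]]


commit_photo(1, owner=7, modified=100)
commit_photo(2, owner=7, modified=200)
commit_photo(3, owner=8, modified=300)
commit_photo(1, owner=7, modified=250)     # photo 1 edited again
print(modified_since(owner=7, since=150))  # [2, 1]
```

The range lookup is a pair of binary searches over the sorted keys, which mirrors how a clustering key of (modified, id) under a partition key of owner would be scanned.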

Using ACID, we were able to implement indexes "as in SQL". They are consistent, scalable, fast, can be composite, and are built into the CQL query language. No changes to application code are needed to support indexes. Everything is as simple as in SQL. And most importantly, indexes do not affect the speed of modifications to the original transactional table.

What came of it

We developed C*One three years ago and launched it into production.

What did we get in the end? Let's evaluate this using the example of the photo processing and storage subsystem, one of the most important types of data in a social network. We are not talking about the bodies of the photos themselves, but about all kinds of meta-information. Odnoklassniki now has about 20 billion such records; the system processes 80 thousand read requests per second and up to 8 thousand ACID transactions per second associated with data modification.

When we used SQL with replication factor = 1 (but in RAID 10), the photo meta-information was stored on a highly available cluster of 32 machines running Microsoft SQL Server (plus 11 reserve machines). Another 10 servers were allocated for storing backups. In total, 50 expensive machines. At the same time, the system was running at rated load, with no headroom.

After migrating to the new system, we got replication factor = 3, a copy in each data center. The system consists of 63 Cassandra storage nodes and 6 coordinator machines, for a total of 69 servers. But these machines are much cheaper: their total cost is about 30% of the cost of the SQL system. At the same time, the load is kept at 30%.

With the introduction of C*One, latency also decreased: in SQL, a write operation took about 4.5 ms; in C*One, about 1.6 ms. Transaction duration is on average under 40 ms, a commit completes in 2 ms, and read and write duration is on average 2 ms. The 99th percentile is just 3-3.1 ms, and the number of timeouts dropped by a factor of 100, all thanks to the extensive use of speculation.

By now, most of the SQL Server nodes have been decommissioned; new products are developed using C*One only. We adapted C*One to run in our one-cloud cloud, which made it possible to speed up the deployment of new clusters, simplify configuration, and automate operations. Without the source code, doing this would have been much harder and clunkier.

Now we are working on moving our other storage systems to the cloud, but that is a completely different story.

source: www.habr.com
