On the way to serverless databases: how and why

Hello everyone! My name is Nikolay Golov. Previously I worked at Avito, where I ran the Data Platform for six years, which means I worked with every kind of database: analytical (Vertica, ClickHouse), streaming, and OLTP (Redis, Tarantool, VoltDB, MongoDB, PostgreSQL). Over that time I dealt with a large number of databases, many of them very different and unusual, and with non-standard ways of using them.

I now work at ManyChat. In essence it is a startup: new, ambitious, and fast-growing. When I first joined the company, a classic question came up: "What should a young startup take from the DBMS and data market today?"

In this article, based on my talk at the RIT++ 2020 online festival, I will answer that question. A video version of the talk is available on YouTube.

The best-known databases of 2020

It is 2020. I looked around and saw three types of databases.

The first type is classic OLTP databases: PostgreSQL, SQL Server, Oracle, MySQL. They were written long ago but remain relevant because the developer community knows them so well.

The second type is databases from the "noughties" (the 2000s). They tried to move away from classic patterns by abandoning SQL, traditional structures, and ACID, and by adding built-in sharding and other attractive features. Examples are Cassandra, MongoDB, Redis, and Tarantool. All of these solutions wanted to offer the market something fundamentally new, and they took their niche because they turned out to be extremely convenient for certain tasks. I will refer to these databases with the umbrella term NoSQL.

The "noughties" are over, we have grown used to NoSQL databases, and the world, as I see it, has taken the next step: to managed databases. These databases have the same core as classic OLTP databases or the newer NoSQL ones. But they need no DBA or DevOps, and they run on managed hardware in the cloud. For a developer, this is "just a database" that runs somewhere; nobody cares how it was installed on the server, who configured the server, or who updates it.

Examples of such databases:

  • AWS RDS is a managed wrapper around PostgreSQL/MySQL.
  • DynamoDB is the AWS analogue of a document-oriented database, similar to Redis and MongoDB.
  • Amazon Redshift is a managed analytical database.

These are essentially old databases, but raised in a managed environment, with no need to deal with hardware.

Note. The examples are taken from the AWS environment, but analogues exist in Microsoft Azure, Google Cloud, and Yandex.Cloud.

What is new about this? In 2020, nothing.

The serverless concept

What is genuinely new on the market in 2020 is serverless solutions.

I will try to explain what this means using a regular service or backend application as an example.
To deploy a regular backend application, we buy or rent a server, copy the code onto it, publish an endpoint to the outside world, and regularly pay for rent, electricity, and data-center services. This is the standard scheme.

Is there another way? With serverless services, there is.

The focus of this approach is that there is no server, and not even a rented virtual instance in the cloud. To deploy a service, we copy the code (functions) to a repository and publish it to an endpoint. Then we simply pay for each call of that function, completely ignoring the hardware on which it runs.
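
As a minimal sketch of what "copy the functions and pay per call" can look like, the snippet below uses the handler shape that AWS Lambda expects for Python functions (a function taking `event` and `context`). The `name` field and the greeting logic are hypothetical, chosen only for illustration, and the local call at the bottom merely imitates what the platform would do on each request.

```python
import json


def handler(event, context):
    """Stateless function: everything it needs arrives in `event`.

    The `name` field is a hypothetical input used only for this sketch.
    """
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


if __name__ == "__main__":
    # Locally we call the function ourselves; on a serverless platform
    # the provider invokes it once per incoming request.
    print(handler({"name": "ManyChat"}, context=None))
```

Note that the function keeps nothing between calls, which is what lets the platform run each call in a fresh container.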

I will try to illustrate this approach with pictures.

Classic deployment. We have a service with a certain load. We bring up two instances: physical servers or AWS instances. External requests are sent to these instances and processed there.

As you can see in the picture, the servers are not utilized equally. One is 100% utilized, handling two requests, while the other is only 50% utilized and is partly idle. If not three but 30 requests arrive, the whole system will fail to cope with the load and will start to slow down.

Serverless deployment. In a serverless environment, such a service has no instances or servers. There is a pool of warmed-up resources: small prepared Docker containers with the function code deployed. The system receives external requests, and for each one the serverless framework brings up a small container with the code, which processes that particular request and is then killed.

One request means one container raised; 1,000 requests mean 1,000 containers. Deployment onto hardware servers is now the cloud provider's job, completely hidden by the serverless framework. In this concept we pay for every call. For example, if one call came in a day, we paid for one call; if a million came in a minute, we paid for a million. Or in a second; that happens too.
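
The difference between the two billing models can be sketched with a little arithmetic. The prices below are made up purely for illustration (they are not real provider rates), and the helper names are hypothetical; the point is only that the always-on bill is flat while the per-call bill follows the traffic.

```python
def always_on_cost(servers: int, hourly_rate: float, hours: float) -> float:
    """Classic deployment: servers are billed for every hour, busy or idle."""
    return servers * hourly_rate * hours


def per_call_cost(calls: int, price_per_call: float) -> float:
    """Serverless deployment: the bill is proportional to the number of calls."""
    return calls * price_per_call


# Hypothetical prices, for illustration only.
HOURLY_RATE = 0.10          # dollars per server-hour
PRICE_PER_CALL = 0.000002   # dollars per function call
HOURS_IN_MONTH = 24 * 30

fixed = always_on_cost(servers=2, hourly_rate=HOURLY_RATE, hours=HOURS_IN_MONTH)
for calls in (1, 1_000, 1_000_000, 1_000_000_000):
    serverless = per_call_cost(calls, PRICE_PER_CALL)
    print(f"{calls:>13,} calls/month: always-on ${fixed:.2f}, per-call ${serverless:.6f}")
```

Under these assumed prices the per-call model is far cheaper at low traffic and becomes more expensive only at very high, steady volume.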

The concept of publishing a serverless function suits a stateless service. If you need a stateful service, you add a database to the service. In that case, when it comes to working with state, each stateful function simply writes to and reads from the database. Moreover, that database can be any of the three types described at the beginning of the article.

What is the common limitation of all these databases? It is the cost of a constantly used cloud or hardware server (or several servers). It does not matter whether we use a classic or a managed database, or whether we have DevOps and an administrator or not: we still pay for hardware, electricity, and data-center rent 24/7. If we have a classic database, we pay for a master and a slave. If it is a heavily loaded database, we pay for 10, 20, or 30 servers, and we pay constantly.

The presence of permanently reserved servers in the cost structure used to be seen as a necessary evil. Conventional databases also have other difficulties, such as limits on the number of connections, scaling limits, and geo-distributed consensus. Each can be solved somehow in a particular database, but not all at once and not ideally.

Serverless database: theory

The question of 2020: is it possible to make the database serverless too? Everyone has heard of the serverless backend, so why not try to make the database serverless as well?

This sounds strange, because a database is a stateful service and is poorly suited to serverless infrastructure. On top of that, the state of a database is very large: gigabytes, terabytes, and in analytical databases even petabytes. It is not so easy to raise it in lightweight Docker containers.

On the other hand, almost all modern databases contain a huge amount of logic and components: transactions, integrity coordination, procedures, relational dependencies, and much more. For a large part of that logic, a small state is enough. Gigabytes and terabytes are used directly by only a small part of the database logic, the part involved in executing queries.

Accordingly, the idea is this: if part of the logic allows stateless execution, why not split the database into stateful and stateless parts?

Serverless for OLAP solutions

Let us see what cutting a database into stateful and stateless parts might look like, using practical examples.

For example, we have an analytical database: external data (the red cylinder on the left), an ETL process that loads the data into the database, and an analyst who sends SQL queries to the database. This is the classic scheme of a data warehouse.

In this scheme, ETL runs, conditionally speaking, once. After that you must constantly pay for the servers on which the database runs with the ETL-loaded data, so that there is something to send queries to.

Let us look at an alternative approach, implemented in AWS Athena Serverless. There is no permanently dedicated hardware that stores the loaded data. Instead:

  • The user submits an SQL query to Athena. The Athena optimizer analyzes the SQL query and searches the metadata store (Metadata) for the specific data needed to execute the query.
  • The optimizer, based on the collected information, loads the required data from external sources into temporary storage (a temporary database).
  • The user's SQL query is executed in the temporary storage, and the result is returned to the user.
  • The temporary storage is cleared and the resources are released.

In this architecture we pay only for the process of executing the query. No queries, no costs.
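
The four steps above can be imitated in a few lines. This is only a toy model of the flow, not Athena's API or internals: a dictionary stands in for the external files, another for the metadata store, and an in-memory SQLite database plays the role of the temporary storage. All names and data are hypothetical, and for simplicity the needed days are passed in explicitly instead of being derived from the SQL by an optimizer.

```python
import sqlite3

# Hypothetical "external storage": files keyed by path, each a list of rows.
EXTERNAL_FILES = {
    "sales/2020-05-01.csv": [("2020-05-01", 120.0), ("2020-05-01", 80.0)],
    "sales/2020-05-02.csv": [("2020-05-02", 50.0)],
    "sales/2020-05-03.csv": [("2020-05-03", 300.0)],
}

# Hypothetical metadata store: which file holds which day of the `sales` table.
METADATA = {"sales": {path[6:16]: path for path in EXTERNAL_FILES}}


def run_query(sql: str, table: str, days: list) -> list:
    """Load only the needed files into temporary storage, query, clean up."""
    # Step 1: consult the metadata to find the files the query actually needs.
    needed = [METADATA[table][day] for day in days if day in METADATA[table]]

    # Step 2: load those files into a temporary (in-memory) database.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(f"CREATE TABLE {table} (day TEXT, amount REAL)")
        for path in needed:
            conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", EXTERNAL_FILES[path])

        # Step 3: execute the user's SQL against the temporary storage.
        return conn.execute(sql).fetchall()
    finally:
        # Step 4: drop the temporary storage and release the resources.
        conn.close()


result = run_query(
    "SELECT day, SUM(amount) FROM sales GROUP BY day ORDER BY day",
    table="sales",
    days=["2020-05-01", "2020-05-02"],
)
print(result)  # [('2020-05-01', 200.0), ('2020-05-02', 50.0)]
```

Nothing survives between two calls of `run_query`, which mirrors the "no queries, no costs" property: resources exist only while a query is running.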

This approach works, and it is implemented not only in Athena Serverless but also in Redshift Spectrum (in AWS).

The Athena example shows that a serverless database works on real queries against tens and hundreds of terabytes of data. Hundreds of terabytes would require hundreds of servers, but we do not have to pay for them: we pay for queries. Each query is slow (very slow) compared with a specialized analytical database such as Vertica, but we do not pay for idle time.

Such a database is suitable for infrequent ad hoc analytical queries, for example when we spontaneously decide to test a hypothesis on a huge volume of data. Athena is ideal for these cases. For regular queries, such a system is expensive. In that case, cache the data in a specialized solution.

Serverless for OLTP solutions

The previous example looked at OLAP (analytical) workloads. Now let us look at OLTP workloads.

Imagine a scalable PostgreSQL or MySQL. Let us bring up a regular managed PostgreSQL or MySQL instance with minimal resources. When the instance receives more load, we attach additional replicas and distribute part of the read load to them. If there are no requests and no load, we turn the replicas off. The first instance is the master, and the rest are replicas.

This idea is implemented in a database called Aurora Serverless on AWS. The principle is simple: requests from external applications are accepted by a proxy fleet. Seeing the load grow, it allocates compute resources from pre-warmed minimal instances, so the connection is made as fast as possible. Disconnecting instances works the same way.

Within Aurora there is the concept of the Aurora Capacity Unit, or ACU. This is, conditionally speaking, an instance (a server). Each particular ACU can be a master or a slave. Each Capacity Unit has its own RAM, processor, and a minimal disk. Accordingly, one is the master and the others are read-only replicas.

The number of these running Aurora Capacity Units is a configurable parameter. The minimum can be one or zero (in that case, the database does not run when there are no requests).
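
To make the role of the minimum and maximum concrete, here is a small sketch of a scaling rule. It is not how Aurora actually decides; the real service uses its own signals and thresholds. The function name and the `connections_per_acu` figure are assumptions made only for this example.

```python
import math


def target_acus(active_connections: int, connections_per_acu: int,
                min_acus: int, max_acus: int) -> int:
    """Return how many capacity units should run for the given load.

    Hypothetical rule: one unit per `connections_per_acu` connections,
    clamped to the configured [min_acus, max_acus] range.
    """
    needed = math.ceil(active_connections / connections_per_acu)
    return max(min_acus, min(max_acus, needed))


# With a minimum of zero, an idle database runs nothing at all.
for load in (0, 40, 250, 5_000):
    print(load, "connections ->", target_acus(load, 100, min_acus=0, max_acus=8), "ACU")
```

With `min_acus=1` the same rule would keep one unit running even when idle, trading a small constant cost for the absence of a cold start.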

When the database receives requests, the proxy fleet raises Aurora Capacity Units, increasing the performance resources of the system. The ability to grow and shrink resources lets the system "juggle" them: it can automatically take individual ACUs out of service (replacing them with new ones) and roll out all current updates to the withdrawn resources.

An Aurora Serverless database can scale the read load, although the documentation does not say this directly. It may give the impression that a multi-master setup is being raised. There is no magic.

This database is well suited to avoiding huge spending on systems with unpredictable access. For example, when building an MVP or marketing landing pages, we usually do not expect a stable load. Accordingly, if there is no access, we do not pay for instances. When an unexpected load arrives, for example after a conference or an advertising campaign, when crowds of people visit the site and the load grows sharply, Aurora Serverless automatically takes on this load and quickly connects the missing resources (ACUs). Then the conference passes, everyone forgets about the prototype, the servers (ACUs) go dark, and costs fall to zero. Convenient.

This solution is not suitable for stable high load, because it does not scale the write load. All this connecting and disconnecting of resources happens at the so-called "scale point": a moment when the database is not holding a transaction or temporary tables. For example, a scale point may not occur for a whole week, and the database then runs on the same resources and simply cannot expand or contract.

There is no magic: this is ordinary PostgreSQL. But the process of adding and disconnecting machines is partially automated.

Serverless by design

Aurora Serverless is an old database rewritten for the cloud to take advantage of some of the benefits of serverless. Now I will tell you about a database that was written for the cloud from the start, for the serverless approach: serverless by design. It was developed without any assumption that it would run on physical servers.

This database is called Snowflake. It has three key blocks.

The first is the metadata block. This is a fast in-memory service that handles security, metadata, transactions, and query optimization (shown on the left in the illustration).

The second block is a set of virtual compute clusters for calculations (the set of blue circles in the illustration).

The third block is the data storage system based on S3. S3 is effectively unlimited object storage in AWS, something like an unlimited Dropbox for business.

Let us see how Snowflake works, assuming a cold start. That is, there is a database, data is loaded into it, and no queries are running. In that state, we have the fast in-memory Metadata service raised (the first block), and we have the S3 storage, where the table data lies, split into so-called micropartitions. To simplify: if a table contains transactions, then micropartitions are days of transactions. Each day is a separate micropartition, a separate file. When the database works in this mode, you pay only for the space the data occupies. Moreover, the rate for storage space is very low (especially given the significant compression). The metadata service also runs constantly, but query optimization does not need many resources, so the service can be considered shareware.

Now imagine that a user came to our database and sent an SQL query. The SQL query is immediately sent to the Metadata service for processing. On receiving the query, this service analyzes the query, the available data, and the user's permissions and, if all is well, draws up a plan for processing the query.

Next, the service initiates the launch of a compute cluster. A compute cluster is a cluster of servers that perform calculations. It can contain 1 server, 2 servers, 4, 8, 16, 32: as many as you want. You throw in a query, and the launch of this cluster begins immediately. It really takes seconds.

Next, after the cluster has started, the micropartitions needed to process your query begin to be copied to the cluster from S3. Imagine that executing the SQL query requires two partitions from one table and one from a second. In that case, only the three required partitions are copied to the cluster, not all the tables in full. That is why, and because everything is located within one data center and connected by very fast channels, the whole transfer happens very quickly: in seconds, rarely in minutes, unless we are talking about some monstrous queries. The micropartitions are copied to the compute cluster and, once that completes, the SQL query is executed on this compute cluster. The result of the query can be one row, several rows, or a table; it is sent out to the user so that they can download it, display it in their BI tool, or use it in some other way.
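
The "copy only what the query needs" step can be sketched as a simple lookup, under the simplification used above that each micropartition holds one day of a table. The table names, file names, and the `partitions_to_copy` helper are hypothetical; Snowflake's real metadata and pruning logic are more elaborate.

```python
from datetime import date, timedelta

# Hypothetical storage layout: table -> {day: micropartition file name}.
S3_PARTITIONS = {
    "transactions": {
        date(2020, 5, 1) + timedelta(days=i): f"transactions/part-{i:04d}"
        for i in range(30)
    },
    "refunds": {
        date(2020, 5, 1) + timedelta(days=i): f"refunds/part-{i:04d}"
        for i in range(30)
    },
}


def partitions_to_copy(table: str, first_day: date, last_day: date) -> list:
    """Return only the micropartitions whose day falls inside the range."""
    return [
        name
        for day, name in sorted(S3_PARTITIONS[table].items())
        if first_day <= day <= last_day
    ]


# A query touching two days of one table and one day of another
# needs three files, not all sixty.
needed = (
    partitions_to_copy("transactions", date(2020, 5, 10), date(2020, 5, 11))
    + partitions_to_copy("refunds", date(2020, 5, 10), date(2020, 5, 10))
)
print(needed)
# ['transactions/part-0009', 'transactions/part-0010', 'refunds/part-0009']
```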

Each SQL query can not only read aggregates from previously loaded data, but also load or generate new data in the database. For example, it can be a query that inserts new records into another table, which leads to a new partition appearing on the compute cluster, which in turn is automatically saved in the single S3 storage.

The scenario described above, from the user's arrival to raising the cluster, loading data, executing queries, and getting results, is billed at a per-minute rate for using the raised virtual compute cluster, the virtual warehouse. The rate varies depending on the AWS zone and the cluster size, but on average it is a few dollars per hour. A cluster of four machines costs twice as much as a cluster of two machines, and a cluster of eight machines is twice as expensive again. Options of 16 and 32 machines are available, depending on the complexity of the queries. But you pay only for the minutes when the cluster is actually running: when there are no queries, you take your hands off, and after 5-10 minutes of waiting (a configurable parameter) the cluster shuts down by itself, frees its resources, and costs nothing.

A completely realistic scenario: you send a query, the cluster comes up in, relatively speaking, a minute, computes for another minute, then takes five minutes to shut down, and you end up paying for seven minutes of this cluster's operation, not for months and years.
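
The seven-minute scenario is easy to put into numbers. The sketch below assumes a made-up rate of one dollar per machine-hour (the text only says "a few dollars per hour" for a cluster) and a price that grows linearly with the number of machines, which reproduces the "twice the machines, twice the cost" rule; `cluster_cost` is a hypothetical helper, not a Snowflake API.

```python
def cluster_cost(machines: int, minutes_running: float,
                 rate_per_machine_hour: float = 1.0) -> float:
    """Cost of one cluster run: scales with machine count and with minutes.

    The default rate is an assumed figure used only for illustration.
    """
    return machines * rate_per_machine_hour * minutes_running / 60


# The scenario from the text: 1 min start + 1 min compute + 5 min idle timeout.
minutes = 1 + 1 + 5
for machines in (2, 4, 8, 16, 32):
    print(f"{machines:>2} machines, {minutes} min: ${cluster_cost(machines, minutes):.2f}")

# The same 2-machine cluster kept running for a 30-day month, for comparison.
print(f"always on: ${cluster_cost(2, 60 * 24 * 30):.2f}")
```

Under these assumptions a single query on a 2-machine cluster costs cents, while keeping the same cluster up all month costs well over a thousand dollars.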

The first scenario described using Snowflake in a single-user setting. Now imagine that there are many users, which is closer to a real scenario.

Suppose we have many analysts and Tableau reports that constantly bombard our database with a large number of analytical SQL queries.

In addition, suppose we have inventive data scientists who try to do monstrous things with data, operate on tens of terabytes, and analyze billions and trillions of rows.

For the two types of workload described above, Snowflake lets you raise several independent compute clusters of different capacities. These compute clusters work independently, but on common, consistent data.

For a large number of light queries, you can raise 2-3 small clusters of roughly 2 machines each. This behavior can be implemented, among other things, with automatic settings. You say: "Snowflake, raise a small cluster. If the load on it grows above a certain parameter, raise a similar second one, then a third. When the load starts to fall, shut down the excess." That way, no matter how many analysts come and start looking at reports, everyone has enough resources.

At the same time, if the analysts are asleep and nobody is looking at reports, the clusters can go completely dark, and you stop paying for them.

Meanwhile, for heavy queries (from the data scientists), you can raise one very large cluster of 32 machines. This cluster is also paid for only during those minutes and hours when your giant query is running there.

The capability described above lets you split not just 2 but many more types of workload across clusters (ETL, monitoring, report generation, ...).

Let us sum up Snowflake. The database combines a beautiful idea with a working implementation. At ManyChat, we use Snowflake to analyze all the data we have. We have not three clusters, as in the example, but from 5 to 9, of different sizes. We have conventional 16-machine and 2-machine clusters, and also super-small 1-machine ones for some tasks. They successfully distribute the load and let us save a lot.

The database successfully scales both the read and the write load. This is a huge difference and a huge breakthrough compared with the same Aurora, which handles only the read load. Snowflake lets you scale your write workload with these compute clusters. As I mentioned, we use several clusters at ManyChat; the small and super-small clusters are mainly used for ETL, for loading data. The analysts live on medium clusters, which are not affected by the ETL load at all, so they work very fast.

Accordingly, the database is well suited to OLAP workloads. Unfortunately, it is not yet applicable to OLTP workloads. First, this database is columnar, with all the ensuing consequences. Second, the approach itself, where for each query you raise a compute cluster if necessary and fill it with data, is not yet fast enough for OLTP loads. Waiting seconds is normal for OLAP tasks, but for OLTP tasks it is unacceptable; 100 ms would be better, and 10 ms better still.

Conclusion

A serverless database is made possible by splitting the database into stateless and stateful parts. You may have noticed that in all the examples above, the stateful part is, relatively speaking, the storage of micropartitions in S3, while the stateless part is the optimizer, metadata handling, and security handling, which can be raised as independent lightweight stateless services.

Executing SQL queries can also be seen as light-state services that come up in serverless mode, like Snowflake compute clusters: they download only the necessary data, execute the query, and "go out."

Production-grade serverless databases are already available for use, and they work. These serverless databases are ready to handle OLAP workloads. Unfortunately, for OLTP workloads they are applicable only with nuances, since there are limitations. On the one hand, this is a minus. On the other hand, it is an opportunity. Perhaps one of the readers will find a way to make an OLTP database fully serverless, without Aurora's limitations.

I hope you found it interesting. Serverless is the future :)

Source: www.habr.com
