Server-side analytics systems

This is the second part of a series of articles about analytics systems (link to part 1).

Today there is no doubt that careful data processing and interpretation of the results can help almost any kind of business. As a consequence, analytics systems are being loaded with more and more parameters, and the number of triggers and user events in applications keeps growing.
Because of this, companies hand their analysts more and more raw information to analyze and turn into sound decisions. The importance of an analytics system to a company should not be underestimated, and the system itself has to be reliable and stable.

Client analytics

Client analytics is a service that a company connects to its website or application through an official SDK, integrating it into its own codebase and choosing the event triggers. This approach has an obvious drawback: the collected data cannot always be processed the way you would like, because of the limitations of whichever service you chose. For example, in one system it will not be easy to run MapReduce jobs, and in another you will not be able to run your own model. Another drawback is the regular (and impressive) bill for the service.
There are many client analytics solutions on the market, but sooner or later analysts run into the fact that no single universal service suits every task (while the prices of all these services keep rising). In that situation, companies often decide to build their own analytics system with all the custom settings and capabilities they need.

Server analytics

Server-side analytics is a service that can be deployed inside the company, on its own servers and (usually) by its own efforts. In this model, all user events are stored on internal servers, which lets developers try out different storage databases and choose the most convenient architecture. And if you still want to use third-party client analytics for some tasks, that remains possible.
Server-side analytics can be deployed in two ways. First: pick some open source tools, deploy them on your own machines, and develop the business logic.

Pros: You can do anything you want.
Cons: It is often very difficult and requires dedicated developers.

Second: take SaaS services (Amazon, Google, Azure) instead of deploying everything yourself. We will talk about SaaS in more detail in part three.

Pros:
- It can be cheaper at medium volumes, although with strong growth it will still become too expensive
- Administration is handed over entirely to the service provider

Cons:
- You cannot control every parameter
- You do not always know what is inside the service (though you may not need to)

How to build server analytics

If we want to move away from client analytics and build our own, we first need to think through the architecture of the new system. Below I will go step by step through what you need to consider, why each step is needed, and which tools you can use.

1. Receiving data

Just as with client analytics, the company's analysts first choose the types of events they want to study later and collect them into a list. Usually these events occur in a particular order, which is called the "event scheme."
Next, imagine that a mobile application (or website) has regular users (devices) and many servers. To transfer events safely from the devices to the servers, an intermediate layer is needed. Depending on the architecture, there may be several different event queues.
Apache Kafka is a pub/sub queue that is used here as the queue for collecting events.

According to a 2014 post on Quora, the creator of Apache Kafka decided to name the software after Franz Kafka because it is "a system optimized for writing" and because he liked Kafka's works. - Wikipedia

In our example there are many data producers and data consumers (devices and servers), and Kafka helps connect them to one another. Consumers will be described in more detail in the following steps, where they become the main actors. For now we will consider only the data producers (events).
Kafka encapsulates the concepts of queue and partition; it is better to read about this in more depth elsewhere (for example, in the documentation). Without going into detail, let's imagine that the mobile application is released for two different operating systems. Each version then produces its own separate stream of events. Producers send events to Kafka, where they are written to the appropriate queue.
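The routing described above can be sketched without a running broker. This is a minimal illustration, not the author's setup: the topic names, the event fields and the `InMemoryQueue` class are all assumptions standing in for a real Kafka producer client.

```python
import json
from collections import defaultdict

# Hypothetical topic names; the article does not name its queues.
TOPIC_BY_PLATFORM = {"ios": "events-ios", "android": "events-android"}


class InMemoryQueue:
    """Stand-in for a Kafka producer: keeps messages per topic in arrival order."""

    def __init__(self):
        self.topics = defaultdict(list)

    def send(self, topic, value):
        self.topics[topic].append(value)


def publish(queue, event):
    """Serialize an event and send it to the queue that matches its platform."""
    topic = TOPIC_BY_PLATFORM[event["platform"]]
    payload = json.dumps(event, sort_keys=True).encode("utf-8")
    queue.send(topic, payload)
    return topic


queue = InMemoryQueue()
publish(queue, {"platform": "ios", "name": "app_open", "user_id": "u1"})
publish(queue, {"platform": "android", "name": "app_open", "user_id": "u2"})
publish(queue, {"platform": "ios", "name": "purchase", "user_id": "u1"})
```

In a real deployment the `send` call would go to a Kafka client instead of a dictionary; the routing decision itself stays the same.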

At the same time, Kafka lets you read in chunks and process the stream of events in mini-batches. Kafka is a very convenient tool that scales well as needs grow (for example, by the geolocation of events).
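Reading a stream in mini-batches can be illustrated with plain Python. This is only a sketch of the batching idea over any iterable of events; the `mini_batches` helper and the `offset` field are hypothetical and do not represent a Kafka client API.

```python
from itertools import islice


def mini_batches(events, batch_size):
    """Yield consecutive lists of at most batch_size events, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(events)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Seven events read three at a time: two full batches and one partial batch.
stream = ({"offset": i} for i in range(7))
batches = list(mini_batches(stream, 3))
sizes = [len(batch) for batch in batches]
```

The last batch may be smaller than the others, so downstream code should not assume a fixed batch length.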
Usually one shard is enough, but things get more complicated when you scale (as they always do). Probably nobody wants to run on a single physical shard in production, since the architecture has to be fault-tolerant. Besides Kafka there is another well-known solution, RabbitMQ. We have not used it in production as a queue for event analytics (if you have such experience, tell us about it in the comments!). We have, however, used AWS Kinesis.

Before moving on to the next step, we should mention one more layer of the system: raw log storage. This layer is not mandatory, but it will be useful if something goes wrong and the event queues in Kafka get reset. Storing raw logs does not require a complex or expensive solution; you can simply write them somewhere in the correct order (even to a hard drive).
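As a rough illustration of how simple raw log storage can be, the sketch below appends events to a JSON Lines file and replays them in the order they were written. The file name, the helper functions and the event fields are assumptions made for the example.

```python
import json
import os
import tempfile


def append_raw(path, event):
    """Append one event as a JSON line; file order is arrival order."""
    with open(path, "a", encoding="utf-8") as log_file:
        log_file.write(json.dumps(event) + "\n")


def replay(path):
    """Read the events back in the order they were written."""
    with open(path, encoding="utf-8") as log_file:
        return [json.loads(line) for line in log_file if line.strip()]


# A temporary directory keeps the example self-contained.
log_path = os.path.join(tempfile.mkdtemp(), "raw_events.jsonl")
for seq in range(3):
    append_raw(log_path, {"seq": seq, "name": "app_open"})
replayed = replay(log_path)
```

If the queues are lost, the replayed events can be fed back into the processing step.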

2. Processing event streams

Once all the events are prepared and placed in the appropriate queues, we move on to the processing step. Here I will describe the two most common processing options.
The first option is to set up Spark Streaming within the Apache ecosystem. Many Apache big-data tools are commonly deployed on top of HDFS, a fault-tolerant distributed file system that keeps replicas of files. Spark Streaming is an easy-to-use tool that handles streaming data and scales well. However, it can be hard to maintain.
The other option is to build your own event handler. To do this you need, for example, to write a Python application, package it in Docker, and subscribe it to a Kafka queue. When triggers arrive at the handlers running in Docker, processing starts. With this approach you have to keep the applications running at all times.
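A self-written handler of this kind is essentially a loop over incoming messages. The sketch below is an assumption-laden illustration: `messages` is any iterable of raw payloads (in production it would be backed by a Kafka consumer), and `handle`, `run_consumer` and the dead-letter list are hypothetical names.

```python
import json


def handle(event, sink):
    """Hypothetical processing step: here it only stores the decoded event."""
    sink.append(event)


def run_consumer(messages, sink, dead_letters):
    """Process messages one by one; undecodable payloads are set aside, not fatal."""
    for raw in messages:
        try:
            event = json.loads(raw)
        except (ValueError, TypeError):
            dead_letters.append(raw)
            continue
        handle(event, sink)


processed, failed = [], []
run_consumer(
    [b'{"name": "app_open"}', b"not json", b'{"name": "purchase"}'],
    processed,
    failed,
)
```

Setting broken payloads aside instead of raising keeps the long-running application alive, which matters because this option requires the handler to be up at all times.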
Let's assume we have chosen one of the options described above and move on to the processing itself. Processors should start by checking the validity of the data, filtering out garbage and "broken" events. For validation we usually use Cerberus. After that, you can do data mapping: data from different sources is normalized and standardized so that it can be added to a common table.
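The validation and mapping steps can be sketched with the standard library alone. This is a hand-rolled stand-in, not Cerberus itself: the required fields, the alias table and the function names are assumptions. For simplicity the sketch renames fields first, so that a single common schema can be checked afterwards.

```python
# Hypothetical common schema: field name -> expected Python type.
REQUIRED_FIELDS = {"user_id": str, "name": str, "ts": int}

# Hypothetical source-specific field names mapped onto the common ones.
FIELD_ALIASES = {"uid": "user_id", "event": "name", "timestamp": "ts"}


def normalize(raw):
    """Rename source-specific fields to the common names."""
    return {FIELD_ALIASES.get(key, key): value for key, value in raw.items()}


def validation_errors(event):
    """Return a list of problems; an empty list means the event is valid."""
    errors = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in event:
            errors.append(f"{field}: missing")
        elif not isinstance(event[field], expected_type):
            errors.append(f"{field}: expected {expected_type.__name__}")
    return errors


def clean(raw_events):
    """Normalize every event and keep only the valid ones."""
    normalized = (normalize(raw) for raw in raw_events)
    return [event for event in normalized if not validation_errors(event)]


raw_events = [
    {"uid": "u1", "event": "app_open", "timestamp": 1700000000},
    {"user_id": "u2", "name": "purchase", "ts": "yesterday"},  # broken timestamp
    {"user_id": "u3", "name": "app_open", "ts": 1700000100},
]
cleaned = clean(raw_events)
```

A schema library such as Cerberus would replace `validation_errors` with a declarative schema, while the normalization step stays the same in spirit.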

3. Database

The third step is storing the normalized events. When working with a finished analytics system we will have to access them often, so it is important to choose a convenient database.
If the data fits well into a fixed schema, you can choose ClickHouse or another columnar database. Aggregations will then run very fast. The downside is that the schema is rigidly fixed, so you cannot add arbitrary objects without rework (for example, when a non-standard event occurs). But you really can compute very quickly.
For unstructured data you can take a NoSQL store, for example Apache Cassandra. It replicates data well, lets you bring up many instances, and is fault-tolerant.
You can also bring up something simpler, for example MongoDB. It is rather slow and suited to small volumes. The plus is that it is very simple and therefore a good fit for getting started.

4. Aggregations

Having carefully stored all the events, we want to collect all the important information from each arriving batch and update the database. Globally, we want to end up with relevant dashboards and metrics. For example, we can assemble a user profile from events and measure behavior in some way. Events are aggregated, collected, and stored again (this time in user tables). At the same time, you can build the system so that a filter can be attached to the aggregator-coordinator: collect users only from a certain type of event.
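A minimal sketch of such an aggregator, assuming events carry `user_id`, `name` and `ts` fields (these names, the profile shape and the `build_profiles` helper are hypothetical), could fold a batch into per-user profiles with an optional event-type filter:

```python
from collections import defaultdict


def build_profiles(events, event_filter=None):
    """Fold a batch into per-user counters, optionally for one event type only."""
    profiles = defaultdict(lambda: {"events": 0, "last_ts": 0})
    for event in events:
        if event_filter is not None and event["name"] != event_filter:
            continue
        profile = profiles[event["user_id"]]
        profile["events"] += 1
        profile["last_ts"] = max(profile["last_ts"], event["ts"])
    return dict(profiles)


batch = [
    {"user_id": "u1", "name": "app_open", "ts": 10},
    {"user_id": "u1", "name": "purchase", "ts": 20},
    {"user_id": "u2", "name": "app_open", "ts": 15},
]
all_profiles = build_profiles(batch)
buyers = build_profiles(batch, event_filter="purchase")
```

In a real system the resulting profiles would be merged into the user tables rather than held in memory.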
After that, if someone on the team needs only high-level analytics, external analytics systems can be connected. You can use Mixpanel again, but since it is quite expensive, not all user events are sent there, only what is needed. To do this we need a coordinator that forwards some raw events, or something we have already aggregated ourselves, to external systems, APIs, or advertising platforms.
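The coordinator's filtering role can be sketched as an allow-list in front of a send function. Everything here is illustrative: the event names in `FORWARDED_EVENTS` are invented, and `send` is any callable, standing in for a client of the external system.

```python
# Hypothetical allow-list of event names worth paying to forward.
FORWARDED_EVENTS = {"purchase", "signup"}


def forward(events, send):
    """Pass only allow-listed events to send; return how many were forwarded."""
    forwarded = 0
    for event in events:
        if event["name"] in FORWARDED_EVENTS:
            send(event)
            forwarded += 1
    return forwarded


# A list's append method plays the role of the external client here.
outbox = []
count = forward(
    [{"name": "app_open"}, {"name": "purchase"}, {"name": "scroll"}, {"name": "signup"}],
    outbox.append,
)
```

Keeping the allow-list in one place makes it easy to review what leaves the system and what it costs.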

5. Frontend

You need to connect a frontend to the system you have built. A good example is Redash, a database GUI that helps build dashboards. Here is how the interaction works:

  1. The user writes an SQL query.
  2. In response they get a table.
  3. They create a "new visualization" for it and get a nice chart that they can save.
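The first two steps above (an SQL query that returns a table) can be tried locally with an in-memory SQLite database. This is only an illustration of the kind of query a dashboard tool would run; the `events` table, its columns and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id TEXT, name TEXT, day TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("u1", "app_open", "2024-01-01"),
        ("u2", "app_open", "2024-01-01"),
        ("u1", "purchase", "2024-01-02"),
    ],
)

# Daily active users: the kind of result table a chart could be built on.
rows = conn.execute(
    "SELECT day, COUNT(DISTINCT user_id) AS users "
    "FROM events GROUP BY day ORDER BY day"
).fetchall()
conn.close()
```

Each row of the result corresponds to one point on a time-series chart.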

Visualizations in the service update automatically, and you can customize them and keep track of your own monitoring. Redash is free if self-hosted, but as SaaS it will cost $50 per month.

Conclusion

After completing all the steps above, you will have built your own server analytics. Note that this is not as simple as plugging in client analytics, because everything has to be configured yourself. So before building your own system, it is worth weighing the need for a serious analytics system against the resources you are willing to allocate to it.
If you have done the math and found that the costs are too high, in the next part I will talk about how to build a cheaper version of server-side analytics.

Thanks for reading! I will be glad to answer questions in the comments.

Source: www.habr.com