Sber.DS: a platform for building and deploying models without code

Every day, businesses of all sizes come up with ideas and hold meetings about which other processes could be automated. But beyond the considerable time that building a model can take, you also have to spend time evaluating it and checking that the result it produces is not a fluke. Once deployed, any model has to be put under monitoring and rechecked periodically.

And these are stages that every company has to go through, regardless of its size. At the scale of Sberbank, with its legacy systems, the amount of fine-tuning grows many times over. By the end of 2019, Sber was already using more than 2,000 models. Developing a model is not enough: it has to be integrated with production systems, data marts have to be built for model training, and its operation on the cluster has to be monitored.

Our team is developing the Sber.DS platform. It lets you solve machine learning problems, speeds up hypothesis testing, simplifies model development and validation in general, and monitors model output in production.

So as not to mislead you, I should say up front that this is an introductory post: to begin with, it covers what is under the hood of the Sber.DS platform in general. We will tell the story of the model life cycle, from creation to deployment, separately.

Sber.DS consists of several components. The key ones are the library, the development system, and the model execution system.

The library manages the model life cycle from the moment the idea to develop it appears through deployment to production, monitoring, and decommissioning. Many of the library's capabilities are dictated by regulatory requirements, for example reporting and the storage of training and validation samples. In effect, it is a registry of all our models.
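To make the idea of a registry that tracks the life cycle more concrete, here is a minimal sketch in Python. It is not the Sber.DS library: the stage names follow the paragraph above, while the `Stage`, `ModelRecord`, and `ModelRegistry` names and the table of allowed transitions are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    IDEA = "idea"
    DEVELOPMENT = "development"
    VALIDATION = "validation"
    PRODUCTION = "production"
    MONITORING = "monitoring"
    DECOMMISSIONED = "decommissioned"


# Hypothetical transition rules; the article does not spell out the real ones.
ALLOWED = {
    Stage.IDEA: {Stage.DEVELOPMENT, Stage.DECOMMISSIONED},
    Stage.DEVELOPMENT: {Stage.VALIDATION, Stage.DECOMMISSIONED},
    Stage.VALIDATION: {Stage.DEVELOPMENT, Stage.PRODUCTION, Stage.DECOMMISSIONED},
    Stage.PRODUCTION: {Stage.MONITORING, Stage.DECOMMISSIONED},
    Stage.MONITORING: {Stage.VALIDATION, Stage.DECOMMISSIONED},
    Stage.DECOMMISSIONED: set(),
}


@dataclass
class ModelRecord:
    name: str
    stage: Stage = Stage.IDEA
    history: list = field(default_factory=list)  # audit trail, useful for reporting

    def advance(self, new_stage: Stage) -> None:
        """Move the model to a new stage, rejecting transitions that are not allowed."""
        if new_stage not in ALLOWED[self.stage]:
            raise ValueError(f"{self.stage.value} -> {new_stage.value} is not allowed")
        self.history.append((self.stage, new_stage))
        self.stage = new_stage


class ModelRegistry:
    """In-memory registry of all models, keyed by name."""

    def __init__(self) -> None:
        self._models: dict[str, ModelRecord] = {}

    def register(self, name: str) -> ModelRecord:
        if name in self._models:
            raise KeyError(f"model {name!r} is already registered")
        record = ModelRecord(name)
        self._models[name] = record
        return record

    def in_stage(self, stage: Stage) -> list[str]:
        """Return the names of all models currently in the given stage."""
        return sorted(n for n, r in self._models.items() if r.stage is stage)
```

Keeping the transition history on each record is one simple way to support the kind of reporting a regulator might ask for; a real registry would also persist the training and validation samples mentioned above.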

The development system is designed for visual development of models and validation methodologies. Developed models undergo initial validation and are handed over to the execution system to perform their business functions. In the execution system, a model can also be put under monitoring, so that validation methodologies are run periodically to check how it is performing.

The system has several types of nodes. Some connect to various data sources; others transform the source data and enrich it (labeling). There are many nodes for building different kinds of models, and nodes for validating them. A developer can load data from any source, then transform it, filter it, visualize intermediate data, and split it into parts.
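The node types described above can be pictured as small composable functions chained into a pipeline. The sketch below is only an illustration of that idea in plain Python; the function names (`source`, `transform`, `keep_if`, `run`, `split`) are hypothetical and say nothing about how Sber.DS nodes are actually implemented.

```python
import random
from typing import Callable

Rows = list[dict]
Node = Callable[[Rows], Rows]


def source(rows: Rows) -> Node:
    """Source node: ignores its input and emits copies of the given rows."""
    return lambda _: [dict(r) for r in rows]


def transform(fn: Callable[[dict], dict]) -> Node:
    """Transformation node: applies fn to a copy of every row."""
    return lambda rows: [fn(dict(r)) for r in rows]


def keep_if(pred: Callable[[dict], bool]) -> Node:
    """Filter node: keeps only the rows for which pred is true."""
    return lambda rows: [r for r in rows if pred(r)]


def run(nodes: list[Node]) -> Rows:
    """Execute the nodes in order, feeding each node the previous node's output."""
    rows: Rows = []
    for node in nodes:
        rows = node(rows)
    return rows


def split(rows: Rows, train_share: float, seed: int = 0) -> tuple[Rows, Rows]:
    """Split node: shuffle deterministically and cut into two parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_share)
    return shuffled[:cut], shuffled[cut:]
```

A pipeline is then just a list such as `run([source(raw), keep_if(...), transform(...)])`, which is roughly what a visual editor would assemble when nodes are connected on a canvas.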

The platform also includes ready-made modules that can be dragged and dropped onto the design canvas. Every action is performed through the visual interface. In practice, you can solve a problem without writing a single line of code.

If the built-in capabilities are not enough, the system lets you quickly create your own modules. For those who build new modules from scratch, we made an integrated development mode based on Jupyter Kernel Gateway.

The Sber.DS architecture is built on microservices. There are many opinions about what counts as a microservice. Some people think it is enough to split monolithic code into parts, even though those parts still talk to the same database. Our microservices must communicate with one another only through a REST API. No workarounds for accessing the database directly are allowed.

We try to keep services from becoming too large and unwieldy: a single instance should not consume more than 4-8 gigabytes of RAM, and it must be possible to scale request handling horizontally by launching new instances. Each service communicates with the others only through a REST API (OpenAPI). The team responsible for a service must keep its API backward compatible until the last client using it is gone.
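One common way to honor a backward-compatibility rule like this is to keep the old version of an endpoint alive next to the new one. The sketch below shows that approach with Python's standard library only. The paths `/v1/models/<name>` and `/v2/models/<name>`, the `MODELS` data, and the helper names are all assumptions for illustration; the actual Sber.DS services are written in Java with Spring and their API is not shown in this article.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical data owned by this service; other services never read it directly.
MODELS = {"churn": {"name": "churn", "stage": "production", "owner": "risk"}}


class RegistryHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        parts = self.path.strip("/").split("/")
        if len(parts) == 3 and parts[1] == "models" and parts[2] in MODELS:
            record = MODELS[parts[2]]
            if parts[0] == "v1":
                # Original contract: only name and stage. Kept until its last client is gone.
                return self._send(200, {"name": record["name"], "stage": record["stage"]})
            if parts[0] == "v2":
                # Newer contract: adds the owner field without breaking v1 clients.
                return self._send(200, record)
        self._send(404, {"error": "not found"})

    def _send(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the example quiet


def start_server() -> HTTPServer:
    """Start the service on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), RegistryHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(server: HTTPServer, path: str):
    """What a client service does: call the REST API, never the database."""
    host, port = server.server_address
    conn = HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, json.loads(response.read())
    finally:
        conn.close()
```

An old client that only knows `/v1/...` keeps working unchanged while new clients move to `/v2/...`; the v1 handler can be removed once no caller depends on it.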

The core of the application is written in Java using the Spring Framework. The solution was designed from the start for rapid deployment on cloud infrastructure, so the application is built on the Red Hat OpenShift (Kubernetes) container platform. The platform is constantly evolving, both in business functionality (new connectors and AutoML are being added) and on the technology side.

One of the distinctive features of our platform is that code developed in the visual interface can run on any Sberbank model execution system. There are currently two: one on Hadoop and one on OpenShift (Docker). We are not stopping there and are building integration modules to run code on any infrastructure, both on-premise and cloud. To integrate effectively into the Sberbank ecosystem, we also plan to support existing execution environments. In the future, the solution could be flexibly integrated "out of the box" into the landscape of any organization.

Anyone who has tried to support a solution that runs Python on Hadoop in production knows that preparing a custom Python environment and delivering it to every data node is not enough. The huge number of C/C++ machine learning libraries that Python modules depend on will not let you rest easy. You have to remember to update packages whenever new libraries or servers are added, while keeping backward compatibility with model code that is already deployed.

There are several ways to approach this. For example, you can prepare a set of frequently used libraries in advance and deploy them to production. Cloudera's Hadoop distribution usually relies on … for this. Hadoop can now also run Docker containers. In some simple cases, the code can be delivered together with its package as python.eggs.
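The last option works because Python can import pure-Python modules directly from a zip archive placed on `sys.path`, and egg files are zip archives of this kind. The sketch below demonstrates the mechanism locally, with no Hadoop involved; the archive name `model_code.zip` and the module `scoring_module` are hypothetical. On a Spark cluster, a similar archive could be shipped with the `--py-files` option of `spark-submit`, which accepts `.zip`, `.egg`, and `.py` files.

```python
import importlib
import sys
import zipfile
from pathlib import Path


def build_archive(target_dir: Path) -> Path:
    """Pack a tiny pure-Python module into a zip archive (hypothetical contents)."""
    archive = target_dir / "model_code.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("scoring_module.py", "def score(x):\n    return 2 * x + 1\n")
    return archive


def load_from_archive(archive: Path, module_name: str):
    """Make the archive importable and import a module from it via zipimport."""
    sys.path.insert(0, str(archive))
    importlib.invalidate_caches()
    return importlib.import_module(module_name)
```

This only covers pure-Python code. Modules that depend on compiled C/C++ libraries cannot be imported straight from a zip archive, which is exactly why the simple cases are the only ones this approach handles.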

The bank takes the security of running third-party code very seriously, so we make the most of recent Linux kernel features. For a process running in an isolated Linux namespace environment you can restrict, for example, access to the network and the local disk, which greatly reduces what malicious code can do. Each department's data areas are protected and accessible only to the owners of that data. The platform guarantees that data from one area can reach another only through a data publication process, with control at every stage from access to the sources to the landing of the data in the target location.
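As a rough illustration of namespace-based isolation, the sketch below wraps a command with the `unshare` utility from util-linux so that it starts in a new network namespace, where no external network interface is available. This is an assumption about one possible approach, not a description of how Sber.DS isolates code; the helper names are hypothetical, and restricting disk access would need additional mount namespace setup that is not shown here.

```python
import shutil
import subprocess


def isolated_command(command: list[str], allow_network: bool = False) -> list[str]:
    """Wrap a command so that it runs in fresh Linux namespaces via unshare(1)."""
    # A user namespace with a root mapping lets an unprivileged user create
    # the other namespaces, provided the kernel permits user namespaces.
    wrapper = ["unshare", "--user", "--map-root-user"]
    if not allow_network:
        # New network namespace: the process sees no external interfaces.
        wrapper.append("--net")
    return wrapper + ["--"] + command


def run_isolated(command: list[str], timeout: int = 30) -> subprocess.CompletedProcess:
    """Run the command in isolation; requires Linux with util-linux installed."""
    if shutil.which("unshare") is None:
        raise RuntimeError("unshare(1) from util-linux is not available")
    return subprocess.run(
        isolated_command(command),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
```

For example, `run_isolated(["python3", "score.py"])` would start a hypothetical `score.py` without network access, so code that tries to send data out of the cluster would simply fail to connect.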

This year we plan to complete an MVP for running models written in Python/R/Java on Hadoop. We have set ourselves the ambitious goal of learning to run any custom environment on Hadoop, so that users of our platform are not restricted in any way.

In addition, as it turned out, many DS specialists are excellent at mathematics and statistics and build great models, but are less comfortable with big data transformations and need help from our data engineers to prepare training samples. We decided to help our colleagues by creating convenient modules for standard transformations and feature preparation for models on the Spark engine. This will let them spend more time developing models instead of waiting for data engineers to prepare a new dataset.

We employ people with expertise in many areas: Linux and DevOps, Hadoop and Spark, Java and Spring, Scala and Akka, OpenShift and Kubernetes. Next time we will talk about the model library, how a model moves through its life cycle inside the company, and how validation and deployment work.

Source: www.habr.com
