Sber.DS is a platform that lets you create and deploy models even without code

Ideas and meetings about which other processes could be automated come up every day in businesses of all sizes. But beyond the considerable time that building a model can take, you also need to spend time evaluating it and checking that the result you obtained is not a fluke. After deployment, every model has to be put under monitoring and rechecked periodically.

And these are stages that every company has to go through, regardless of its size. At the scale of Sberbank, with its legacy, the amount of fine-tuning grows many times over. By the end of 2019, Sber was already using more than 2,000 models. Developing a model is not enough: you also have to integrate it with production systems, build data marts for model building, and make sure its operation on the cluster is monitored.

Our team is building the Sber.DS platform. It lets you solve machine learning problems, speeds up hypothesis testing, generally simplifies developing and validating models, and monitors a model's output in production (PROM).

So as not to mislead you, I want to say up front that this post is an introduction: for a start, it covers what is under the hood of the Sber.DS platform. We will tell the story of the model life cycle, from creation to deployment, separately.

Sber.DS consists of several components. The key ones are the library, the development system, and the model execution system.

The library controls the model life cycle from the moment the idea to develop it appears through deployment to production, monitoring, and decommissioning. Many of the library's capabilities are dictated by regulatory rules, for example reporting and the storage of training and validation samples. In effect, it is the registry of all our models.
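To make the idea of a life-cycle registry more concrete, here is a minimal sketch in Python. It is not the Sber.DS library: the stage names, the transition table, and the `ModelRecord` class are all assumptions invented for illustration. The sketch only shows how a registry entry could refuse transitions that skip a stage.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    # Hypothetical stage names, loosely following the stages named in the text.
    IDEA = "idea"
    DEVELOPMENT = "development"
    VALIDATION = "validation"
    PRODUCTION = "production"
    MONITORING = "monitoring"
    DECOMMISSIONED = "decommissioned"


# Assumed transition table: which stages may follow a given stage.
ALLOWED = {
    Stage.IDEA: {Stage.DEVELOPMENT, Stage.DECOMMISSIONED},
    Stage.DEVELOPMENT: {Stage.VALIDATION, Stage.DECOMMISSIONED},
    Stage.VALIDATION: {Stage.DEVELOPMENT, Stage.PRODUCTION, Stage.DECOMMISSIONED},
    Stage.PRODUCTION: {Stage.MONITORING, Stage.DECOMMISSIONED},
    Stage.MONITORING: {Stage.VALIDATION, Stage.DECOMMISSIONED},
    Stage.DECOMMISSIONED: set(),
}


@dataclass
class ModelRecord:
    """One registry entry: the model's current stage plus an audit trail."""
    name: str
    stage: Stage = Stage.IDEA
    history: list = field(default_factory=list)

    def move_to(self, new_stage: Stage) -> None:
        # Reject any transition that the table does not list.
        if new_stage not in ALLOWED[self.stage]:
            raise ValueError(
                f"{self.stage.value} -> {new_stage.value} is not allowed"
            )
        self.history.append((self.stage, new_stage))
        self.stage = new_stage
```

Keeping the history alongside the current stage is one simple way to support the kind of reporting a regulator might ask for; a real registry would persist it rather than hold it in memory.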

The development system is designed for visual development of models and validation methods. Developed models pass initial validation and are delivered to the execution system, where they perform their business functions. In the execution system, a model can also be put under monitoring, so that validation methods run periodically to check how it is performing.
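A periodic validation check of this kind can be pictured as a function that recomputes a quality metric on fresh data and compares it with a threshold. The sketch below is an illustration only: the post does not say which metrics or thresholds Sber.DS uses, so `accuracy`, `check_model`, and the threshold value are assumptions.

```python
from typing import Callable, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of predictions that match the observed labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


def check_model(
    metric: Callable[[Sequence[int], Sequence[int]], float],
    y_true: Sequence[int],
    y_pred: Sequence[int],
    threshold: float,
) -> dict:
    """Run one monitoring check and report whether the model still passes."""
    value = metric(y_true, y_pred)
    return {"value": value, "ok": value >= threshold}
```

A scheduler would call `check_model` at fixed intervals with the latest labeled data; a failing check could then send the model back for revalidation.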

The system has several types of nodes. Some connect to various data sources; others transform the source data and enrich it (labeling). There are many nodes for building different models, and nodes for validating them. A developer can load data from any source, transform it, filter it, visualize intermediate data, and split it into parts.
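What such a chain of nodes does can be approximated in a few lines of Python. This is a rough sketch, not the platform's node API: the function names (`filter_node`, `enrich_node`, `split_node`) and the row format are hypothetical, and a list in memory stands in for a real data source.

```python
import random
from typing import Callable

Row = dict  # one record, e.g. {"x": 3}


def filter_node(rows: list, predicate: Callable[[Row], bool]) -> list:
    """Transformation node: keep only the rows that satisfy the predicate."""
    return [r for r in rows if predicate(r)]


def enrich_node(rows: list, label_fn: Callable[[Row], int]) -> list:
    """Enrichment (labeling) node: add a 'label' column to every row."""
    return [{**r, "label": label_fn(r)} for r in rows]


def split_node(rows: list, train_share: float, seed: int = 0) -> tuple:
    """Split node: shuffle reproducibly, then cut into train and test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_share)
    return shuffled[:cut], shuffled[cut:]


# Source node stand-in: ten rows held in memory.
source = [{"x": i} for i in range(10)]
filtered = filter_node(source, lambda r: r["x"] >= 2)
labeled = enrich_node(filtered, lambda r: r["x"] % 2)
train, test = split_node(labeled, train_share=0.75)
```

Each function takes rows in and hands rows out, which is what makes it possible to wire the steps together visually instead of in code.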

The platform also contains ready-made modules that can be dragged and dropped onto the design area. All actions are performed through a visual interface. In fact, you can solve a problem without a single line of code.

If the built-in capabilities are not enough, the system lets you quickly create your own modules. For those who build new modules from scratch, we made an integrated development mode based on Jupyter Kernel Gateway.

The Sber.DS architecture is built on microservices. There are many opinions about what microservices are. Some people think it is enough to split monolithic code into parts, even though those parts still go to the same database. Our rule is that a microservice talks to another microservice only through its REST API. There are no workarounds for accessing the database directly.
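The REST-only rule can be illustrated with a small self-contained sketch. The platform's core is written in Java with Spring; the Python below uses only the standard library to show the principle. The endpoint path `/api/v1/models/<id>`, the `_MODELS` store, and all function names are hypothetical. The point is that the caller never touches the other service's storage, only its HTTP interface.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

# Private storage of the (hypothetical) registry service.
# No other service reads this directly.
_MODELS = {"42": {"id": "42", "name": "churn", "stage": "validation"}}
_PREFIX = "/api/v1/models/"


class RegistryHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        model = None
        if self.path.startswith(_PREFIX):
            model = _MODELS.get(self.path[len(_PREFIX):])
        body = json.dumps(model if model else {"error": "not found"}).encode()
        self.send_response(200 if model else 404)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def start_registry() -> HTTPServer:
    """Start the registry service on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), RegistryHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_model(host: str, port: int, model_id: str) -> tuple:
    """Client side: another service asks the registry over HTTP only."""
    conn = HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", f"{_PREFIX}{model_id}")
        resp = conn.getresponse()
        return resp.status, json.loads(resp.read())
    finally:
        conn.close()
```

Because the client depends only on the endpoint, the registry team is free to change how `_MODELS` is stored without breaking anyone.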

We try to keep services from becoming too large and unwieldy: a single instance should consume no more than 4-8 gigabytes of RAM, and request handling should scale horizontally by launching new instances. Each service communicates with the others only through a REST API (OpenAPI). The team responsible for a service is required to keep its API backward compatible until the last client that uses it is gone.
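One way to reason about backward compatibility is to compare the response schema of a new API version with the old one: adding fields is safe for existing clients, while removing a field or changing its type is not. The helper below is a simplified sketch under that assumption. It works on flat `{field: type}` dictionaries rather than real OpenAPI documents, and `breaking_changes` is a name made up for this example.

```python
def breaking_changes(old_schema: dict, new_schema: dict) -> list:
    """List the changes that would break a client written against old_schema.

    Both schemas are flat {field_name: type_name} mappings.
    Fields that exist only in new_schema are additions and are ignored.
    """
    problems = []
    for name, type_name in old_schema.items():
        if name not in new_schema:
            problems.append(f"removed field: {name}")
        elif new_schema[name] != type_name:
            problems.append(
                f"changed type of {name}: {type_name} -> {new_schema[name]}"
            )
    return problems


v1 = {"id": "string", "score": "number"}
v2 = {"id": "string", "score": "number", "segment": "string"}  # only adds a field
v3 = {"id": "integer"}  # changes one field and drops another
```

Here `breaking_changes(v1, v2)` is empty, while `breaking_changes(v1, v3)` reports both the changed type of `id` and the removed `score` field. A check like this could run in a build pipeline before a new service version is released.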

The core of the application is written in Java using the Spring Framework. The solution was designed from the start for rapid deployment in cloud infrastructure, so the application is built on the Red Hat OpenShift (Kubernetes) container platform. The platform is constantly evolving, both in business functionality (new connectors and AutoML are being added) and in technical efficiency.

One of our platform's distinguishing features is that code developed in the visual interface can run on any of Sberbank's model execution systems. There are currently two: one on Hadoop, the other on OpenShift (Docker). We are not stopping there and are building integration modules to run code on any infrastructure, both on-premises and in the cloud. To integrate effectively into the Sberbank ecosystem, we also plan to support existing execution environments. In the future, the solution could be integrated flexibly, out of the box, into the landscape of any organization.

Anyone who has tried to support a solution that runs Python on Hadoop in production knows that preparing a custom Python environment and delivering it to every data node is not enough. The huge number of C/C++ machine learning libraries that Python modules rely on will not let you relax. You have to remember to update packages whenever new libraries or servers are added, while keeping backward compatibility with model code that is already deployed.

There are several ways to handle this. For example, you can prepare a set of frequently used libraries in advance and roll them out to production. In Cloudera's Hadoop distribution, parcels are typically used for this. Hadoop can now also run Docker containers. In some simple cases, the code can be delivered together with its dependencies as Python eggs.
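The "simple case" of shipping code as an archive relies on a standard Python feature: the interpreter can import pure-Python packages straight from a ZIP file placed on `sys.path` (eggs are ZIP archives with extra metadata). The sketch below demonstrates the mechanism with a plain `.zip`; the package name `scoring_demo` and both helper functions are hypothetical.

```python
import importlib
import sys
import zipfile
from pathlib import Path


def build_archive(sources: dict, archive_path: Path) -> Path:
    """Write {path_inside_archive: source_code} into a ZIP file."""
    with zipfile.ZipFile(archive_path, "w") as zf:
        for name, code in sources.items():
            zf.writestr(name, code)
    return archive_path


def import_from_archive(archive_path: Path, module_name: str):
    """Put the archive on sys.path and import a module from inside it."""
    sys.path.insert(0, str(archive_path))
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


# Hypothetical model code that would be shipped to the cluster.
MODEL_SOURCES = {
    "scoring_demo/__init__.py": "def score(x):\n    return 2 * x + 1\n",
}
```

This only works for pure-Python code: compiled extension modules cannot be imported from a ZIP archive, which is exactly why the C/C++ dependencies mentioned above need parcels, containers, or a prepared environment instead.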

The bank takes the security of running third-party code very seriously, so we use newer Linux kernel features: a process running in an isolated Linux namespace environment can have its access to, for example, the network and the local disk restricted, which greatly limits what malicious code can do. Each department's data areas are protected and accessible only to the owners of that data. The platform guarantees that data from one area can reach another only through a data publication process, with controls at every stage from access to the sources to landing the data in the target data mart.
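As a rough illustration of namespace isolation, the util-linux `unshare` tool can start a command in fresh namespaces. The helper below only builds the command line and does not run anything. It is a sketch, not how Sber.DS launches models: the function name and the choice of flags are assumptions, and whether unprivileged user namespaces are allowed depends on the host configuration.

```python
def isolated_command(command: list, *, network: bool = False) -> list:
    """Wrap a command so that it starts in new Linux namespaces via `unshare`.

    --user --map-root-user : new user namespace, current user mapped to root inside it
    --mount                : private mount table for the process
    --pid --fork           : new PID namespace; the command runs as a forked child
    --net                  : new network namespace (added unless network=True)
    """
    wrapper = ["unshare", "--user", "--map-root-user", "--mount", "--pid", "--fork"]
    if not network:
        # A fresh network namespace starts with only a loopback interface,
        # and that interface is down, so the process has no way out.
        wrapper.append("--net")
    return wrapper + list(command)


# Hypothetical scoring script; the result could be passed to subprocess.run().
cmd = isolated_command(["python3", "score.py"])
```

The mount namespace by itself does not hide any files. It gives the process a private mount table, which is the starting point for restricting what part of the local disk it can see.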

This year we plan to complete an MVP for running models written in Python/R/Java on Hadoop. We have set ourselves the ambitious goal of learning to run any custom environment on Hadoop, so that we do not restrict our platform's users in any way.

In addition, as it turned out, many DS specialists are excellent at mathematics and statistics and build great models, but are less comfortable with big data transformations and need help from our data engineers to prepare training samples. We decided to help our colleagues by creating convenient modules for typical transformations and feature preparation for models on the Spark engine. This will let them spend more time developing models instead of waiting for data engineers to prepare a new dataset.

Our team includes people with expertise in different areas: Linux and DevOps, Hadoop and Spark, Java and Spring, Scala and Akka, OpenShift and Kubernetes. Next time we will talk about the model library, how a model moves through its life cycle inside the company, and how validation and deployment work.

Source: www.habr.com
