The Yandex Residency Program, or How an Experienced Backend Developer Can Become an ML Engineer

Yandex is opening a machine learning residency program for experienced backend developers. If you have written a lot of C++/Python and want to apply that knowledge to ML, we will teach you how to do practical research and assign you experienced mentors. You will work on key Yandex services and gain skills in areas such as linear models and gradient boosting, recommender systems, and neural networks for analyzing images, text, and sound. You will also learn how to evaluate your models properly using offline and online metrics.

The program lasts one year, during which participants will work in Yandex's Machine Intelligence and Research department and attend lectures and seminars. Participation is paid and involves full-time work: 40 hours a week, starting July 1 of this year. Applications are already open and will be accepted until May 1.

And now in more detail: what kind of audience we are expecting, what the work process will look like, and, in general, how a backend specialist can switch to a career in ML.

Focus

Many companies have residency programs, including, for example, Google and Facebook. These are aimed mainly at junior and mid-level specialists who are trying to take a step toward ML research. Our program is for a different audience. We invite backend developers who have already gained enough experience and know for certain that they need to shift their skills toward ML, so that they can acquire practical skills, not a scientist's skills, in solving industrial machine learning problems. This does not mean that we do not support young researchers. We have organized a separate program for them, the Ilya Segalovich Award, which also lets you work at Yandex.

Where will the resident work?

In the Machine Intelligence and Research department, we develop project ideas ourselves. The main source of inspiration is scientific literature, articles, and trends in the research community. My colleagues and I analyze what we read and look at how we can improve or extend the methods proposed by scientists. At the same time, each of us takes into account his or her own area of knowledge and interests and formulates the task based on the directions he or she considers important. A project idea is usually born at the intersection of external research results and one's own expertise.

This system is good because it largely solves the technological problems of Yandex services even before they arise. When a service runs into a problem, its representatives come to us, most likely to take technology we have already prepared, which then only needs to be applied correctly in the product. If something is not ready yet, we will at least quickly recall where we can "start digging" and which articles to search for a solution. As we know, the scientific approach is to stand on the shoulders of giants.

What you will be doing

At Yandex, and in our department in particular, all the relevant areas of ML are being developed. Our goal is to improve the quality of a wide variety of products, and this serves as an incentive to try everything new. In addition, new services appear regularly. So the lecture program covers all the key (well-proven) areas of machine learning in industrial development. When putting together my part of the course, I drew on my experience teaching at the School of Data Analysis (SHAD), as well as on the materials and work of other SHAD teachers. I know that my colleagues did the same.

In the first months, training under the course program will take up about 30% of your working time, and after that about 10%. However, it is important to understand that working with the ML models themselves will continue to take roughly a quarter as much time as all the accompanying processes. These include preparing the backend, obtaining data, writing a pipeline to preprocess it, optimizing code, adapting to specific hardware, and so on. An ML engineer is, if you like, a full-stack developer (only with a stronger emphasis on machine learning) who can solve a problem from start to finish. Even with a ready-made model, you will probably need to do a number of other things: parallelize its execution across several machines and prepare its deployment as a service handle (endpoint), a library, or a component of the service itself.

The student's choice

If you have the impression that it is better to become an ML engineer by first working as a backend developer, that is not true. Enrolling in that same SHAD without real experience in developing services, completing your studies, and becoming highly sought after on the market is an excellent option. Many Yandex specialists ended up in their current positions exactly this way. If a company is ready to offer you a job in ML right after graduation, you should probably accept that offer too. Try to get onto a good team with an experienced mentor, and be prepared to learn a lot.

What usually keeps you from doing ML?

If a backend developer wants to become an ML engineer, there are two development paths to choose from, leaving the residency program aside.

First, you can study as part of some educational course. Coursera lessons will bring you closer to understanding the basic techniques, but to immerse yourself in the profession to a sufficient degree, you need to devote much more time to it. For example, you could graduate from SHAD. Over the years, SHAD has offered a varying number of courses specifically on machine learning: about eight on average. Each of them is genuinely important and useful, in the opinion of graduates as well.

Second, you can take part in real production projects where you need to implement one ML algorithm or another. However, there are very few such projects on the IT development market: machine learning is not used in most tasks. Even in banks that are actively exploring ML-related opportunities, only a handful of people are engaged in data analysis. If you have not managed to join one of these teams, your only options are either to start your own project (where you will most likely set your own deadlines, which has little in common with real production tasks) or to start competing on Kaggle.

Indeed, teaming up with other members of the community and trying your hand at competitions is relatively easy, especially if you back up your skills with training and the Coursera courses mentioned above. Every competition has a deadline, which will serve as an incentive for you and prepare you for a similar system in IT companies. This is a good path, although it too is somewhat detached from real processes. On Kaggle you are given preprocessed data, even if it is not always perfect; you are not asked to think about the contribution to the product; and, most importantly, you are not required to deliver solutions suitable for production. Your algorithms will probably work and be highly accurate, but your models and code will resemble Frankenstein's monster stitched together from different parts. In a production project, the whole structure will run too slowly and will be difficult to update and extend (for example, language and speech algorithms are always partially rewritten as the language evolves). Companies care that the work listed above can be done not only by you (clearly you, as the author of the solution, can do it) but also by any of your colleagues. A lot has been said about the difference between competitive and industrial programming, and Kaggle trains precisely "athletes", even if it does so very well and lets them gain part of the experience they need.

I have described two possible lines of development: training through educational programs and training "in combat", for example on Kaggle. The residency program is a combination of these two approaches. SHAD-level lectures and seminars, as well as real production projects, await you.

Source: www.habr.com
