Yandex Residency Program, or How an Experienced Backend Developer Can Become an ML Engineer

Yandex is opening a machine learning residency program for experienced backend developers. If you have written a lot of C++/Python and want to apply that knowledge to ML, we will teach you how to do applied research and pair you with experienced mentors. You will work on key Yandex services and gain skills in areas such as linear models and gradient boosting, recommender systems, and neural networks for image, text, and audio analysis. You will also learn how to evaluate your models properly using offline and online metrics.

The program lasts one year, during which participants will work in Yandex's Machine Intelligence and Research department and attend lectures and seminars. Participation is paid and full-time: 40 hours a week, starting July 1 of this year. Applications are already open and will be accepted until May 1.

And now in more detail: what kind of audience we expect, what the work process will look like, and, more generally, how a backend specialist can switch to a career in ML.

Focus

Many companies have residency programs, including, for example, Google and Facebook. They are mostly aimed at junior and mid-level specialists who are trying to move into ML research. Our program is for a different audience. We invite backend developers who have already gained enough experience and know for certain that they need to shift their skill set toward ML and acquire the practical skills of an engineer, rather than those of a scientist, in solving industrial machine learning problems. This does not mean that we don't support young researchers. For them we have a separate program, the Ilya Segalovich award, which also lets you work at Yandex.

Where the resident will work

We in the Machine Intelligence and Research department develop project ideas ourselves. The main sources of inspiration are scientific literature, articles, and trends in the research community. My colleagues and I analyze what we read and look at how we could improve or extend the methods proposed by scientists. At the same time, each of us takes into account their own area of expertise and interests, and formulates a task based on the areas they consider important. A project idea is usually born at the intersection of external research results and personal expertise.

This system is good because it largely solves the technological problems of Yandex services before they even arise. When a service runs into a problem, its representatives come to us and will most likely be able to take technology we have already prepared, which only needs to be applied correctly in the product. If something is not ready, we will at least quickly recall where to "start digging" and which papers to search for a solution. As you know, the scientific approach is to stand on the shoulders of giants.

What you will do

At Yandex, and in our department specifically, all the relevant areas of ML are being developed. Our job is to improve the quality of a wide variety of products, and this serves as an incentive to try everything new. In addition, new services appear regularly. So the curriculum covers all the key (well-established) areas of machine learning in industrial development. When preparing my part of the course, I drew on my teaching experience at the School of Data Analysis (SHAD), as well as on the materials and work of other SHAD teachers. I know my colleagues did the same.

In the first months, course training will take up about 30% of your working time, and after that about 10%. However, it is important to understand that working with the ML models themselves will still take roughly four times less time than all the related processes. These include preparing the backend, collecting data, writing a preprocessing pipeline, optimizing code, adapting it to specific hardware, and so on. An ML engineer is, if you like, a full-stack developer (only with a stronger emphasis on machine learning) who can solve a problem from start to finish. Even with a finished model, you will probably need to do a number of additional things: parallelize its execution across several machines, and prepare an implementation in the form of an API handle, a library, or a component of the service itself.

The student's choice

If you were under the impression that the better way to become an ML engineer is to work as a backend developer first, that is not so. Enrolling in that same SHAD without real experience in developing services, studying, and becoming highly sought after on the market is an excellent option. Many Yandex specialists ended up in their current positions this way. If a company is ready to offer you a job in ML right after graduation, you should probably accept the offer. Try to get into a good team with an experienced mentor and be prepared to learn a lot.

What usually prevents you from doing ML

If a backend developer aspires to become an ML engineer, they can, setting the residency program aside, choose between two paths of development.

First, you can study within some educational course. Coursera courses will bring you closer to understanding the basic techniques, but to immerse yourself in the profession to a sufficient degree, you need to devote much more time to it. For example, you could graduate from SHAD. Over the years, SHAD has had a varying number of courses directly on machine learning: on average, about eight. Each of them is genuinely important and useful, including in the opinion of graduates.

Second, you can take part in real production projects where you need to implement one ML algorithm or another. However, there are very few such projects on the IT development market: most tasks do not use machine learning. Even in banks that are actively exploring ML-related opportunities, only a few people are involved in data analysis. If you have not managed to join one of these teams, the only thing left is either to start your own project (where you will most likely set your own deadlines, which has little to do with production tasks) or to start competing on Kaggle.

Indeed, teaming up with other community members and trying yourself in competitions is relatively easy, especially if you back up your skills with training and the Coursera courses mentioned above. Every competition has a deadline, which will serve as an incentive for you and prepare you for a similar system in IT companies. This is a good path, but it is also somewhat divorced from real processes. On Kaggle you are given preprocessed data, though not always perfect; you are not asked to think about the contribution to the product; and most importantly, you are not required to deliver solutions suitable for production. Your algorithms will probably work and have high accuracy, but your models and code will look like Frankenstein's monster stitched together from different parts: in a production project, the whole structure will run too slowly and will be hard to update and extend (for example, language and speech algorithms are always partly rewritten as the language evolves). Companies care that the work can be done not only by you (clearly you, as the author of the solution, can do it), but also by any of your colleagues. A lot has been said about the difference between competitive and industrial programming, and Kaggle trains precisely "athletes", even if it does so very well and lets you gain some experience.

I have described two possible lines of development: training through educational programs and training "in combat", for example on Kaggle. The residency program is a combination of these two approaches. You can expect lectures and seminars at the SHAD level, as well as genuine production projects.

Source: www.habr.com
