Classification of handwritten drawings. A talk at Yandex

A few months ago, our colleagues at Google held a Kaggle competition to build a classifier for images obtained in the sensational game "Quick, Draw!". The team that included Yandex developer Roman Vlasov took fourth place in the competition. At the January machine learning training session, Roman shared his team's ideas, the final implementation of the classifier, and interesting practices of his opponents.


- Hello everyone! My name is Roma Vlasov, and today I will tell you about the Quick, Draw! Doodle Recognition Challenge.

There were five people on our team. I joined right before the team merge deadline. We were unlucky: the shake-up moved us a little, but it moved us out of the money zone, while another team was moved out of the gold zone. And we took an honorable fourth place.

(During the competition, teams tracked themselves in a ranking built from results on one part of the provided dataset. The final ranking, in turn, was built on another part of the dataset. This is done so that participants do not fit their algorithms to specific data. So at the end, when switching between the two rankings, positions shake up a little (from the English "shake up", to mix): on other data the result may turn out different. Roman's team was initially in the top three. In this case the top three is the money zone, since only the first three places were awarded a cash prize. After the shake-up, the team was in fourth place. In the same way, another team lost its victory, the gold position. - Ed.)

The competition was also significant because Evgeniy Babakhnin earned the grandmaster title, Ivan Sosin earned master, Roman Soloviev remained a grandmaster, Alex Parinov earned master, I became an expert, and by now I am already a master.

What is Quick, Draw? It is a service from Google. Google wanted to popularize AI and to use this service to show how neural networks work. You go there, click Let's draw, and a new page appears telling you: draw a zigzag, you have 20 seconds to do it. You try to draw a zigzag in 20 seconds, like here, for example. If you succeed, the network says it is a zigzag and you move on. There are only six such pictures.

If Google's network failed to recognize what you drew, the task was marked with a cross. Later I will explain what it means for us whether a drawing was recognized by the network or not.

The service attracted a fairly large number of users, and every picture the users drew was logged.

Almost 50 million images were collected. The train and test sets for our competition were formed from them. By the way, the amount of data in test and the number of classes are highlighted in bold for a reason. I will come back to them later.

The data format was as follows. These are not just RGB images but, roughly speaking, a log of everything the user did. Word is our target, countrycode is where the author of the doodle is from, timestamp is the time. The recognized label shows whether Google's network recognized the image or not. And the drawing itself is a sequence, an approximation of the curve the user draws, given as points. Plus timings: the time elapsed since the user started drawing the picture.

The data was provided in two formats. This is the first format, and the second one is simplified. There the timings were cut out and the set of points was approximated with a smaller set of points. For this they used the Douglas-Peucker algorithm. You have a large set of points that simply approximates some straight line, but in fact you can approximate that line with just two points. That is the idea of the algorithm.
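To make the idea of the simplification concrete, here is a minimal sketch of the Douglas-Peucker algorithm in plain Python. The function names and the `epsilon` tolerance are illustrative assumptions; the talk does not say which tolerance was used to produce the simplified dataset.

```python
import math


def perpendicular_distance(point, start, end):
    """Distance from `point` to the infinite line through `start` and `end`."""
    (x, y), (x1, y1), (x2, y2) = point, start, end
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        # Degenerate segment: fall back to point-to-point distance.
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / math.hypot(dx, dy)


def douglas_peucker(points, epsilon):
    """Return a subset of `points` that stays within `epsilon` of the original curve."""
    if len(points) < 3:
        return list(points)
    # Find the interior point farthest from the chord between the endpoints.
    distances = [perpendicular_distance(p, points[0], points[-1]) for p in points[1:-1]]
    max_dist = max(distances)
    index = distances.index(max_dist) + 1
    if max_dist <= epsilon:
        # Everything is close to the chord: two points are enough.
        return [points[0], points[-1]]
    # Otherwise keep the farthest point and simplify both halves recursively.
    left = douglas_peucker(points[: index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right
```

A nearly straight run of points collapses to its two endpoints, while a sharp corner survives because it lies farther than `epsilon` from the chord.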

The data was distributed as follows. Everything is uniform, but there are some outliers. When we were solving the problem, we did not look at this. The main thing is that there were no classes that were really small, so we did not have to use weighted samplers or data oversampling.

What did the images look like? This is the "airplane" class and examples from it with the labels recognized and unrecognized. Their ratio was somewhere around 1 to 9. As you can see, the data is quite noisy. I would guess this is an airplane. If you look at the unrecognized ones, in most cases it is just noise. Someone even tried to write "airplane", but apparently in French.

Most participants simply took networks, rendered the data from this sequence of lines as RGB images, and fed them to the network. I rendered in roughly the same way: I took a color palette, drew the first line with one color, which was at the start of the palette, the last line with another, which was at the end of the palette, and interpolated between them across the palette for every line in between. By the way, this gave a better result than drawing as on the very first slide, just in black.
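The palette-based rendering can be sketched as follows. This is only an illustration under stated assumptions: strokes are given as `[xs, ys]` lists of integer pixel coordinates, the "palette" is a simple linear blend between two hypothetical colors (`first` and `last`), and the image is a nested list rather than a real image object. The speaker's actual palette and drawing library are not given in the talk.

```python
def lerp_color(c0, c1, t):
    """Linearly interpolate between two RGB tuples, t in [0, 1]."""
    return tuple(round(a + (b - a) * t) for a, b in zip(c0, c1))


def draw_segment(img, p0, p1, color):
    """Draw a straight segment by sampling points between p0 and p1."""
    (x0, y0), (x1, y1) = p0, p1
    steps = max(abs(x1 - x0), abs(y1 - y0), 1)
    for i in range(steps + 1):
        x = round(x0 + (x1 - x0) * i / steps)
        y = round(y0 + (y1 - y0) * i / steps)
        img[y][x] = color


def render_strokes(strokes, size=256, first=(255, 0, 0), last=(0, 0, 255)):
    """Render strokes on a black canvas, coloring them by their order in the drawing."""
    img = [[(0, 0, 0)] * size for _ in range(size)]
    n = len(strokes)
    for k, (xs, ys) in enumerate(strokes):
        # Position of this stroke in the palette: 0 for the first, 1 for the last.
        t = k / (n - 1) if n > 1 else 0.0
        color = lerp_color(first, last, t)
        pts = list(zip(xs, ys))
        # A single-point stroke is drawn as a zero-length segment.
        for p0, p1 in zip(pts, pts[1:] or pts):
            draw_segment(img, p0, p1, color)
    return img
```

With this encoding the stroke order, which a plain black drawing discards, becomes visible to the network through color.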

Other team members, such as Ivan Sosin, tried slightly different approaches to drawing. In one channel he simply drew a gray picture, in another channel he drew each stroke with a gradient from its start to its end, from 32 to 255, and in the third channel he drew a gradient over all the strokes from 32 to 255.

Another interesting thing is that Alex Parinov fed the countrycode information into the network.

The metric used in the competition is Mean Average Precision. What is the essence of this metric for the competition? You can give three predictions, and if none of the three is correct, you get 0. If one of them is correct, its position is taken into account, and the score for the sample is computed as 1 divided by the position of your correct prediction. For example, you made three predictions and the correct one is the first, so you divide 1 by 1 and get 1. If the correct prediction is in position 2, you divide 1 by 2 and get 0.5. And so on.
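The scoring rule described here fits in a few lines. The sketch below assumes each sample has exactly one true label and up to three ranked predictions; the function names are illustrative.

```python
def average_precision_at_3(true_label, predictions):
    """Return 1/position of the correct label among the first three predictions, else 0."""
    for position, label in enumerate(predictions[:3], start=1):
        if label == true_label:
            return 1.0 / position
    return 0.0


def map_at_3(true_labels, all_predictions):
    """Mean of the per-sample scores over the whole set."""
    scores = [
        average_precision_at_3(truth, preds)
        for truth, preds in zip(true_labels, all_predictions)
    ]
    return sum(scores) / len(scores)
```

For three samples where the correct answer is first, second, and missing, the per-sample scores are 1, 0.5, and 0, so the mean is 0.5.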

So much for data preprocessing: how to draw the pictures and so on. What architectures did we use? We tried heavy architectures such as PNASNet and SENet, and by now classic architectures such as SE-ResNeXt, which show up in new competitions more and more often. There were also ResNet and DenseNet.

How did we train this? All the models we took were pre-trained on ImageNet. Even though there is a lot of data, 50 million images, a network pre-trained on ImageNet still showed better results than one trained from scratch.

What training techniques did we use? First, Cosine Annealing with Warm Restarts, which I will talk about a little later. It is a technique I use in almost all my recent competitions, and with it networks train quite well and reach a good minimum.

Next, Reduce Learning Rate on Plateau. You start training the network, set a certain learning rate, keep training, and your loss gradually converges to a certain value. You check this: say, for ten epochs the loss has not changed at all. You reduce your learning rate by some factor and continue training. The loss drops a little again, settles at some minimum, and you lower the learning rate again, and so on, until your network finally converges.
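A minimal sketch of that plateau rule, written without any framework so the logic is visible. The class name and the parameters `factor`, `patience`, and `min_delta` are assumptions for illustration; the talk only mentions "ten epochs" as an example and does not give the reduction factor.

```python
class PlateauScheduler:
    """Lower the learning rate when the loss has stopped improving."""

    def __init__(self, lr, factor=0.1, patience=10, min_delta=1e-4):
        self.lr = lr
        self.factor = factor        # multiplier applied on each reduction
        self.patience = patience    # epochs without improvement to tolerate
        self.min_delta = min_delta  # smallest change that counts as improvement
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, loss):
        """Call once per epoch with the current loss; returns the learning rate to use next."""
        if loss < self.best - self.min_delta:
            self.best = loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs >= self.patience:
                self.lr *= self.factor
                self.bad_epochs = 0
        return self.lr
```

In practice one would normally use the scheduler shipped with the framework, for example `torch.optim.lr_scheduler.ReduceLROnPlateau` in PyTorch, rather than a hand-written class like this one.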

Next is an interesting technique: Don't Decay the Learning Rate, Increase the Batch Size. There is a paper with the same name. When you train a network, you do not have to reduce the learning rate; you can simply increase the batch size instead.

This technique, by the way, was used by Alex Parinov. He started with a batch size of 408, and when his network reached a plateau, he simply doubled the batch size, and so on.

Actually, I do not remember what value his batch size reached, but interestingly there were teams on Kaggle that used the same technique, and their batch size was about 10,000. By the way, modern deep learning frameworks, such as PyTorch, let you do this very easily. You generate your batch and feed it to the network not as a whole, but split into chunks so that it fits on your video card, compute the gradients, and once you have computed the gradient for the whole batch, update the weights.
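The chunking trick works because the gradient of a summed loss is the sum of the per-chunk gradients. The sketch below shows this on a toy one-parameter model `y = w * x` with squared error, in plain Python; the model, function names, and numbers are all illustrative assumptions and not the team's training code.

```python
def squared_error_grad_sum(w, xs, ys):
    """Gradient of sum((w*x - y)^2) with respect to w over the given samples."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys))


def accumulated_step(w, xs, ys, lr, chunk_size):
    """One weight update for the whole batch, computed chunk by chunk."""
    grad_sum = 0.0
    for start in range(0, len(xs), chunk_size):
        # Only one chunk needs to be "in memory" at a time.
        grad_sum += squared_error_grad_sum(
            w, xs[start:start + chunk_size], ys[start:start + chunk_size]
        )
    # Average over the full batch, then update the weight once.
    grad = grad_sum / len(xs)
    return w - lr * grad
```

The result is the same whether the batch is processed in one piece or in chunks of any size. In PyTorch the same effect is usually obtained by calling `backward()` on each chunk's loss, which accumulates gradients, and calling the optimizer's `step()` only after the last chunk.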

By the way, large batch sizes were useful in this competition anyway, because the data was quite noisy, and a large batch size helped to approximate the gradient more accurately.

Pseudo-labeling was also used, mostly by Roman Soloviev. He sampled about half of the data in a batch from test, and trained the network on such batches.

The size of the images mattered, but the fact is that you have a lot of data and need to train for a long time, and if your images are large, training takes a very long time. This did not add much to the quality of the final classifier, so it was worth finding some kind of trade-off. And we only tried images that were not very large.

How was it all trained? First, small images were taken and several epochs were run on them, which took quite a while. Then larger images were fed in and the network was trained further, then larger still, and larger again, so as not to train from scratch and not waste a lot of time.

About optimizers. We used SGD and Adam. In this way it was possible to get a single model that scored 0.941-0.946 on the public leaderboard, which is quite good.

If you ensemble the models in some way, you get somewhere around 0.951. If you apply one more technique, you get a final score of 0.954 on the public leaderboard, as we did. But more on that later. Next I will tell you how we assembled the models and how we managed to reach that final score.

Next I would like to talk about Cosine Annealing with Warm Restarts, also known as Stochastic Gradient Descent with Warm Restarts. Roughly speaking, you can use any optimizer in principle, but the point is this: if you just train one network and it gradually converges to some minimum, everything is fine, you get one network and it makes certain mistakes. But you can train it a little differently. You set some initial learning rate and gradually lower it according to this formula. You lower it, your network reaches some minimum, then you save the weights and set the learning rate back to the value it had at the start of training, thereby climbing somewhere up out of that minimum, and then lower the learning rate again.
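The formula referred to is on a slide that is not reproduced in the text. The sketch below assumes it is the standard cosine schedule from the SGDR paper, with a fixed cycle length; `cycle_len`, `lr_max`, and `lr_min` are illustrative parameter names.

```python
import math


def cosine_restart_lr(step, cycle_len, lr_max, lr_min=0.0):
    """Learning rate at `step` for cosine annealing with restarts every `cycle_len` steps."""
    t_cur = step % cycle_len  # position inside the current cycle
    cosine = math.cos(math.pi * t_cur / cycle_len)
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + cosine)
```

Within a cycle the rate falls smoothly from `lr_max` toward `lr_min`; at the start of the next cycle it jumps back to `lr_max`, which is the "warm restart". PyTorch provides a ready-made version as `torch.optim.lr_scheduler.CosineAnnealingWarmRestarts`.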

This way you can visit several minima in a single run, in each of which your loss will be more or less the same. But the fact is that networks with these different weights will make different errors on your data. By averaging them, you get a kind of approximation, and your score will be higher.
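Averaging the snapshots saved at each minimum can be as simple as taking the mean of their predicted class probabilities. The sketch below assumes equal weights and a nested-list layout `snapshot_probs[s][i][c]` (snapshot, sample, class); both are assumptions made for illustration.

```python
def average_snapshots(snapshot_probs):
    """Average class probabilities over snapshots, sample by sample."""
    n = len(snapshot_probs)
    return [
        [sum(values) / n for values in zip(*rows)]  # per-class mean for one sample
        for rows in zip(*snapshot_probs)            # same sample across snapshots
    ]
```

If one snapshot gives a sample probabilities `[0.8, 0.2]` and another gives `[0.4, 0.6]`, the averaged prediction is about `[0.6, 0.4]`.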

About how we assembled our models. At the beginning of the talk I asked you to pay attention to the amount of data in test and the number of classes. If you add 1 to the number of targets in the test set and divide by the number of classes, you get the number 330, and this was written about on the forum: the classes in test are balanced. This could be used.

Based on this, Roman Soloviev came up with a metric, which we called Proxy Score, that correlated quite well with the leaderboard. The idea is: you make a prediction, take the top-1 of your predictions, and count the number of objects for each class. Then you subtract 330 from each count and sum the resulting absolute values.
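The Proxy Score as described can be sketched in a few lines. The function and parameter names are illustrative; the assumption that a lower value is better follows from the definition, since a perfectly balanced set of top-1 predictions gives 0.

```python
from collections import Counter


def proxy_score(top1_predictions, class_names, expected_per_class=330):
    """Sum over classes of |number of top-1 predictions - expected count|."""
    counts = Counter(top1_predictions)
    # Classes that were never predicted contribute the full expected count.
    return sum(abs(counts[name] - expected_per_class) for name in class_names)
```

Because the score needs only the predictions and the known class balance, it can be computed on the test set without any labels, which is what makes it usable for local validation.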

These are the values we obtained. This let us avoid probing the leaderboard and instead validate locally and select the coefficients for our ensembles.

With an ensemble you could get a score like this. What else could be done? Suppose you use the information that the classes in your test set are balanced.

The balancing approaches differed. One example is the balancing used by the guys who took first place.

What did we do? Our balancing was quite simple; it was suggested by Evgeniy Babakhnin. We first sorted our predictions by top-1 and selected candidates from them so that the number of objects per class did not exceed 330. But for some classes you end up with fewer than 330 predictions. Fine, let's also sort by top-2 and top-3, and select candidates from those too.
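One possible reading of this balancing procedure is sketched below. The talk does not spell out the details, so several things here are assumptions: each sample carries up to three `(label, probability)` pairs ordered best first, candidates are taken in order of decreasing probability, and any sample left unassigned after the three passes simply keeps its original top-1 label.

```python
def balance_predictions(top3, capacity=330):
    """Assign one label per sample so that no class receives more than `capacity` samples."""
    assigned = {}  # sample index -> chosen label
    counts = {}    # label -> number of samples assigned so far
    for rank in range(3):  # pass 0 uses top-1, pass 1 uses top-2, pass 2 uses top-3
        pending = [
            i for i in range(len(top3))
            if i not in assigned and len(top3[i]) > rank
        ]
        # Most confident candidates get the free slots first.
        pending.sort(key=lambda i: top3[i][rank][1], reverse=True)
        for i in pending:
            label = top3[i][rank][0]
            if counts.get(label, 0) < capacity:
                assigned[i] = label
                counts[label] = counts.get(label, 0) + 1
    # Assumed fallback: samples that never found a free slot keep their top-1 label.
    return [assigned.get(i, top3[i][0][0]) for i in range(len(top3))]
```

With `capacity=1` and two samples that both prefer class "a", the more confident one keeps "a" and the other is moved to its second choice.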

How did our balancing differ from the first place's balancing? They used an iterative approach, taking the most popular class and decreasing the probabilities for that class by some small number until that class was no longer the most popular. Then they took the next most popular class. They kept lowering the probabilities in this way until the counts of all classes became equal.
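The iterative scheme can be sketched roughly as follows. This is a reconstruction from the verbal description only, not the winning team's code: the subtractive update, the `step` size, the `tolerance` stopping rule, and the iteration cap are all assumptions.

```python
def iterative_balance(probs, step=0.01, tolerance=0, max_iters=10000):
    """Repeatedly penalize the most frequently predicted class until counts even out."""
    probs = [row[:] for row in probs]  # work on a copy
    num_classes = len(probs[0])
    for _ in range(max_iters):
        # Count how many samples currently pick each class as top-1.
        counts = [0] * num_classes
        for row in probs:
            counts[row.index(max(row))] += 1
        if max(counts) - min(counts) <= tolerance:
            break
        # Lower the scores of the currently most popular class a little.
        popular = counts.index(max(counts))
        for row in probs:
            row[popular] -= step
    return [row.index(max(row)) for row in probs]
```

In the small example of four samples and two classes, where three samples initially prefer class 0, the least confident of those three is the first to flip to class 1, after which the counts are equal and the loop stops.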

Everyone used more or less the same approach to training networks, but not everyone used balancing. With balancing you could get into gold, and if you were lucky, into the money.

How do you preprocess the data? Everyone preprocessed the data in more or less the same way: making handcrafted features, trying to encode timings with different stroke colors, and so on. Alexey Nozdrin-Plotnitsky, who took 8th place, spoke about this.

He did it differently. He said that all these handcrafted features of yours do not work, you do not need to do that, your network should learn all of this on its own. Instead, he came up with learnable modules that preprocess your data. He fed them the raw data without preprocessing: point coordinates and timings.

Then he took the differences along the coordinates and averaged it all over the timings, which gave him a rather long matrix. He applied a 1D convolution to it several times to get a matrix of size 64xn, where n is the total number of points, and 64 was chosen so that the resulting matrix could be fed to a layer of any convolutional network that accepts 64 channels. Having obtained the 64xn matrix, he needed to build a tensor of some size from it so that the number of channels was 64. He normalized all X, Y points to the range from 0 to 32 to create a tensor of size 32x32. I do not know why he wanted 32x32, it just turned out that way. And at each such coordinate he placed the corresponding fragment of the 64xn matrix. So he ended up with a 32x32x64 tensor that you could feed further into your convolutional neural network. That is all I wanted to say.
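The last step, placing per-point feature vectors onto a 32x32 grid, might look like the sketch below. It covers only the scatter step and takes the per-point features as given; the min-max normalization, the rule that a later point overwrites an earlier one in the same cell, and all names are assumptions, since the talk does not describe these details.

```python
def scatter_to_grid(features, xs, ys, grid=32):
    """Place each point's feature vector into a grid x grid x channels tensor."""
    channels = len(features[0])
    tensor = [[[0.0] * channels for _ in range(grid)] for _ in range(grid)]
    x_min, y_min = min(xs), min(ys)
    x_span = (max(xs) - x_min) or 1  # avoid division by zero for degenerate drawings
    y_span = (max(ys) - y_min) or 1
    for vector, x, y in zip(features, xs, ys):
        # Normalize coordinates to grid cells, clamping the maximum to the last cell.
        col = min(int((x - x_min) / x_span * grid), grid - 1)
        row = min(int((y - y_min) / y_span * grid), grid - 1)
        tensor[row][col] = list(vector)  # assumed: later points overwrite earlier ones
    return tensor
```

With 64-dimensional feature vectors and the default `grid=32`, the result has the 32x32x64 shape mentioned in the talk; cells that no point falls into stay zero.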

Source: www.habr.com
