Hello, Habr!
We rarely post translations of two-year-old texts that contain no code and are clearly academic in tone, but today we are making an exception. We hope the dilemma raised in the article's title concerns many of our readers, and that you have either already read the foundational work on evolution strategies that this post argues with, or will read it now. Welcome under the cut!
In March 2017, OpenAI made waves in the deep learning community with the paper “
Evolution strategies
The central thesis of the OpenAI paper was that, instead of using reinforcement learning combined with traditional backpropagation, the authors successfully trained a neural network to solve complex problems using what they called an "evolution strategy" (ES). The ES approach maintains a distribution over the network's weights and runs many agents in parallel, each using parameters sampled from that distribution. Every agent acts in its own environment, and once it has completed a set number of episodes or episode steps, the algorithm receives its cumulative reward, expressed as a fitness score. Given this score, the parameter distribution can be shifted toward the more successful agents and away from the less successful ones. By repeating this operation millions of times with hundreds of agents, the weight distribution can be moved into a region that gives the agents a high-quality policy for the task at hand. The results reported in the paper are indeed impressive: with a thousand agents running in parallel, humanoid bipedal walking can be learned in under half an hour, whereas even the most advanced RL methods need more than an hour for the same task. For more detail, I recommend reading the excellent
Different strategies for upright humanoid walking, learned with OpenAI's ES method.
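The loop described above (sample parameters around the current distribution, score each agent, shift the distribution toward the better scores) can be sketched in a few lines of NumPy. This is only a minimal illustration under stated assumptions, not the OpenAI implementation: the `fitness` function is a hypothetical stand-in for an episode's cumulative reward, the distribution is an isotropic Gaussian with fixed width `sigma`, and all names and hyperparameters are chosen for the example.

```python
import numpy as np

def fitness(params, target):
    """Hypothetical stand-in for an episode's cumulative reward (higher is better)."""
    return -np.sum((params - target) ** 2)

def evolution_strategy(target, n_agents=50, sigma=0.1, lr=0.05, n_iters=300, seed=0):
    """Toy ES loop: only fitness scores are used, never gradients of `fitness`."""
    rng = np.random.default_rng(seed)
    mean = np.zeros_like(target)  # centre of the weight distribution
    for _ in range(n_iters):
        # Each row is the perturbation used by one agent.
        noise = rng.standard_normal((n_agents, target.size))
        scores = np.array([fitness(mean + sigma * eps, target) for eps in noise])
        # Subtract the mean score as a baseline, then move the distribution
        # toward perturbations that scored above average.
        advantages = scores - scores.mean()
        mean = mean + lr / (n_agents * sigma) * (noise.T @ advantages)
    return mean

target = np.array([0.5, -0.3, 1.2])  # assumed optimum of the toy task
learned = evolution_strategy(target)
print("learned:", np.round(learned, 2), "target:", target)
```

In a real setting `fitness` would run one or more episodes in an environment, which is exactly the part that can be farmed out to many workers.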
The black box
The main advantage of this method is that it parallelizes easily. Whereas RL methods such as A3C require information to be exchanged between worker threads and a parameter server, ES needs only fitness scores and general information about the parameter distribution. It is thanks to this simplicity that the method far outstrips modern RL methods in scalability. All of this comes at a price, however: the network has to be optimized as a black box. Here, "black box" means that during training the network's internal structure is ignored entirely and only the overall result (the episode reward) is used, and that result alone determines whether a given network's weights will be inherited by subsequent generations. In situations where we receive little feedback from the environment (and in many traditional RL problems the stream of rewards is very sparse), the problem goes from being a "partially black box" to a "completely black box." In that case throughput can be increased substantially, so the compromise is justified. "Who needs gradients if they are hopelessly noisy anyway?" is the general sentiment.
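To make the communication argument concrete, here is a hedged sketch of one way workers could report nothing but a scalar fitness. It assumes that every worker and the coordinator can regenerate a perturbation from a shared integer seed; that seed trick, the names `worker`, `coordinator_update` and `episode_return`, and the toy reward are all assumptions made for illustration and are not taken from the article.

```python
import numpy as np

DIM = 5       # number of network weights in this toy example
SIGMA = 0.1   # width of the parameter distribution

def episode_return(params):
    """Hypothetical stand-in for running an episode and summing its rewards."""
    return -float(np.sum((params - 1.0) ** 2))

def worker(mean, seed):
    """Evaluates one perturbed agent and reports only (seed, fitness)."""
    eps = np.random.default_rng(seed).standard_normal(DIM)
    return seed, episode_return(mean + SIGMA * eps)

def coordinator_update(mean, reports, lr=0.05):
    """Rebuilds each perturbation from its seed; no weight vectors are exchanged."""
    seeds, scores = zip(*reports)
    scores = np.array(scores)
    noise = np.stack([np.random.default_rng(s).standard_normal(DIM) for s in seeds])
    grad_estimate = noise.T @ (scores - scores.mean()) / (len(seeds) * SIGMA)
    return mean + lr * grad_estimate

mean = np.zeros(DIM)
# In a real system these calls would run on separate machines or processes.
reports = [worker(mean, seed) for seed in range(100)]
new_mean = coordinator_update(mean, reports)
print("fitness before:", episode_return(mean), "after:", episode_return(new_mean))
```

Each report here is two numbers regardless of how large the network is, which is the property the paragraph above contrasts with exchanging gradients or weights through a parameter server.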
In situations where the feedback is rich, however, things start to go wrong for ES. The OpenAI team describes training a simple MNIST classification network with ES, and in that case training was 1000 times slower. The point is that the gradient signal in image classification is extremely informative about how to teach the network to classify better. So the problem lies less with the RL technique itself and more with sparse rewards in environments that produce noisy gradients.
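A small numerical experiment gives a feel for why an available gradient is so valuable. The sketch below is not the MNIST experiment: it uses a hypothetical 100-parameter linear regression as a stand-in for a classifier, and compares the exact gradient with a black-box ES estimate built from loss evaluations only. The mirrored (antithetic) perturbations used in `es_gradient` are an assumption made to keep the estimator simple.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, dim = 200, 100
X = rng.standard_normal((n_samples, dim))  # toy inputs
y = rng.standard_normal(n_samples)         # toy regression targets
w = np.zeros(dim)                          # current weights

def loss(weights):
    """Mean squared error; accepts one weight vector or a batch of them."""
    residuals = weights @ X.T - y
    return np.mean(residuals ** 2, axis=-1)

# Exact gradient of the loss: what a single backward pass would provide.
exact_grad = 2.0 / n_samples * X.T @ (X @ w - y)

def es_gradient(n_perturbations, sigma=0.01):
    """Black-box estimate from loss values only, using mirrored perturbations."""
    eps = rng.standard_normal((n_perturbations, dim))
    diff = loss(w + sigma * eps) - loss(w - sigma * eps)
    return eps.T @ diff / (2.0 * sigma * n_perturbations)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

cos_few = cosine(es_gradient(10), exact_grad)
cos_many = cosine(es_gradient(10_000), exact_grad)
print(f"10 perturbations:     cosine with exact gradient = {cos_few:.2f}")
print(f"10,000 perturbations: cosine with exact gradient = {cos_many:.2f}")
```

With only a handful of perturbations the estimate is only loosely aligned with the exact gradient, and it takes thousands of loss evaluations to approach what the analytic gradient gives in one step. That is the intuition behind ES losing badly on problems with dense, informative feedback.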
Nature's solution
If we try to learn from nature's example when thinking about ways to develop AI, then in some cases AI can be thought of as
Looking at intelligent behavior in mammals, we see that it is shaped by the complex interplay of two closely related processes: learning from the experience of others and learning by doing. The former is often equated with evolution driven by natural selection, but here I use a broader term to account for epigenetics, microbiomes, and other mechanisms that allow experience to be shared between organisms that are not genetically related. The second process, learning from one's own experience, covers all the information an animal manages to acquire over its lifetime, and this information is determined directly by the animal's interaction with the outside world. This category includes everything from learning to recognize objects to mastering the communication inherent in the learning process.
Roughly speaking, these two processes found in nature can be compared to two options for optimizing neural networks. Evolution strategies, where no gradient information is used to update the organism, come close to learning from the experience of others. Likewise, gradient methods, where a particular experience leads to a particular change in the agent's behavior, are comparable to learning from one's own experience. The comparison becomes even clearer if we consider the kinds of intelligent behavior or skills that each of these two approaches develops in animals. In both cases, "evolutionary methods" promote the learning of reactive behaviors that provide a certain level of fitness (enough to stay alive). Learning to walk or to escape from captivity is in many cases equivalent to more "instinctive" behaviors that are "hard-wired" into many animals at the genetic level. This example also confirms that evolutionary methods are applicable when the reward signal is extremely rare (for instance, the fact of successfully raising offspring). In such a case it is impossible to tie the reward to any specific set of actions, which may have been performed many years before that fact came about. On the other hand, if we consider the case where ES fails, namely image classification, the results are remarkably comparable to the animal learning results obtained in countless behavioral psychology experiments conducted over more than 100 years.
Learning in animals
The methods used in reinforcement learning are in many cases taken directly from the psychological literature
The central role of prediction in learning from experience changes the dynamics described above in important ways. A signal that was previously considered very sparse (the episodic reward) turns out to be very dense. In theory, the picture is as follows: at any given moment, the mammalian brain is computing expected outcomes from a complex stream of sensory stimuli and actions, while the animal is simply immersed in that stream. The animal's eventual behavior then provides a strong signal that should be used to guide the correction of predictions and the development of behavior. The brain uses all of these signals to improve its predictions (and, accordingly, the quality of the actions taken) in the future. An overview of this approach is given in the excellent book "
Rich learning in neural networks
Building on the principles of higher nervous activity found in the mammalian brain, which is constantly busy making predictions, recent advances in reinforcement learning now take the importance of such predictions into account. I can immediately recommend two similar papers:
In both papers, the authors supplement their networks' typical default policy with predictions about the future state of the environment. In the first paper, prediction is applied to a variety of measurement variables; in the second, it is applied to changes in the environment and to the agent's behavior as such. In both cases, the sparse signal associated with positive reinforcement becomes much richer and more informative, which enables both faster learning and the acquisition of more complex behaviors. Such improvements are available only to methods that use a gradient signal, not to methods that work on the "black box" principle, such as ES.
In addition, learning from experience and gradient methods are far more efficient. Even in cases where a task could be learned faster with ES than with reinforcement learning, the gain was achieved because the ES run consumed many times more data than RL did. If we consider the principles of animal learning here, we note that the results of learning from others' example show up only after many generations, whereas sometimes a single event experienced firsthand is enough for an animal to learn a lesson for life. While such
So why not combine them?
Much of this article may leave the impression that I am advocating RL methods. In fact, I think that in the long run the best solution is to combine both approaches, so that each is used where it fits best. Clearly, for many reactive policies or in situations with very few positive reinforcement signals, ES wins, especially if you have the computing power to run massively parallel training. On the other hand, gradient methods using reinforcement learning or supervised learning are useful when we have access to rich feedback and need to learn to solve a problem quickly and with less data.
Turning to nature, we find that the first approach in fact lays the foundation for the second. That is why, over the course of evolution, mammals developed brains that let them learn extremely efficiently from the complex signals coming from their environment. So the question remains open. Perhaps evolution strategies will help us invent efficient learning architectures that will also be useful for gradient-based learning methods. After all, the solution nature found is indeed a very successful one.
Source: www.habr.com