Hi, Habr!
We don't often decide to post translations here of texts that are two years old, contain no code, and are clearly academic in nature, but today we are making an exception. We hope that the dilemma posed in the article's title concerns many of our readers, and that you have either already read the foundational work on evolution strategies that this post argues with, or will read it now. Welcome under the cut!
In March 2017, OpenAI made waves in the deep learning community with the paper “
Evolution strategies
The central thesis of the OpenAI paper was that, instead of using reinforcement learning combined with traditional backpropagation, the authors successfully trained a neural network to solve complex problems using what they called an "evolution strategy" (ES). The ES approach maintains a network-wide distribution over weights, with many agents working in parallel and using parameters sampled from that distribution. Each agent acts in its own environment, and after a set number of episodes or episode steps the algorithm returns a cumulative reward, expressed as a fitness score. Given this value, the parameter distribution can be shifted toward the more successful agents and away from the less successful ones. By repeating this operation millions of times with hundreds of agents, it is possible to move the weight distribution into a region that lets the agents form a high-quality policy for the assigned task. Indeed, the results reported in the paper are impressive: it is shown that if you run a thousand agents in parallel, anthropomorphic bipedal locomotion can be learned in under half an hour (whereas even the most advanced RL methods need more than an hour for this). For more information, I recommend reading the excellent
Different strategies for anthropomorphic upright walking, learned with OpenAI's ES method.
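The loop described above (sample parameters around a shared mean, score each agent, shift the mean toward the better ones) can be sketched in a few lines of NumPy. This is a minimal illustration, not OpenAI's implementation: the function names, the hyperparameter values, the score standardization, and the toy quadratic `toy_fitness` standing in for an episode return are all assumptions made for the example.

```python
import numpy as np


def evolution_strategy(fitness, theta0, sigma=0.1, lr=0.01,
                       population=100, iterations=200, seed=0):
    """Basic ES loop: perturb, evaluate, move the mean toward better samples."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float).copy()
    for _ in range(iterations):
        # Each "agent" gets parameters sampled around the current mean.
        noise = rng.standard_normal((population, theta.size))
        scores = np.array([fitness(theta + sigma * eps) for eps in noise])
        # Standardize the scores so the step does not depend on reward scale.
        spread = scores.std()
        if spread == 0.0:
            continue  # every agent scored the same: no direction to move in
        advantages = (scores - scores.mean()) / spread
        # Shift the mean toward perturbations that scored above average
        # and away from those that scored below average.
        theta += lr / (population * sigma) * (noise.T @ advantages)
    return theta


# Hypothetical stand-in for an episode return: fitness peaks at TARGET.
TARGET = np.array([0.5, -0.3, 0.8])


def toy_fitness(params):
    return -np.sum((params - TARGET) ** 2)


solution = evolution_strategy(toy_fitness, theta0=np.zeros(3))
print(solution)
```

Note that the update only ever sees the scalar scores. Each call to `fitness` is independent of the others, which is why the evaluations could be handed to separate workers.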
Black box
The great benefit of this method is that it parallelizes easily. While RL methods such as A3C require information to be exchanged between worker threads and a parameter server, ES needs only fitness estimates and general information about the parameter distribution. It is thanks to this simplicity that the method far outpaces modern RL methods in scalability. However, none of this comes for free: you have to optimize the network as a black box. Here, "black box" means that during training the internal structure of the network is completely ignored and only the overall result (the episode reward) is used; it alone determines whether the weights of a particular network will be inherited by subsequent generations. In situations where we do not get much feedback from the environment (and in many traditional RL problems the flow of rewards is very sparse), the problem goes from being a "partially black box" to a "completely black box." In that case you can significantly increase throughput, so the compromise is certainly justified. "Who needs gradients if they are hopelessly noisy anyway?" is the general opinion.
However, in situations where the feedback is richer, things start to go wrong for ES. The OpenAI team describes how a simple MNIST classification network was trained using ES, and this time training was 1,000 times slower. The point is that the gradient signal in image classification is extremely informative about how to teach the network to classify better. So the problem lies less with the RL technique itself and more with sparse rewards in environments that produce noisy gradients.
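To make the point about informative gradients concrete, here is a small sketch comparing an ES-style gradient estimate with the exact gradient on a differentiable problem. It does not reproduce the MNIST experiment mentioned above. The synthetic least-squares task, the dimensions, the forward-difference estimator, and all names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 200
X = rng.standard_normal((500, dim))   # synthetic inputs (assumed toy data)
w_true = rng.standard_normal(dim)
y = X @ w_true
w = np.zeros(dim)                     # current parameters


def loss(weights):
    return np.mean((X @ weights - y) ** 2)


def exact_gradient(weights):
    # Analytic gradient of the mean squared error.
    return 2.0 * X.T @ (X @ weights - y) / len(y)


def es_gradient(weights, samples, sigma=0.01):
    # Black-box estimate: only loss values at perturbed points are used.
    noise = rng.standard_normal((samples, weights.size))
    base = loss(weights)
    deltas = np.array([loss(weights + sigma * eps) - base for eps in noise])
    return noise.T @ deltas / (samples * sigma)


def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


g = exact_gradient(w)
cos_few = cosine(es_gradient(w, samples=10), g)
cos_many = cosine(es_gradient(w, samples=1000), g)
print(f"10 samples: {cos_few:.2f}   1000 samples: {cos_many:.2f}")
```

In this toy setting the black-box estimate needs many loss evaluations to line up with a direction that backpropagation provides in a single pass, which is the trade-off the paragraph above describes.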
Nature's solution
If we try to learn from nature's example when thinking about ways to develop AI, then in some cases AI can be thought of as
Having examined intelligent behavior in mammals, we see that it is shaped by the intertwined influence of two closely related processes: learning from the experience of others and learning by doing. The former is often equated with evolution driven by natural selection, but here I use a broader term to take into account epigenetics, microbiomes, and other mechanisms that allow experience to be shared between organisms that are not genetically related. The second process, learning from one's own experience, covers all the information an animal manages to learn throughout its life, and this information is determined directly by the animal's interaction with the outside world. This category includes everything from learning to recognize objects to mastering the communication inherent in the learning process.
Roughly speaking, these two processes occurring in nature can be compared to two options for optimizing neural networks. Evolution strategies, where no gradient information is used to update the organism, come close to learning from the experience of others. Similarly, gradient methods, where a particular experience leads to a particular change in the agent's behavior, are comparable to learning from one's own experience. If we think about the kinds of intelligent behavior or abilities that each of these two approaches develops in animals, the comparison becomes even more pronounced. In both cases, "evolutionary methods" promote the learning of reactive behaviors that allow an organism to develop a certain fitness (enough to stay alive). Learning to walk or to escape from captivity is in many cases equivalent to more "instinctive" behaviors that are "hard-wired" in many animals at the genetic level. In addition, this example confirms that evolutionary methods are applicable in cases where the reward signal is extremely rare (for example, the fact of successfully raising offspring). In such a case it is impossible to correlate the reward with any specific set of actions that may have been performed many years before that fact occurred. On the other hand, if we consider a case where ES fails, namely image classification, the results are remarkably comparable to the results of animal learning obtained in countless behavioral psychology experiments conducted over more than 100 years.
Learning in animals
The methods used in reinforcement learning are in many cases taken directly from the psychological literature
The central role of prediction in learning from experience changes the dynamics described above in significant ways. The signal that was previously considered very sparse (the episodic reward) turns out to be very dense. In theory, the situation looks like this: at any given moment, the mammalian brain is computing outcomes based on a complex stream of sensory stimuli and actions, while the animal is simply immersed in that stream. In this case, the animal's final behavior provides a strong signal that must be used to guide the adjustment of predictions and the development of behavior. The brain uses all these signals to optimize its predictions (and, accordingly, the quality of the actions it takes) in the future. An overview of this approach is given in the excellent book "
Richer learning for neural networks
Building on the principles of higher neural activity inherent in the mammalian brain, which is constantly busy making predictions, reinforcement learning has recently made advances that take the importance of such predictions into account. I can immediately recommend two such works to you:
Learning to Act by Predicting the Future
Reinforcement Learning with Unsupervised Auxiliary Tasks
In both papers, the authors supplement their neural networks' usual default policy with predictions about the future state of the environment. In the first paper, prediction is applied to a set of measurement variables, and in the second, to changes in the environment and to the agent's behavior itself. In both cases, the sparse signal associated with positive reinforcement becomes much richer and more informative, allowing both faster learning and the acquisition of more complex behaviors. Such improvements are available only to methods that use a gradient signal, and not to methods that work on the "black box" principle, such as ES.
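The following sketch illustrates why an auxiliary prediction task makes a sparse signal denser. It is not the architecture of either paper: the tiny linear encoder `W`, the policy head `P`, the next-observation head `Q`, the REINFORCE-style loss, and the random data are all assumptions made for the example. With zero reward, the policy loss gives the shared encoder nothing to learn from, while the prediction loss still produces a gradient.

```python
import numpy as np

rng = np.random.default_rng(0)
obs_dim, hidden_dim, n_actions = 4, 8, 3

W = rng.standard_normal((hidden_dim, obs_dim)) * 0.1    # shared encoder
P = rng.standard_normal((n_actions, hidden_dim)) * 0.1  # policy head
Q = rng.standard_normal((obs_dim, hidden_dim)) * 0.1    # prediction head


def policy_gradient_wrt_encoder(obs, action, reward):
    """Gradient of the loss -reward * log pi(action | obs) with respect to W."""
    logits = P @ (W @ obs)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    # d log pi(a|s) / d logits = one_hot(a) - probs
    dlogits = np.eye(n_actions)[action] - probs
    return -reward * np.outer(P.T @ dlogits, obs)


def prediction_gradient_wrt_encoder(obs, next_obs):
    """Gradient of the squared next-observation error with respect to W."""
    error = Q @ (W @ obs) - next_obs
    return 2.0 * np.outer(Q.T @ error, obs)


obs = rng.standard_normal(obs_dim)
next_obs = rng.standard_normal(obs_dim)

# A step with no reward: the policy loss is silent, the auxiliary loss is not.
sparse = policy_gradient_wrt_encoder(obs, action=1, reward=0.0)
dense = prediction_gradient_wrt_encoder(obs, next_obs)
print(np.linalg.norm(sparse), np.linalg.norm(dense))
```

Both gradients are taken with respect to the same shared weights, so the prediction task shapes the representation the policy relies on. A black-box method that sees only the episode reward has no equivalent channel.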
In addition, learning from experience and gradient methods are much more efficient. Even in cases where a particular problem could be learned faster with ES than with reinforcement learning, the gain was achieved because the ES strategy used many times more data than RL. Reflecting on the principles of learning in animals, we note that the result of learning from someone else's example shows up only after many generations, while sometimes a single event experienced first-hand is enough for an animal to learn the lesson forever. While such
So why not combine them?
Most of this article may leave the impression that I am advocating RL methods. However, I actually think that in the long run the best solution is to combine both methods, so that each is used in the situations where it fits best. Obviously, in the case of many reactive policies, or in situations with very sparse positive reinforcement signals, ES wins, especially if you have the computing power at your disposal to run massively parallel training. On the other hand, gradient methods using reinforcement learning or supervised learning will be useful when we have access to extensive feedback and need to learn to solve a problem quickly and with less data.
Turning to nature, we find that the first method essentially lays the foundation for the second. This is why, over the course of evolution, mammals have developed brains that allow them to learn extremely effectively from the complex signals coming from the environment. So the question remains open. Perhaps evolution strategies will help us invent effective learning architectures that will also be useful for gradient learning methods. After all, the solution found by nature is indeed very successful.
Source: www.habr.com