OpenAI teaches AI teamwork in a game of hide-and-seek

A good old-fashioned game of hide-and-seek can be an excellent test for artificial intelligence (AI) bots, showing how they make decisions and how they interact with one another and with the objects around them.

In a new paper published by researchers from OpenAI, a nonprofit artificial intelligence research organization that became famous for defeating world champions at the computer game Dota 2, the scientists describe how agents controlled by artificial intelligence were trained to become more skilled at seeking and hiding from one another in a virtual environment. The results of the study showed that a team of two bots learns more effectively and faster than any single agent without allies.

The scientists used a long-established machine learning method, reinforcement learning, in which an artificial intelligence is placed in an environment unknown to it and given certain ways of interacting with that environment, along with a system of rewards and penalties for the outcomes of its actions. The method is quite effective because an AI can perform actions in a virtual environment at enormous speed, millions of times faster than a human could imagine. This makes it possible to find the most effective strategies for a given problem by trial and error. The approach also has limitations. Creating the environment and running many training cycles requires substantial computing resources, and the process needs a precise system for comparing the results of the AI's actions with its goal. In addition, the skills an agent acquires this way are limited to the task as specified, and once the AI has learned to cope with it, there is no further improvement.
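To make the reward-and-penalty loop concrete, here is a minimal sketch of trial-and-error learning on a toy problem. It is not the researchers' setup: the corridor environment, the tabular Q-learning update, and every name and parameter value below are assumptions chosen only for illustration.

```python
import random

def train_corridor_agent(n_states=6, episodes=2000, alpha=0.5, gamma=0.9,
                         epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    The agent starts at cell 0. Reaching the last cell pays +1 (reward);
    every other step costs 0.01 (penalty).
    """
    rng = random.Random(seed)
    # q[state][action]; action 0 = step left, action 1 = step right
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the length of one episode
            # Trial and error: explore at random sometimes, otherwise act greedily
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            if action == 0:
                next_state = max(0, state - 1)
            else:
                next_state = min(goal, state + 1)
            reward = 1.0 if next_state == goal else -0.01
            # The goal is terminal, so it contributes no future value
            future = 0.0 if next_state == goal else max(q[next_state])
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = next_state
            if state == goal:
                break
    return q

def greedy_policy(q):
    """Best action per non-terminal cell (0 = left, 1 = right)."""
    return [0 if left > right else 1 for left, right in q[:-1]]

if __name__ == "__main__":
    print(greedy_policy(train_corridor_agent()))
```

The sketch also shows the limitation mentioned above: once the agent has learned to walk right, the reward signal gives it nothing further to improve on.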

To train the AI to play hide-and-seek, the scientists used an approach called "undirected exploration," in which agents have complete freedom to develop their own understanding of the game world and to work out winning strategies. This is similar to the multi-agent learning approach used by researchers at DeepMind, when several artificial intelligence systems were trained to play capture the flag in Quake III Arena. As in that case, the AI agents were not taught the rules of the game beforehand, but over time they learned basic strategies and even managed to surprise the researchers with non-trivial solutions.

In the hide-and-seek game, several agents whose task was to hide had to stay out of their opponents' line of sight after a short head start, during which the team of seeking agents was immobilized. "Line of sight" in this context is a 135-degree cone in front of each bot. Agents could not stray too far outside the playing area and had to navigate randomly generated rooms, with the ability to use objects scattered around them (boxes, movable walls, special ramps) that could serve both to build shelters and to break into them.
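The 135-degree cone can be illustrated with a short geometric check. This is only a sketch under assumptions: the function name, the 2-D coordinates and the angle convention are hypothetical, and it tests the angle alone, ignoring walls or boxes that might block the view.

```python
import math

def in_vision_cone(observer_xy, facing_deg, target_xy, fov_deg=135.0):
    """Return True if the target lies inside the observer's vision cone.

    facing_deg is the direction the observer looks, measured
    counter-clockwise from the positive x-axis (an assumed convention).
    Occlusion by obstacles is deliberately not modelled here.
    """
    dx = target_xy[0] - observer_xy[0]
    dy = target_xy[1] - observer_xy[1]
    if dx == 0 and dy == 0:
        return True  # same position: treat as visible
    bearing = math.degrees(math.atan2(dy, dx))
    # Smallest signed angle between facing direction and bearing, in [-180, 180)
    offset = (bearing - facing_deg + 180.0) % 360.0 - 180.0
    # The cone extends half the field of view to each side
    return abs(offset) <= fov_deg / 2.0

if __name__ == "__main__":
    print(in_vision_cone((0, 0), 0.0, (1, 1)))   # 45 degrees off-axis
    print(in_vision_cone((0, 0), 0.0, (0, 1)))   # 90 degrees off-axis
```

With a 135-degree cone, anything more than 67.5 degrees to either side of the facing direction is out of view, so a hider does not need to be behind a wall to stay unseen.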

Over a long learning process, the AI agents learned six distinct strategies, each of which helped them advance to the next stage of the game. At first, the hiding and seeking teams simply ran away and chased one another, but after about 25 million matches the hiding team learned to block passages with boxes and build shelters out of them. After another 75 million matches, the seeking team discovered how to use ramps to get into these shelters. After another 10 million rounds, the hiders learned to drag the ramps to the edge of the playing area and lock them in place so that their opponents could not use them.

After 388 million matches, the seekers learned to use locked ramps to climb onto boxes they had brought over and then, riding directly on the boxes, to break into enemy hideouts built from movable walls. Finally, after 458 million matches, the hiding team concluded that it needed to lock all the objects first and only then build a shelter, which apparently led to its final victory.

What is especially striking is that after 22 million matches the agents learned to coordinate their actions, and the effectiveness of their cooperation only increased from then on. For example, each hider brought its own box or wall to build the shelter and chose its own share of objects to lock, making the game harder for the opponents.

The scientists also noted an important point about how the number of training items (the amount of data passed through the neural network, the "batch size") affects learning speed. The default model needed 132.3 million matches over 34 hours of training to reach the stage at which the hiding team learned to block the ramps, while more data led to a noticeable reduction in training time. For example, increasing the number of parameters (the part of the model learned over the course of training) from 0.5 million to 5.8 million increased sample efficiency 2.2-fold, and increasing the size of the input data from 64 KB to 128 KB cut training time by almost a factor of one and a half.
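The default model's figures imply a very high simulation rate. The short calculation below is a sketch that assumes the 34 hours are wall-clock time and that matches were played at a uniform rate, neither of which the article states; the variable names are illustrative.

```python
# Figures quoted in the article for the default model
MATCHES = 132.3e6      # matches needed to reach the ramp-blocking stage
TRAINING_HOURS = 34    # assumed here to be wall-clock hours

matches_per_hour = MATCHES / TRAINING_HOURS
matches_per_second = matches_per_hour / 3600

print(f"{matches_per_hour / 1e6:.2f} million matches per hour")   # 3.89
print(f"{matches_per_second:.0f} matches per second")             # 1081
```

Under those assumptions, the system would be getting through roughly a thousand matches every second, which helps explain why such training requires substantial computing resources.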

At the end of their work, the researchers decided to test how much the in-game training could help agents cope with similar tasks outside the game. There were five tests in total: object counting (understanding that an object continues to exist even when it is out of sight and not in use); "lock and return" - the ability to remember one's starting position and return to it after completing an additional task; "sequential lock" - 4 boxes were placed at random in three rooms that had no doors but did have ramps for getting inside, and the agents had to find and lock them all; placing boxes at predetermined sites; and building a shelter around a cylindrical object.

As a result, in three of the five tasks, the bots that had been pre-trained in the game learned faster and showed better results than AI trained to solve the problems from scratch. They did slightly better at completing a task and returning to the starting position, at locking boxes sequentially in closed rooms, and at placing boxes at the given sites, but slightly worse at counting objects and at building a shelter around another object.

The researchers attribute the mixed results to the way the AI learns and retains particular skills. "We think that the tasks where in-game pre-training performed best involve reusing learned skills in a familiar way, while performing better than AI trained from scratch on the remaining tasks would require using them in a different, much harder way," write the co-authors of the work. "This result highlights the need to develop methods for effectively reusing skills acquired through training when transferring them from one environment to another."

The work is truly impressive, since the prospects for this training method extend far beyond any game. The researchers say their work is a significant step toward creating AI with "physically grounded" and "human-like" behavior that can diagnose diseases, predict the structures of complex protein molecules and analyze CT scans.

In the video below you can clearly see how the whole learning process unfolded, how the AI learned teamwork, and how its strategies became increasingly cunning and sophisticated.



Source: 3dnews.ru
