GitHub opens up its work on applying machine learning to code search and analysis

GitHub has introduced the CodeSearchNet project, which provides machine learning models and the datasets needed for parsing, classifying, and analyzing code in a variety of programming languages. Like ImageNet, CodeSearchNet includes a large collection of code snippets with annotations that formalize what the code does. The model-training components and the CodeSearchNet usage examples are written in Python using the TensorFlow framework and are distributed under the MIT license.

CodeSearchNet was built using natural-language text analysis techniques, which let machine learning systems take into account not only syntactic features but also the meaning of the actions the code performs. GitHub uses the system in experiments with semantic code search driven by natural-language queries (for example, a query for "sorting a list of strings" returns code that implements the corresponding algorithms).
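The idea of ranking code against a natural-language query can be sketched in a few lines. This is a toy illustration only, not the CodeSearchNet models: it swaps the learned embeddings for plain word counts and compares them with cosine similarity. The names `tokenize`, `cosine`, `search`, and the `SNIPPETS` corpus are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words.

    Identifiers such as sort_strings are split into 'sort' and 'strings'.
    """
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, snippets):
    """Return the snippets ordered from most to least similar to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(s))), s) for s in snippets]
    return [s for _, s in sorted(scored, key=lambda p: p[0], reverse=True)]


# Hypothetical two-function corpus, made up for this example.
SNIPPETS = [
    'def sort_strings(items):\n    """Sort a list of strings."""\n    return sorted(items)',
    'def read_file(path):\n    """Read a file and return its text."""\n    return open(path).read()',
]

best = search("sorting a list of strings", SNIPPETS)[0]
print(best)
```

Word counts only match identical tokens, so "sorting" in the query does not match "sort" in the code. Closing that kind of gap is what a learned, meaning-aware representation is meant to do.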

The proposed dataset includes more than 2 million code-comment pairs, prepared from the source code of existing open libraries. The code is the full source text of an individual function or method, and the comment describes what the function does (detailed documentation is provided). Datasets are currently available for Python, JavaScript, Ruby, Go, Java, and PHP. Examples show how to use the proposed datasets to train several types of neural networks, including Neural-Bag-Of-Words, RNN, Self-Attention (BERT), and a 1D-CNN+Self-Attention Hybrid.
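To make the notion of a code-comment pair concrete, here is a minimal sketch of how such pairs could be pulled out of Python source with the standard `ast` module. It is an assumption about the general approach, not the project's actual extraction pipeline; `extract_pairs` and `EXAMPLE` are hypothetical names, and `ast.get_source_segment` requires Python 3.8 or later.

```python
import ast


def extract_pairs(source):
    """Yield (function_source, docstring) pairs for documented functions."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            doc = ast.get_docstring(node)
            if doc:  # functions without a docstring give no usable pair
                yield ast.get_source_segment(source, node), doc


# Made-up input: one documented function and one undocumented helper.
EXAMPLE = '''
def add(a, b):
    """Return the sum of a and b."""
    return a + b

def helper(x):
    return x * 2
'''

pairs = list(extract_pairs(EXAMPLE))
for code, comment in pairs:
    print(comment)
    print(code)
```

Only `add` produces a pair here, because `helper` has no docstring to serve as its comment.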

To advance natural-language search methods, a CodeSearchNet Challenge set has also been prepared. It includes 99 typical queries with about 4 thousand expert annotations that describe the most likely code matches in the CodeSearchNet Corpus dataset, which covers about 6 million methods and functions (the set is roughly 20 GB in size). The CodeSearchNet Challenge can serve as a benchmark for evaluating the effectiveness of particular natural-language code search methods. An example code search engine has been prepared using Kubeflow tools.
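As a rough illustration of how expert annotations can turn a search result list into a benchmark score, the sketch below computes NDCG (normalized discounted cumulative gain), a standard ranking metric for graded relevance. The text above does not say which metric the benchmark uses, so the choice of NDCG, the 0 to 3 relevance scale, and the names `dcg`, `ndcg`, and `annotations` are all assumptions made for this example.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevance scores given in ranked order."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances))


def ndcg(ranked_ids, annotations):
    """NDCG of a ranking, given a {result_id: relevance} annotation map.

    Results without an annotation are treated as relevance 0.
    """
    gains = [annotations.get(i, 0) for i in ranked_ids]
    ideal = sorted(annotations.values(), reverse=True)
    best = dcg(ideal)
    return dcg(gains) / best if best > 0 else 0.0


# Hypothetical annotations for one query: higher means more relevant.
annotations = {"f1": 3, "f2": 0, "f3": 1}

perfect = ndcg(["f1", "f3", "f2"], annotations)
worse = ndcg(["f2", "f3", "f1"], annotations)
print(perfect, worse)
```

A search method that places the highly rated function first scores higher than one that buries it, which is what lets different methods be compared on the same set of queries.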

Source: opennet.ru
