GitHub has opened its work on using machine learning for code search and analysis

GitHub has introduced the CodeSearchNet project, which provides machine learning models and the datasets needed to parse, classify, and analyze code in various programming languages. CodeSearchNet, similar to ImageNet, includes a large collection of code snippets with annotations that formalize what the code does. The components for training models and the examples of using CodeSearchNet are written in Python using the TensorFlow framework and are distributed under the MIT license.

CodeSearchNet was built using natural language text parsing techniques, which allow machine learning systems to take into account not only syntactic features but also the meaning of the actions the code performs. GitHub uses the system in experiments on semantic code search driven by natural language queries (for example, a query for "sorting a list of strings" returns code that implements the corresponding algorithms).
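To make the idea of querying code in natural language concrete, here is a minimal sketch. It is not GitHub's system: instead of a trained neural model it compares plain word counts with cosine similarity, so it matches words rather than meaning. The three-entry corpus and the names `CORPUS`, `tokenize`, `cosine`, and `search` are invented for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus of (code, description) pairs, invented for this sketch.
CORPUS = [
    ("def sort_strings(items):\n    return sorted(items)",
     "Sort a list of strings alphabetically."),
    ("def read_text(path):\n    with open(path) as f:\n        return f.read()",
     "Read a whole file into a string."),
    ("def total(numbers):\n    return sum(numbers)",
     "Add up a list of numbers."),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, corpus=CORPUS):
    """Rank code snippets by how closely their description matches the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(doc))), code) for code, doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


best_score, best_code = search("sorting a list of strings")[0]
print(best_code)
```

A learned model would replace the word-count vectors with embeddings of the query and the code, so that related wording could match even without shared words.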

The proposed dataset includes more than 2 million code-comment pairs, prepared from the source code of existing open libraries. The code covers the full source text of individual functions or methods, and the comment describes the actions the function performs (detailed documentation is provided). Datasets are currently available for Python, JavaScript, Ruby, Go, Java, and PHP. Examples are provided of using the proposed datasets to train various kinds of neural networks, including Neural-Bag-Of-Words, RNN, Self-Attention (BERT), and a 1D-CNN+Self-Attention Hybrid.
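A code-comment pair of the kind described above can be illustrated with Python's standard `ast` module. This is only a sketch of the general idea, not the project's actual extraction pipeline: the sample source, the `extract_pairs` helper, and the dictionary field names are assumptions made for the example. It needs Python 3.8 or later for `ast.get_source_segment`.

```python
import ast

# Sample source invented for this sketch; a real corpus would read library files.
SAMPLE = '''
def sort_strings(items):
    """Return the strings in alphabetical order."""
    return sorted(items)

def helper(x):
    return x + 1

class Greeter:
    def greet(self, name):
        """Build a greeting for the given name."""
        return "Hello, " + name
'''


def extract_pairs(source):
    """Collect documented functions and methods as code/docstring records."""
    tree = ast.parse(source)
    pairs = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            doc = ast.get_docstring(node)
            if doc:  # undocumented functions yield no pair
                pairs.append({
                    "name": node.name,
                    "code": ast.get_source_segment(source, node),
                    "docstring": doc,
                })
    return pairs


pairs = extract_pairs(SAMPLE)
for pair in pairs:
    print(pair["name"], "->", pair["docstring"])
```

Functions without a docstring, such as `helper` here, are skipped, since they have no description to pair with the code.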

To support the development of natural language search mechanisms, the CodeSearchNet Challenge set has also been prepared. It includes 99 typical queries with about 4 thousand expert annotations describing the most likely code matches in the CodeSearchNet Corpus dataset, which covers about 6 million methods and functions (the set is about 20 GB in size). The CodeSearchNet Challenge can serve as a benchmark for evaluating the effectiveness of particular natural language code search methods. An example code search engine has been prepared using the KubeFlow toolkit.
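As a rough illustration of how expert relevance annotations can score a search method, the sketch below computes NDCG (normalized discounted cumulative gain), a common metric for graded relevance. The article does not say which metric the challenge uses, so the choice of NDCG is an assumption, and the snippet ids and relevance grades are invented.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain: high grades count more near the top."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances))


def ndcg(ranked_ids, annotations):
    """Score a ranking against graded annotations, normalized to [0, 1]."""
    # Results with no annotation are treated as irrelevant (grade 0).
    gains = [annotations.get(item, 0) for item in ranked_ids]
    ideal = sorted(annotations.values(), reverse=True)[:len(ranked_ids)]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg else 0.0


# Hypothetical annotations for one query: snippet id -> relevance grade.
annotations = {"snippet_a": 3, "snippet_b": 1, "snippet_c": 0}

good = ndcg(["snippet_a", "snippet_b", "snippet_c"], annotations)
poor = ndcg(["snippet_c", "snippet_b", "snippet_a"], annotations)
print(good, poor)
```

A ranking that places the highest-graded snippets first scores 1.0, and any other ordering of the same snippets scores lower, which is what lets a fixed set of annotated queries compare different search methods.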

Source: opennet.ru
