GitHub has opened up its work on applying machine learning to code search and analysis

GitHub has introduced the CodeSearchNet project, which provides machine learning models and the datasets needed for parsing, classifying and analyzing code in various programming languages. CodeSearchNet, similar to ImageNet, includes a large collection of code snippets with annotations that describe what the code does. The components for training models and the examples of using CodeSearchNet are written in Python using the Tensorflow framework and are distributed under the MIT license.

CodeSearchNet was built using natural language text analysis techniques, which allow machine learning systems to take into account not only the syntactic features of code but also the meaning of the actions it performs. GitHub uses the system in experiments on code search driven by natural language queries (for example, a query such as "sort a list of strings" returns code that implements the corresponding algorithms).
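To make the idea of natural language code search concrete, here is a minimal sketch that ranks snippets by bag-of-words cosine similarity between a query and each snippet's name and description. It is not the CodeSearchNet model, which learns neural embeddings; the `CORPUS` entries and the function names are hypothetical and exist only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (this also splits snake_case)."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, corpus, top_k=3):
    """Return the names of the snippets most similar to the query, best first."""
    q = Counter(tokenize(query))
    scored = [
        (cosine(q, Counter(tokenize(name + " " + doc))), name)
        for name, doc in corpus
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop snippets that share no tokens with the query.
    return [name for score, name in scored[:top_k] if score > 0]


# Hypothetical toy corpus of (function name, description) pairs.
CORPUS = [
    ("sort_strings", "Sort a list of strings alphabetically."),
    ("read_json", "Load a JSON document from a file path."),
    ("reverse_list", "Return the list in reverse order."),
]

if __name__ == "__main__":
    print(search("sort a list of strings", CORPUS))
```

A purely lexical ranker like this only matches shared words, which is the limitation that learned models taking the meaning of the code into account are meant to address.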

The proposed dataset contains more than 2 million code-comment pairs, prepared from the source code of existing open source libraries. The code covers the complete source text of individual functions or methods, and the comment describes the actions the function performs (detailed documentation is provided). Datasets are currently available for Python, JavaScript, Ruby, Go, Java and PHP. Examples are provided of using the proposed datasets to train various types of neural networks, including Neural-Bag-Of-Words, RNN, Self-Attention (BERT) and a 1D-CNN+Self-Attention hybrid.
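As a rough illustration of what a code-comment pair looks like, the sketch below uses Python's standard `ast` module to pull each documented function and its docstring out of a source file. This is an assumption about how such pairs could be extracted, not GitHub's actual pipeline, and the dictionary keys (`name`, `docstring`, `code`) are hypothetical rather than the dataset's real schema.

```python
import ast


def extract_pairs(source):
    """Return one record per documented function found in the source text."""
    tree = ast.parse(source)
    pairs = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            doc = ast.get_docstring(node)
            if doc:  # functions without a docstring yield no pair
                code = ast.get_source_segment(source, node)
                pairs.append({"name": node.name, "docstring": doc, "code": code})
    return pairs


# Hypothetical input: one documented and one undocumented function.
SAMPLE = '''
def add(a, b):
    """Return the sum of a and b."""
    return a + b

def undocumented(x):
    return x
'''

PAIRS = extract_pairs(SAMPLE)

if __name__ == "__main__":
    for pair in PAIRS:
        print(pair["name"], "->", pair["docstring"])
```

Only `add` produces a pair here, since `undocumented` has no comment to align with its code.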

To advance natural language search methods, the CodeSearchNet Challenge set has also been prepared. It includes 99 typical queries with about 4 thousand expert annotations describing the most likely code matches in the CodeSearchNet Corpus dataset, which covers about 6 million methods and functions (the set is about 20 GB in size). The CodeSearchNet Challenge can serve as a benchmark for evaluating the effectiveness of different methods for natural language code search. An example code search engine has been prepared using the KubeFlow toolkit.
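The text above does not say which metric the benchmark uses. As one plausible way to score a search method against graded expert annotations, the sketch below computes normalized discounted cumulative gain (NDCG), a common ranking metric. The snippet identifiers and the relevance values in the usage example are assumptions made for illustration.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevance scores listed in ranked order."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(ranked_ids, annotations):
    """NDCG of a ranking, given a {snippet_id: relevance} mapping.

    Snippets without an annotation are treated as relevance 0.
    """
    gains = [annotations.get(snippet_id, 0) for snippet_id in ranked_ids]
    ideal = sorted(annotations.values(), reverse=True)[:len(ranked_ids)]
    best = dcg(ideal)
    return dcg(gains) / best if best else 0.0


if __name__ == "__main__":
    # Hypothetical expert annotations for a single query.
    annotations = {"snippet_a": 3, "snippet_b": 1, "snippet_c": 0}
    print(ndcg(["snippet_a", "snippet_b", "snippet_c"], annotations))
    print(ndcg(["snippet_c", "snippet_b", "snippet_a"], annotations))
```

A ranking that places the most relevant snippets first scores 1.0, and the score drops as relevant snippets are pushed further down the list.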

source: budenet.ru
