Microsoft open-sources the vector search library used in Bing

Microsoft has published the source code of the SPTAG (Space Partition Tree And Graph) machine learning library, which implements an approximate nearest neighbor search algorithm. The library was developed by the Microsoft Research division and the Microsoft Search Technology Center. In practice, SPTAG is used in the Bing search engine to determine the most relevant results, taking the context of search queries into account. The code is written in C++ and is distributed under the MIT license. Builds for Linux and Windows are supported. A binding for the Python language is available.

Although the idea of using vector storage in search engines has been around for a long time, practical implementations have been held back by the high resource cost of operations on vectors and by limits on scalability. Combining deep learning methods with approximate nearest neighbor search algorithms has made it possible to bring the performance and scalability of vector systems to a level acceptable for large search engines. For example, in Bing, for a vector index of more than 150 billion vectors, retrieving the most relevant results takes 8 ms.

The library includes tools for building an index and organizing vector search, along with a set of tools for maintaining a distributed online search system that covers very large collections of vectors. The following modules are provided: an index builder for indexing, a searcher for searching with an index distributed across a cluster of several nodes, a server for running handlers on the nodes, an Aggregator for combining multiple servers into a single whole, and a client for sending requests. Adding new vectors to the index and deleting vectors on the fly are supported.
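The role of an aggregator in a distributed search can be illustrated with a minimal Python sketch. This is not SPTAG's API or its algorithm: `search_shard` and `aggregate` are hypothetical names, each "node" is just an in-memory dictionary, and every shard is scanned exactly with squared Euclidean distance instead of going through an approximate index. The only point is that each node returns its own top-k hits and the aggregator merges them into a global top-k.

```python
import heapq


def search_shard(shard, query, k):
    """Return up to k (distance, vector_id) pairs from one shard, nearest first.

    Assumption: a shard is a dict mapping an id to a vector, and it is
    scanned exhaustively with squared L2 distance.
    """
    scored = (
        (sum((x - y) ** 2 for x, y in zip(vec, query)), vec_id)
        for vec_id, vec in shard.items()
    )
    return heapq.nsmallest(k, scored)


def aggregate(shards, query, k):
    """Merge the per-shard top-k lists into one global top-k list."""
    partial = [search_shard(shard, query, k) for shard in shards]
    return heapq.nsmallest(k, (hit for hits in partial for hit in hits))


# Two toy shards standing in for two search nodes.
shards = [
    {"a": [0.0, 0.0], "b": [5.0, 5.0]},
    {"c": [1.0, 1.0], "d": [0.1, 0.0]},
]

result = aggregate(shards, [0.0, 0.0], k=2)
print([vec_id for _, vec_id in result])
```

Asking every shard for k hits is enough to guarantee that the merged list contains the true global top-k for an exact per-shard search; with an approximate per-shard index the merged result is only as good as the individual shard answers.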

The library assumes that the data processed and stored in a collection is represented as vectors that can be compared using Euclidean (L2) or cosine distance. A search query returns the vectors with the smallest distance to the query vector. SPTAG provides two methods for organizing the vector space: SPTAG-KDT (a k-dimensional tree, or kd-tree, combined with a relative neighborhood graph) and SPTAG-BKT (a k-means tree combined with a relative neighborhood graph). The first method requires fewer resources when working with the index, while the second shows higher accuracy of search results on very large collections of vectors.
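To make the two distance measures concrete, here is a small Python sketch of an exact, brute-force nearest neighbor search. It is only a baseline for illustration: SPTAG itself answers such queries approximately through its tree and graph structures, and the function names below (`l2_distance`, `cosine_distance`, `nearest`) are hypothetical, not part of the library.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """Cosine distance: 1 minus the cosine of the angle between a and b.

    Assumes neither vector is all zeros.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query, vectors, k, distance):
    """Return the indices of the k vectors closest to query (exact scan)."""
    ranked = sorted(range(len(vectors)), key=lambda i: distance(query, vectors[i]))
    return ranked[:k]


vectors = [[1.0, 0.0], [0.0, 1.0], [3.0, 0.1], [0.9, 0.1]]
query = [1.0, 0.05]

print(nearest(query, vectors, 2, l2_distance))      # [0, 3]
print(nearest(query, vectors, 2, cosine_distance))  # [2, 0]
```

The two metrics rank the same data differently: L2 favors vectors that are close in position, while cosine distance ignores length and favors vectors that point in nearly the same direction, which is why the long vector at index 2 comes first under cosine distance.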

Vector search is not limited to text: it can also be applied to multimedia data and images, as well as to systems that generate automatic recommendations. For example, one prototype based on the PyTorch framework implemented a vector search system driven by the similarity of objects in images. It was built from several reference collections of images of animals, cats, and dogs, which were converted into sets of vectors. When an image arrives as a search query, a machine learning model converts it into a vector; the SPTAG algorithm then uses that vector to select the most similar vectors from the index, and the images associated with them are returned as the result.

Source: opennet.ru
