Hello everyone! We are opening a series of articles devoted to solving practical problems in natural language processing (Natural Language Processing, or simply NLP) and to building conversational agents (chatbots) with the open-source DeepPavlov library.
NLP tasks include determining the sentiment of a text, parsing named entities, working out what the other party wants from your bot (to order a pizza or to get background information), and much more. You can read more about NLP tasks and methods
In this article, we will show how to run a REST server with pre-trained NLP models that are ready to use without additional configuration or training.
Installing DeepPavlov
Instructions for Linux are given here and below. For Windows, see our
- Create and activate a virtual environment with the currently supported version of Python:
virtualenv env -p python3.7
source env/bin/activate
- Install DeepPavlov in the virtual environment:
pip install deeppavlov
Launching a REST server with a DeepPavlov model
Before we start a server with a DeepPavlov model for the first time, it is useful to go over some features of the library's architecture.
Any model in DP consists of:
- Python code;
- Downloadable components: the serialized results of training on specific data (embeddings, neural network weights, etc.);
- A configuration file (hereafter, the config), which contains information about the classes used by the model, the URLs of the downloadable components, the Python dependencies, and more.
We will say more about what is under the hood of DeepPavlov in later articles. For now it is enough to know that:
- Any model in DeepPavlov is identified by the name of its config;
- To run a model, you need to download its components from the DeepPavlov servers;
- To run a model, you also need to install the Python libraries it uses.
The first model we will run is multilingual Named Entity Recognition (NER). The model classifies the words of a text by the type of named entity they belong to (proper names, place names, currency names, and others). The config name for the most recent version of NER at the time of writing:
ner_ontonotes_bert_mult
We launch the REST server with the model:
- Install the model dependencies specified in its config into the active virtual environment:
python -m deeppavlov install ner_ontonotes_bert_mult
- Download the serialized model components from the DeepPavlov servers:
python -m deeppavlov download ner_ontonotes_bert_mult
The serialized components are downloaded to the DeepPavlov home directory, which by default is located at
~/.deeppavlov
During the download, the hash of any components already on disk is compared with the hashes of the components on the server. If they match, the download is skipped and the existing files are used. Downloaded components typically range from 0.5 to 8 GB, and in some cases reach 20 GB after unpacking.
- Launch the REST server with the model:
python -m deeppavlov riseapi ner_ontonotes_bert_mult -p 5005
As a result of running this command, a REST server with the model is launched on port 5005 of the host machine (the default port is 5000).
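The hash check described in the download step can be pictured with a short sketch. This is not DeepPavlov's own code: the function names are hypothetical, and both the use of MD5 and the one-hash-per-file layout are assumptions made purely for illustration.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Hash a file in chunks so large components need not fit in memory."""
    digest = hashlib.md5()  # assumption: the article does not name the hash algorithm
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_download(local_path, expected_hash):
    """Return True when the local copy is missing or its hash differs."""
    path = Path(local_path)
    if not path.is_file():
        return True
    return file_md5(path) != expected_hash
```

With a check like this, a repeated download command only transfers components that are absent or have changed on the server.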
Once the model has been initialized, Swagger with the API documentation and the ability to test it is available at http://127.0.0.1:5005. Let's test the model by sending a POST request with the following JSON content to the endpoint http://127.0.0.1:5005/model:
{
"x": [
"В МФТИ можно добраться на электричке с Савёловского Вокзала.",
"В юго-западной Руси стог жита оценен в 15 гривен"
]
}
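A request like this one could be sent from Python using only the standard library. The sketch below assumes the server from the previous steps is listening on 127.0.0.1:5005; the helper names build_request and query_model are hypothetical and are not part of DeepPavlov.

```python
import json
import urllib.request

# Assumption: the REST server started above is reachable at this address.
MODEL_URL = "http://127.0.0.1:5005/model"


def build_request(texts, url=MODEL_URL):
    """Build a POST request whose JSON body carries the batch under the key "x"."""
    body = json.dumps({"x": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query_model(texts, url=MODEL_URL, timeout=60):
    """Send the batch and decode the JSON response (requires a running server)."""
    request = build_request(texts, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling query_model with the two sentences from the request body returns the parsed response as nested Python lists.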
In response, we should receive the following JSON:
[
[
["В", "МФТИ", "можно", "добраться", "на", "электричке", "с", "Савёловского", "Вокзала", "."],
["O", "B-FAC", "O", "O", "O", "O", "O", "B-FAC", "I-FAC", "O"]
],
[
["В", "юго", "-", "западной", "Руси", "стог", "жита", "оценен", "в", "15", "гривен"],
["O", "B-LOC", "I-LOC", "I-LOC", "I-LOC", "O", "O", "O", "O", "B-MONEY", "I-MONEY"]
]
]
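The tags in this response follow the common B-/I-/O convention: B- opens an entity, I- continues it, and O marks a token outside any entity. As a minimal sketch (the function name is hypothetical), the two parallel lists can be folded into entity spans like this:

```python
def extract_entities(tokens, tags):
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs.

    Tokens are joined with single spaces, so the original spacing
    (for example around hyphens) is not reconstructed. An I- tag that
    does not continue an open entity of the same type is ignored.
    """
    entities = []
    current_tokens, current_type = [], None

    def flush():
        if current_tokens:
            entities.append((" ".join(current_tokens), current_type))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_type:
            current_tokens.append(token)
        else:
            flush()
            current_tokens, current_type = [], None
    flush()
    return entities
```

Applied to the first sentence of the response, this yields the two FAC entities "МФТИ" and "Савёловского Вокзала".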
Using these examples, we will examine the DeepPavlov REST API.
DeepPavlov API
Each DeepPavlov model has at least one input argument. In the REST API the arguments are named, and their names are the keys of the incoming dictionary. In most cases the argument is the text to be processed. More information about the arguments and return values of the models can be found in the MODELS section of the documentation.
In the example, a list of two strings was passed to the argument x, and each string received its own markup. In DeepPavlov, every model takes as input a list (batch) of values that are processed independently.
The term "batch" comes from machine learning and refers to a set of independent input values that an algorithm or neural network processes at the same time. This approach reduces (often significantly) the time the model spends on one element of the batch compared with passing the same value to the input on its own. However, the result is returned only after every element has been processed. So when forming an incoming batch, you need to take into account both the speed of the model and the processing time required for each individual element.
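One way to balance throughput against latency is to cap the batch size and send a long list of inputs in several requests. The helper below is a minimal sketch with a hypothetical name; a suitable batch size depends on the model and hardware and is not specified in this article.

```python
def chunked(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]
```

Each chunk can then be sent as its own batch, so the first results arrive without waiting for the whole list to be processed.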
If a DeepPavlov model has several arguments, each of them receives its own batch of values, and the model always produces a single batch of responses at the output. The elements of the output batch are the results of processing the elements of the input batches that share the same index.
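The index alignment between several input batches and the single output batch can be sketched as follows. Both helper names are hypothetical, and the equal-length check is an assumption about sensible client-side validation rather than documented server behavior.

```python
def build_payload(**batches):
    """Build a request body from named batches, requiring equal lengths."""
    lengths = {name: len(values) for name, values in batches.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError("all argument batches must have the same length: %r" % lengths)
    return {name: list(values) for name, values in batches.items()}


def align_outputs(payload, outputs):
    """Pair each output element with the input values at the same index."""
    names = list(payload)
    rows = zip(*(payload[name] for name in names))
    return [(dict(zip(names, row)), output) for row, output in zip(rows, outputs)]
```

For a two-argument model, build_payload(text_a=[...], text_b=[...]) produces the request body, and align_outputs matches the i-th response to the i-th pair of inputs.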
In the example above, the model split each string into tokens (words and punctuation marks) and classified each token according to the named entity (organization name, currency) it represents. At present the ner_ontonotes_bert_mult model can recognize 18 types of named entities; a detailed description can be found
Other out-of-the-box DeepPavlov models
In addition to NER, the following out-of-the-box models are available in DeepPavlov at the time of writing:
Text Question Answering
The answer to a question about a text is a fragment of that text. Model config: squad_en_bert_infer
Example request:
{
"context_raw": [
"DeepPavlov разрабатывается лабораторией МФТИ.",
"В юго-западной Руси стог жита оценен в 15 гривен."
],
"question_raw": [
"Кем разрабатывается DeepPavlov?",
"Сколько стоил стог жита на Руси?"
]
}
Result:
[
["лабораторией МФТИ", 27, 31042.484375],
["15 гривен", 39, 1049.598876953125]
]
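Judging by this example, each result is a triple: the answer text, the character offset at which it starts in the context, and a score. That reading is an assumption drawn from the values shown here, not a statement of the documented output format. Under it, a quick consistency check might look like this (the function name is hypothetical):

```python
def answer_matches_context(context, result):
    """Check that the answer text occurs in the context at the reported offset."""
    answer, start, _score = result  # assumption: (text, character offset, score)
    return context[start:start + len(answer)] == answer
```

For both items in the example the check holds: "лабораторией МФТИ" begins at offset 27 of the first context and "15 гривен" at offset 39 of the second.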
Insult Detection
Detecting whether a text contains an insult aimed at the person it is addressed to (at the time of writing, English only). Model config: insults_kaggle_conv_bert
Example request:
{
"x": [
"Money talks, bullshit walks.",
"You are not the brightest one."
]
}
Result:
[
["Not Insult"],
["Insult"]
]
Sentiment Analysis
Classifying the sentiment of a text (positive, neutral, negative). Model config: rusentiment_elmo_twitter_cnn
Example request:
{
"x": [
"Мне нравится библиотека DeepPavlov.",
"Я слышал о библиотеке DeepPavlov.",
"Меня бесят тролли и анонимусы."
]
}
Result:
[
["positive"],
["neutral"],
["negative"]
]
Paraphrase Detection
Determining whether two different texts have the same meaning. Model config: stand_paraphraser_en
Example request:
{
"text_a": [
"Город погружается в сон, просыпается Мафия.",
"Президент США пригрозил расторжением договора с Германией."
],
"text_b": [
"Наступает ночь, все жители города пошли спать, а преступники проснулись.",
"Германия не собирается поддаваться угрозам со стороны США."
]
}
Result:
[
[1],
[0]
]
An up-to-date list of all out-of-the-box DeepPavlov models is always available
Conclusion
In this article, we got acquainted with the DeepPavlov API and with some of the text-processing capabilities the library provides out of the box. Keep in mind, though, that for any NLP task the best result is achieved when the model is trained on a dataset that matches the subject area (domain) of the task. Moreover, many more models cannot, in principle, be trained in advance for every situation.
In the following articles, we will look at additional library settings, launch DeepPavlov from Docker, and then move on to training models. And don't forget that DeepPavlov has
Source: www.habr.com