Hello everyone! We are opening a series of articles devoted to solving practical problems in natural language processing (NLP for short) and to building conversational agents (chatbots) with the open-source DeepPavlov library.
NLP tasks include determining the sentiment of a text, parsing named entities, working out what the other party wants from your bot (to order a pizza or to get background information), and much more. You can read more about NLP tasks and methods.
In this article, we will show how to run a REST server with pre-trained NLP models that are ready to use without any additional configuration or training.
Installing DeepPavlov
Here and below, the instructions are given for Linux. For Windows, see our documentation.
- Create and activate a virtual environment with the currently supported version of Python:
virtualenv env -p python3.7
source env/bin/activate
- Install DeepPavlov in the virtual environment:
pip install deeppavlov
Launching a REST server with a DeepPavlov model
Before we launch a server with a DeepPavlov model for the first time, it is useful to cover a few features of the library's architecture.
Any model in DP consists of:
- Python code;
- Downloadable components: serialized results of training on specific data (embeddings, neural network weights, etc.);
- A configuration file (hereafter, the config), which holds information about the classes the model uses, the URLs of the downloadable components, Python dependencies, and so on.
We will say more about what is under the hood of DeepPavlov in later articles. For now it is enough to know that:
- Any model in DeepPavlov is identified by the name of its config;
- To run a model, you need to download its components from the DeepPavlov servers;
- Also, to run a model, you need to install the Python libraries it uses.
The first model we will launch is multilingual Named Entity Recognition (NER). The model classifies the words of a text by the type of named entity they belong to (proper names, place names, currency names, and others). The config name of the latest NER version:
ner_ontonotes_bert_mult
We launch the REST server with the model in three steps:
- Install the model dependencies listed in its config into the active virtual environment:
python -m deeppavlov install ner_ontonotes_bert_mult
- Download the serialized model components from the DeepPavlov servers:
python -m deeppavlov download ner_ontonotes_bert_mult
The serialized components are downloaded to the DeepPavlov home directory, which by default is located at
~/.deeppavlov
During the download, the hashes of components that are already present locally are compared with the hashes of the components on the server. If they match, the download is skipped and the existing files are used. The size of the downloaded components typically ranges from 0.5 to 8 GB, and in some cases reaches 20 GB after unzipping.
- Launch the REST server with the model:
python -m deeppavlov riseapi ner_ontonotes_bert_mult -p 5005
As a result of this command, a REST server with the model starts on port 5005 of the host machine (the default port is 5000).
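The three steps above can also be scripted. The sketch below only assembles the same command lines shown in this article and passes them to `subprocess`; the helper names `deeppavlov_commands` and `launch` are made up for illustration and are not part of DeepPavlov.

```python
import subprocess
import sys


def deeppavlov_commands(config_name, port=5000):
    """Return the install, download and riseapi command lines for one config."""
    base = [sys.executable, "-m", "deeppavlov"]
    return [
        base + ["install", config_name],
        base + ["download", config_name],
        base + ["riseapi", config_name, "-p", str(port)],
    ]


def launch(config_name, port=5000):
    """Run the steps in order; the last call blocks while the server is running."""
    for command in deeppavlov_commands(config_name, port):
        subprocess.run(command, check=True)


# Print the commands without executing them.
for command in deeppavlov_commands("ner_ontonotes_bert_mult", port=5005):
    print("python " + " ".join(command[1:]))
```

Calling `launch("ner_ontonotes_bert_mult", port=5005)` inside the activated virtual environment would run the steps one after another; `check=True` stops the sequence if any step fails.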
Once the model is initialized, Swagger with the API documentation and the ability to test requests is available at http://127.0.0.1:5005. Let us test the model by sending the endpoint http://127.0.0.1:5005/model a POST request with the following JSON content:
{
"x": [
"В МФТИ можно добраться на электричке с Савёловского Вокзала.",
"В юго-западной Руси стог жита оценен в 15 гривен"
]
}
In response we should receive the following JSON:
[
[
["В", "МФТИ", "можно", "добраться", "на", "электричке", "с", "Савёловского", "Вокзала", "."],
["O", "B-FAC", "O", "O", "O", "O", "O", "B-FAC", "I-FAC", "O"]
],
[
["В", "юго", "-", "западной", "Руси", "стог", "жита", "оценен", "в", "15", "гривен"],
["O", "B-LOC", "I-LOC", "I-LOC", "I-LOC", "O", "O", "O", "O", "B-MONEY", "I-MONEY"]
]
]
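The same request can be sent from Python. This is a minimal sketch that uses only the standard library and assumes the server from this article is listening on http://127.0.0.1:5005; `query_model` and `pair_tokens_with_tags` are illustrative helper names, not DeepPavlov functions. The parsing step runs on a reply copied from the example above, so it can be tried without a running server.

```python
import json
import urllib.request

MODEL_URL = "http://127.0.0.1:5005/model"  # endpoint used in this article


def query_model(payload, url=MODEL_URL):
    """POST a JSON payload to the model endpoint and return the decoded reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def pair_tokens_with_tags(ner_response):
    """Turn [[tokens, tags], ...] into one list of (token, tag) pairs per text."""
    return [list(zip(tokens, tags)) for tokens, tags in ner_response]


# First element of the reply shown above, used here for offline parsing.
sample_response = [
    [
        ["В", "МФТИ", "можно", "добраться", "на", "электричке", "с",
         "Савёловского", "Вокзала", "."],
        ["O", "B-FAC", "O", "O", "O", "O", "O", "B-FAC", "I-FAC", "O"],
    ],
]

for token, tag in pair_tokens_with_tags(sample_response)[0]:
    print(token, tag)
```

With the server running, `query_model({"x": ["..."]})` would return a structure of the same shape as `sample_response`.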
Using these examples, let us look at the DeepPavlov REST API.
DeepPavlov API
Every DeepPavlov model has at least one input argument. In the REST API the arguments are named, and their names are the keys of the incoming dictionary. In most cases the argument is the text to be processed. More information about model arguments and return values can be found in the MODELS section of the documentation.
In the example, a list of two strings was passed to the argument x, and each string received its own markup. In DeepPavlov, all models take as input a list (batch) of values that are processed independently.
The term “batch” comes from machine learning and refers to a set of independent input values that an algorithm or neural network processes at the same time. This approach reduces (often substantially) the time the model spends on one element of the batch compared with passing the same value to the input on its own. However, the result is returned only after all elements have been processed. So when forming an incoming batch, you need to take into account the speed of the model and the required processing time for each individual element.
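One way to balance throughput against latency is to cap the batch size on the client and send several smaller requests. The sketch below shows that idea; `split_into_batches` is a hypothetical helper, and the batch size of 2 is an arbitrary value chosen for the example, not a recommendation from the library.

```python
def split_into_batches(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


texts = ["text %d" % i for i in range(5)]

# One request body per batch, using the argument name "x" from the NER example.
payloads = [{"x": batch} for batch in split_into_batches(texts, batch_size=2)]

for payload in payloads:
    print(payload)
```

Smaller batches return their results sooner, while larger batches usually lower the average cost per element, so a suitable size depends on the model and on how long a caller can wait.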
If a DeepPavlov model has several arguments, each of them receives its own batch of values, and the model always returns a single batch of answers. The elements of the outgoing batch are the results of processing the elements of the incoming batches that share the same index.
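The index correspondence can be made explicit on the client side. The following sketch builds a request body from several named batches and then lines each result up with the inputs at the same position. Both helpers are hypothetical, and the equal-length check is a client-side precaution assumed here rather than documented server behaviour. The argument names `text_a` and `text_b` are the ones used by the paraphrase model later in this article.

```python
def build_request(**named_batches):
    """Build a request body from named batches, requiring equal batch lengths."""
    lengths = {name: len(batch) for name, batch in named_batches.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError("argument batches differ in length: %r" % lengths)
    return {name: list(batch) for name, batch in named_batches.items()}


def align_with_results(results, **named_batches):
    """Attach to each result the input values that share its index."""
    names = list(named_batches)
    rows = zip(*named_batches.values())
    return [dict(zip(names, row), result=result)
            for row, result in zip(rows, results)]


body = build_request(text_a=["a1", "a2"], text_b=["b1", "b2"])
print(body)

# A reply with one answer per index, shaped like the paraphrase output.
for row in align_with_results([[1], [0]], text_a=["a1", "a2"], text_b=["b1", "b2"]):
    print(row)
```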
In the example above, the model split each string into tokens (words and punctuation marks) and classified each token according to the named entity it belongs to (organization name, currency, and so on). The ner_ontonotes_bert_mult model can currently recognize 18 types of named entities; a detailed description is available.
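The tags in the reply follow the B-/I-/O pattern visible in the sample output: `B-` opens an entity, `I-` continues it, and `O` marks tokens outside any entity. A minimal sketch of collecting whole entities from such a reply is shown below. `extract_entities` is an illustrative helper, and joining tokens with a single space is a simplification that does not restore the original spacing (for example around hyphens).

```python
def extract_entities(tokens, tags):
    """Group tokens tagged B-X / I-X into (entity_text, entity_type) pairs."""
    entities = []
    current_tokens, current_type = [], None
    for token, tag in zip(tokens, tags):
        prefix, _, entity_type = tag.partition("-")
        continues = prefix == "I" and entity_type == current_type
        # Close the open entity when the current tag does not continue it.
        if current_tokens and not continues:
            entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None
        if prefix == "B" or (prefix == "I" and not current_tokens):
            # Start a new entity (a stray I- tag is treated like B-).
            current_tokens, current_type = [token], entity_type
        elif continues:
            current_tokens.append(token)
    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


tokens = ["В", "МФТИ", "можно", "добраться", "на", "электричке", "с",
          "Савёловского", "Вокзала", "."]
tags = ["O", "B-FAC", "O", "O", "O", "O", "O", "B-FAC", "I-FAC", "O"]
print(extract_entities(tokens, tags))
```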
Other out-of-the-box DeepPavlov models
In addition to NER, the following out-of-the-box models are available in DeepPavlov at the time of writing:
Text Question Answering
Answers a question about a text with a fragment of that text. Model config: squad_ru_bert_infer
Example request:
{
"context_raw": [
"DeepPavlov разрабатывается лабораторией МФТИ.",
"В юго-западной Руси стог жита оценен в 15 гривен."
],
"question_raw": [
"Кем разрабатывается DeepPavlov?",
"Сколько стоил стог жита на Руси?"
]
}
Result:
[
["лабораторией МФТИ", 27, 31042.484375],
["15 гривен", 39, 1049.598876953125]
]
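Each element of the result holds three values. In the sample above, the first is the answer text and the second coincides with the character position at which the answer starts in the context. The sketch below checks that reading against the example data. Treating the third value as a model score is an assumption made here, and `describe_answers` is a hypothetical helper.

```python
def describe_answers(contexts, results):
    """Pair each QA result with its context and check the reported start offset."""
    described = []
    for context, (answer, start, score) in zip(contexts, results):
        described.append({
            "answer": answer,
            "start": start,
            "score": score,  # assumed to be a model score; not explained in the article
            "offset_matches": context[start:start + len(answer)] == answer,
        })
    return described


contexts = [
    "DeepPavlov разрабатывается лабораторией МФТИ.",
    "В юго-западной Руси стог жита оценен в 15 гривен.",
]
results = [
    ["лабораторией МФТИ", 27, 31042.484375],
    ["15 гривен", 39, 1049.598876953125],
]

for item in describe_answers(contexts, results):
    print(item["answer"], item["start"], item["offset_matches"])
```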
Insult Detection
Detects whether a text contains an insult aimed at the person it addresses (at the time of writing, English only). Model config: insults_kaggle_conv_bert
Example request:
{
"x": [
"Money talks, bullshit walks.",
"You are not the brightest one."
]
}
Result:
[
["Not Insult"],
["Insult"]
]
Sentiment Analysis
Classifies the sentiment of a text (positive, neutral, negative). Model config: rusentiment_elmo_twitter_cnn
Example request:
{
"x": [
"Мне нравится библиотека DeepPavlov.",
"Я слышал о библиотеке DeepPavlov.",
"Меня бесят тролли и анонимусы."
]
}
Result:
[
["positive"],
["neutral"],
["negative"]
]
Paraphrase Detection
Determines whether two different texts have the same meaning. Model config: stand_paraphraser_ru
Example request:
{
"text_a": [
"Город погружается в сон, просыпается Мафия.",
"Президент США пригрозил расторжением договора с Германией."
],
"text_b": [
"Наступает ночь, все жители города пошли спать, а преступники проснулись.",
"Германия не собирается поддаваться угрозам со стороны США."
]
}
Result:
[
[1],
[0]
]
The current list of all out-of-the-box DeepPavlov models is always available.
Conclusion
In this article, we got acquainted with the DeepPavlov API and with some of the text processing capabilities the library offers out of the box. Keep in mind that for any NLP task the best result is achieved by training the model on a dataset that matches the subject area (domain) of the task. Moreover, many models cannot, in principle, be pre-trained for every possible situation.
In the following articles we will look at additional library settings, launch DeepPavlov from Docker, and then move on to training models. And do not forget that DeepPavlov has
Source: www.habr.com