RHVoice 1.6.0 speech synthesizer released

The open speech synthesis system RHVoice 1.6.0 has been released. It was originally developed to provide high-quality support for the Russian language and was later adapted to other languages, including English, Portuguese, Ukrainian, Kyrgyz, Tatar and Georgian. The code is written in C++ and distributed under the LGPL 2.1 license. It runs on GNU/Linux, Windows and Android. The program is compatible with the standard TTS (text-to-speech) interfaces: SAPI5 (Windows), Speech Dispatcher (GNU/Linux) and the Android Text-To-Speech API, and it can also be used with the NVDA screen reader. The creator and lead developer of RHVoice is Olga Yakovleva, who develops the project despite being completely blind.
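On GNU/Linux, one way to reach a synthesizer through Speech Dispatcher is the `spd-say` command-line client. The sketch below only builds the argument list and does not run anything. It assumes that the RHVoice output module is registered under the name `rhvoice` on the target system (this may differ per installation), and the helper names `clamp` and `build_spd_say_command` are hypothetical. The rate, pitch and volume values follow Speech Dispatcher's -100..100 range.

```python
def clamp(value, low=-100, high=100):
    """Keep a Speech Dispatcher parameter inside its -100..100 range."""
    return max(low, min(high, value))


def build_spd_say_command(text, language="ru", module="rhvoice",
                          rate=0, pitch=0, volume=0):
    """Build an argv list for spd-say.

    The module name "rhvoice" is an assumption about the local
    Speech Dispatcher configuration, not something the article states.
    """
    return [
        "spd-say",
        "-o", module,               # output module (synthesizer) to use
        "-l", language,             # language code of the text
        "-r", str(clamp(rate)),     # speech rate
        "-p", str(clamp(pitch)),    # pitch
        "-i", str(clamp(volume)),   # volume
        text,
    ]


if __name__ == "__main__":
    print(build_spd_say_command("Привет, мир", rate=20))
```

The resulting list can be passed to `subprocess.run` on a machine where Speech Dispatcher and the RHVoice module are installed.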

The new version adds 5 new voice options for Russian speech. Support for the Albanian language has been implemented. The dictionary for the Ukrainian language has been updated. Support for voicing emoji characters has been expanded. In the application for the Android platform, bugs have been fixed, the import of custom dictionaries has been simplified, and support for Android 11 has been added. New settings and features have been added to the engine core, including g2p.case, word_break and support for equalizer filters.

Recall that RHVoice builds on the work of the HTS project (HMM/DNN-based Speech Synthesis System) and uses parametric synthesis with statistical models (Statistical Parametric Synthesis based on HMM, the Hidden Markov Model). The advantage of the statistical model is its low overhead and modest CPU requirements. All operations are performed locally on the user's system. Three levels of speech quality are supported (the lower the quality, the higher the performance and the shorter the response time).

The downside of the statistical model is its relatively low pronunciation quality, which does not reach the level of synthesizers that generate speech by combining fragments of natural speech. Nevertheless, the result is quite intelligible and resembles a recording played through a loudspeaker. By comparison, the Silero project, which provides an open speech synthesis engine based on machine learning technologies and a set of models for the Russian language, surpasses RHVoice in quality.

13 voice options are available for Russian and 5 for English. The voices are built from recordings of natural speech. In the settings you can change the speed, pitch and volume. The Sonic library can be used to change the tempo. Languages can be detected and switched automatically based on analysis of the input text (for example, for words and quotations in another language, the synthesis model native to that language can be used). Voice profiles are supported, which define combinations of voices for different languages.
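The idea of switching languages inside one text according to a voice profile can be illustrated with a small sketch. It is not RHVoice's actual algorithm: it simply assumes that the language can be guessed from the Unicode script of each letter (Cyrillic or Latin), and the profile contents and voice names `russian_voice` and `english_voice` are hypothetical placeholders.

```python
import unicodedata

# Hypothetical voice profile: one voice per script. The names are
# placeholders for illustration, not real RHVoice voice identifiers.
PROFILE = {"CYRILLIC": "russian_voice", "LATIN": "english_voice"}


def script_of(ch, profile=PROFILE):
    """Return the profile script a letter belongs to, or None."""
    if not ch.isalpha():
        return None  # spaces, digits and punctuation are neutral
    name = unicodedata.name(ch, "")
    for script in profile:
        if name.startswith(script):
            return script
    return None


def split_by_voice(text, profile=PROFILE):
    """Split text into (voice, fragment) runs, one run per script change."""
    segments = []
    current = None   # script of the run being collected
    buffer = ""
    for ch in text:
        script = script_of(ch, profile)
        # A letter in a different script closes the current run.
        if script is not None and current is not None and script != current:
            segments.append((profile[current], buffer))
            buffer = ""
        if script is not None:
            current = script
        buffer += ch  # neutral characters stay with the current run
    if buffer:
        voice = profile[current] if current is not None else None
        segments.append((voice, buffer))
    return segments


if __name__ == "__main__":
    for voice, fragment in split_by_voice("Он сказал hello world вчера"):
        print(voice, repr(fragment))
```

Each run could then be passed to the voice named in the profile. A real engine would more likely work at the level of words or sentences and would need more than the script alone to tell apart languages that share an alphabet.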

Source: opennet.ru
