New release of the Silero speech synthesis system

A new public release of the Silero Text-to-Speech neural network speech synthesis system is available. The project aims primarily to build a modern, high-quality speech synthesis system that is on par with commercial solutions from large corporations and is available to everyone without expensive server hardware.

The models are distributed under the GNU AGPL license, but the company developing the project does not disclose how the models are trained. They can be run with PyTorch and with frameworks that support the ONNX format. Speech synthesis in Silero is based on heavily modified modern neural network algorithms and digital signal processing methods.

The developers note that the main problem with modern neural network speech synthesis solutions is that they are often available only as part of paid cloud services, while publicly available products have high hardware requirements, are of lower quality, or are not finished, ready-to-use products. For example, running VITS, one of the new popular end-to-end synthesis architectures, smoothly in synthesis mode (that is, not for model training) requires video cards with more than 16 gigabytes of VRAM.

Contrary to the current trend, Silero solutions run successfully even on a single x86 thread of an Intel processor with AVX2 instructions. On 4 processor threads, the system synthesizes 30 to 60 seconds of audio per second in 8 kHz mode, 15-20 seconds in 24 kHz mode, and about 10 seconds in 48 kHz mode.
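
Throughput quoted as "seconds of audio per second" is the reciprocal of the real-time factor (RTF) that speech papers usually report. The sketch below only illustrates that arithmetic with round throughput values of the kind quoted above; the function names are illustrative and are not part of Silero.

```python
def real_time_factor(audio_seconds_per_second: float) -> float:
    """Wall-clock seconds needed to synthesize one second of audio."""
    if audio_seconds_per_second <= 0:
        raise ValueError("throughput must be positive")
    return 1.0 / audio_seconds_per_second


def synthesis_time(audio_seconds: float, audio_seconds_per_second: float) -> float:
    """Estimated wall-clock time to synthesize a clip of the given length."""
    if audio_seconds_per_second <= 0:
        raise ValueError("throughput must be positive")
    return audio_seconds / audio_seconds_per_second


if __name__ == "__main__":
    # Illustrative throughputs in seconds of audio per second of compute.
    for throughput in (10, 30, 60):
        rtf = real_time_factor(throughput)
        minute = synthesis_time(60, throughput)
        print(f"{throughput:>2} s/s: RTF {rtf:.3f}, one minute of audio in {minute:.1f} s")
```

An RTF below 1 means synthesis runs faster than playback, which is the precondition for on-the-fly use.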

Key features of the new Silero release:

  • The model size has been halved, to 50 megabytes;
  • The models can make pauses;
  • Four high-quality Russian voices are available (plus an unlimited number of random ones), with pronunciation samples;
  • The models are now 10 times faster; in 24 kHz mode, for example, they can synthesize up to 20 seconds of audio per second on 4 processor threads;
  • All voice variants for a given language are packaged in a single model;
  • The models accept whole paragraphs of text as input, and SSML tags are supported;
  • Synthesis works at three sampling rates to choose from: 8, 24 and 48 kilohertz;
  • "Teething problems" have been resolved: instability and missing words;
  • Flags have been added to control automatic stress placement and the insertion of the letter "ё".
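
To make the SSML and sampling-rate items concrete, here is a minimal sketch that assembles an SSML string with a pause between sentences and validates a sampling rate against the three listed values. The tags used (`<speak>`, `<p>`, `<s>`, `<break>`) come from the W3C SSML standard; the article does not say which subset Silero accepts, so treat the tag choice as an assumption. The helper names `build_ssml` and `check_sample_rate` are hypothetical and not part of Silero.

```python
from xml.sax.saxutils import escape

# The three sampling rates named in the feature list, in hertz.
SUPPORTED_SAMPLE_RATES = (8000, 24000, 48000)


def build_ssml(sentences, pause_ms=500):
    """Wrap sentences in <speak><p>...</p></speak>, with a <break> between them."""
    if pause_ms < 0:
        raise ValueError("pause_ms must be non-negative")
    pause = f'<break time="{int(pause_ms)}ms"/>'
    # escape() protects &, < and > so that the result stays well-formed XML.
    body = pause.join(f"<s>{escape(sentence)}</s>" for sentence in sentences)
    return f"<speak><p>{body}</p></speak>"


def check_sample_rate(rate_hz):
    """Return rate_hz unchanged if it is one of the listed rates."""
    if rate_hz not in SUPPORTED_SAMPLE_RATES:
        raise ValueError(f"unsupported sampling rate: {rate_hz}")
    return rate_hz


if __name__ == "__main__":
    print(build_ssml(["First sentence.", "Second sentence."], pause_ms=300))
    print(check_sample_rate(24000))
```

Building the markup programmatically rather than by string pasting keeps user text with characters such as `&` from breaking the document.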

At present, 4 Russian voices are publicly available for the latest version of the synthesis system, but the next version is due shortly with the following changes:

  • Synthesis speed will increase by a further 2-4 times;
  • Synthesis models for the languages of the CIS will be updated: Kalmyk, Tatar, Uzbek and Ukrainian;
  • Models for European languages will be added;
  • Models for Indian languages will be added;
  • Models for English will be added.

Some of the systemic shortcomings of Silero synthesis:

  • Unlike more traditional synthesis solutions such as RHVoice, Silero has no SAPI integration, no easy-to-install clients, and no integrations for Windows and Android;
  • The speed, although unprecedentedly high for a solution of this kind, may be insufficient for high-quality on-the-fly synthesis on weak processors;
  • The automatic stress placement does not handle homographs (words such as за́мок, "castle", and замо́к, "lock") and still makes mistakes, but this is to be fixed in future releases;
  • The current version does not run on processors without AVX2 instructions (unless PyTorch settings are changed specifically) because one of the modules inside the model is quantized;
  • The current version has a single dependency, PyTorch; everything else is "hardwired" inside the model and the JIT packages. The source code of the models is not published, nor is code for running the models from PyTorch clients for other languages;
  • Libtorch, which is available for mobile platforms, is much bulkier than ONNX Runtime, but an ONNX version of the model is not yet provided.
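
Given the AVX2 requirement listed above, it can be useful to check a machine before attempting to run the models. The following is a minimal sketch that assumes a Linux x86 host, where `/proc/cpuinfo` exposes a `flags` line; it is not part of Silero, and other operating systems need a different mechanism. The function names are illustrative.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set:
    """Collect CPU feature flags from /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # x86 Linux kernels list features on lines whose key is "flags".
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def has_avx2(cpuinfo_text: str) -> bool:
    """True if the text advertises the avx2 feature flag."""
    return "avx2" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        print("AVX2 available:", has_avx2(path.read_text()))
    else:
        print("No /proc/cpuinfo here; this sketch only covers Linux.")
```

Parsing is kept separate from file access so the check can be exercised on captured text from another machine.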

Source: opennet.ru
