Google releases the Lyra V2 open source audio codec

Google has introduced the Lyra V2 audio codec, which uses machine learning techniques to achieve the best possible voice quality over very slow communication channels. The new version features a switch to a new neural network architecture, support for additional platforms, improved bitrate control, better performance and higher audio quality. The reference implementation is written in C++ and distributed under the Apache 2.0 license.

In terms of the quality of voice data transmitted at low bitrates, Lyra significantly outperforms traditional codecs that rely on digital signal processing methods. To achieve high-quality voice transmission when only a limited amount of information can be sent, Lyra complements conventional audio compression and signal conversion methods with a speech model based on a machine learning system, which makes it possible to recreate the missing information from typical speech characteristics.

The codec consists of an encoder and a decoder. The encoder extracts voice data parameters every 20 milliseconds, compresses them and sends them to the recipient over the network at a bitrate from 3.2 kbps to 9.2 kbps. On the receiving side, the decoder uses a generative model to reconstruct the original speech signal from the transmitted audio parameters. These parameters include log mel spectrograms, which describe the energy of speech in different frequency ranges and are prepared with a model of human auditory perception in mind.
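As a rough illustration of how tight that budget is, the sketch below computes how many bits fit into one 20 ms frame at each of the three bitrates cited in this article (3.2, 6 and 9.2 kbps). The function name `bits_per_frame` is hypothetical, and the calculation ignores any packet or transport headers, which the article does not describe.

```python
from fractions import Fraction

FRAME_MS = 20  # frame duration stated in the article


def bits_per_frame(bitrate_kbps: str, frame_ms: int = FRAME_MS) -> int:
    """Return the payload budget in bits for one frame at the given bitrate.

    Hypothetical helper for illustration; not part of the Lyra API.
    """
    # 1 kbps equals 1 bit per millisecond, so kbps * ms gives bits.
    # Fraction parses the decimal string exactly and avoids float rounding.
    bits = Fraction(bitrate_kbps) * frame_ms
    if bits.denominator != 1:
        raise ValueError("bitrate does not give a whole number of bits per frame")
    return int(bits)


for rate in ("3.2", "6", "9.2"):
    bits = bits_per_frame(rate)
    print(f"{rate} kbps -> {bits} bits ({bits // 8} bytes) per {FRAME_MS} ms frame")
```

At 3.2 kbps the whole description of a 20 ms slice of speech has to fit in 64 bits, which is why the decoder needs a generative model to fill in what is not transmitted.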

Lyra V2 uses a new generative model based on the SoundStream convolutional neural network, which has low computational requirements and therefore allows real-time decoding even on low-power systems. The model used to generate the sound was trained on several thousand hours of voice recordings in more than 90 languages. TensorFlow Lite is used to run the model. The implementation is fast enough to encode and decode speech on smartphones in the lowest price range.

Besides the different generative model, the new version is also notable for adding an RVQ (Residual Vector Quantizer) stage to the codec architecture. It runs on the sender side before the data is transmitted and on the recipient side after the data is received. The quantizer converts the parameters produced by the codec into sets of packets, encoding the information according to the selected bitrate. To provide different quality levels, quantizers are supplied for three bitrates (3.2 kbps, 6 kbps and 9.2 kbps): the higher the bitrate, the better the quality, but the higher the bandwidth requirements.
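The general idea of residual vector quantization can be sketched in a few lines. This is a toy illustration, not Lyra's implementation: the codebooks here are random rather than learned, and the sizes (3 stages, 16 entries, 4 dimensions) and all function names are assumptions chosen for readability. Each stage quantizes whatever error the previous stage left behind, so decoding with fewer stages costs fewer bits and gives a coarser approximation.

```python
import random


def nearest(codebook, vec):
    """Index of the codebook entry closest to vec (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)))


def rvq_encode(vec, codebooks):
    """Quantize stage by stage; each stage encodes the residual of the previous one."""
    residual = list(vec)
    indices = []
    for cb in codebooks:
        idx = nearest(cb, residual)
        indices.append(idx)
        residual = [r - c for r, c in zip(residual, cb[idx])]
    return indices


def rvq_decode(indices, codebooks):
    """Sum the selected entry of each stage; zip stops at the shorter input."""
    out = [0.0] * len(codebooks[0][0])
    for idx, cb in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, cb[idx])]
    return out


def make_codebooks(stages, size, dim, seed=0):
    """Random toy codebooks (assumption: real codecs learn these during training).

    Each codebook contains the zero vector, so adding a stage can never
    increase the reconstruction error.
    """
    rng = random.Random(seed)
    books = []
    for stage in range(stages):
        scale = 0.5 ** stage  # later stages refine with smaller corrections
        entries = [[rng.uniform(-scale, scale) for _ in range(dim)]
                   for _ in range(size - 1)]
        books.append([[0.0] * dim] + entries)
    return books


def sq_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


codebooks = make_codebooks(stages=3, size=16, dim=4)
features = [0.3, -0.7, 0.1, 0.5]
indices = rvq_encode(features, codebooks)
for used in range(1, len(codebooks) + 1):
    approx = rvq_decode(indices[:used], codebooks)
    print(f"{used} stage(s): squared error {sq_error(features, approx):.4f}")
```

With 16 entries per codebook each stage costs 4 bits, so the number of stages used trades bitrate against fidelity, which mirrors the quality-versus-bandwidth trade-off described above.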

The new architecture has reduced the signal transmission delay from 100 to 20 milliseconds. For comparison, the Opus codec for WebRTC showed delays of 26.5 ms, 46.5 ms and 66.5 ms at the tested bitrates. Encoder and decoder performance has also increased significantly: compared with the previous version, the speedup is up to 5 times. For example, on a Pixel 6 Pro smartphone the new codec encodes and decodes a 20 ms sample in 0.57 ms, which is about 35 times faster than real-time transmission requires.
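The real-time margin is simple arithmetic: divide the duration of audio in one frame by the time needed to process it. The sketch below assumes the figures Google published for the Pixel 6 Pro (a 20 ms frame encoded and decoded in 0.57 ms); the function name `real_time_factor` is hypothetical.

```python
def real_time_factor(frame_ms: float, processing_ms: float) -> float:
    """How many times faster than real time a frame is processed.

    A value above 1.0 means the codec keeps up with live audio.
    """
    if processing_ms <= 0:
        raise ValueError("processing time must be positive")
    return frame_ms / processing_ms


# Assumed figures: 20 ms of audio handled in 0.57 ms.
factor = real_time_factor(frame_ms=20.0, processing_ms=0.57)
print(f"{factor:.1f}x faster than real time")
```

A factor of roughly 35 leaves most of the CPU time free for other work on the device.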

Beyond performance, the quality of the reconstructed sound has also improved: on the MUSHRA scale, speech quality at bitrates of 3.2 kbps, 6 kbps and 9.2 kbps with the Lyra V2 codec corresponds to bitrates of 10 kbps, 13 kbps and 14 kbps with the Opus codec.

Source: opennet.ru