Google publishes the open Lyra V2 audio codec

Google has introduced the Lyra V2 audio codec, which uses machine learning techniques to achieve the best possible voice quality over very slow communication channels. The new version moves to a new neural network architecture and adds support for additional platforms, expanded bitrate control, better performance and higher audio quality. The reference implementation is written in C++ and distributed under the Apache 2.0 license.

In terms of the quality of voice data transmitted at low speeds, Lyra significantly outperforms traditional codecs that rely on digital signal processing methods. To achieve high-quality voice transmission when the amount of transmitted information is limited, Lyra supplements conventional audio compression and signal conversion techniques with a speech model based on a machine learning system, which makes it possible to recreate the missing information from typical speech characteristics.

The codec consists of an encoder and a decoder. The encoder extracts voice data parameters every 20 milliseconds, compresses them and transmits them to the recipient over the network at a bitrate of 3.2 kbps to 9.2 kbps. On the receiving side, the decoder uses a generative model to recreate the original speech signal from the transmitted audio parameters, which include logarithmic mel spectrograms that capture the energy characteristics of speech in different frequency ranges and are prepared with a model of human auditory perception in mind.
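To give a sense of how tight these budgets are, the short Python sketch below works out how many payload bits a single 20 ms frame can carry at the three bitrates the article lists. It is plain arithmetic on the published figures, not code from the Lyra repository; the function names are hypothetical, and packet headers or other transport overhead are ignored.

```python
# Illustrative arithmetic only: payload bits available per 20 ms frame.
# Function names are hypothetical; transport overhead is not modelled.

FRAME_MS = 20                      # frame duration stated in the article
BITRATES_BPS = (3200, 6000, 9200)  # 3.2, 6 and 9.2 kbps


def bits_per_frame(bitrate_bps: int, frame_ms: int = FRAME_MS) -> int:
    """Return the number of payload bits one frame may use at this bitrate."""
    return bitrate_bps * frame_ms // 1000


def frame_budget() -> dict:
    """Map each bitrate (in bit/s) to its per-frame budget in bits."""
    return {bps: bits_per_frame(bps) for bps in BITRATES_BPS}


if __name__ == "__main__":
    for bps, bits in frame_budget().items():
        print(f"{bps / 1000:.1f} kbps -> {bits} bits ({bits // 8} bytes) per frame")
```

At 3.2 kbps this leaves only 64 bits (8 bytes) to describe each 20 ms of speech, which is why the decoder has to regenerate much of the signal from a model rather than receive it.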

Lyra V2 uses a new generative model based on the SoundStream convolutional neural network, which has low computing requirements and allows real-time decoding even on low-power systems. The model used to generate the sound was trained on several thousand hours of voice recordings in more than 90 languages. TensorFlow Lite is used to run the model. The performance of the proposed implementation is sufficient for encoding and decoding speech on smartphones in the lowest price range.

Besides using a different generative model, the new version is also notable for adding stages with an RVQ (Residual Vector Quantizer) quantizer to the codec architecture; the quantizer runs on the sender's side before data is transmitted and on the recipient's side after data is received. The quantizer converts the parameters produced by the codec into sets of packets, encoding the information according to the selected bitrate. To provide different quality levels, quantizers are supplied for three bitrates (3.2 kbps, 6 kbps and 9.2 kbps): the higher the bitrate, the better the quality, but the higher the bandwidth requirements.
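The general idea behind residual vector quantization can be shown with a toy Python sketch. Each stage picks the codeword closest to what the previous stages failed to capture, so sending more stage indices costs more bits but leaves a smaller error. Everything here is an assumption made for illustration: the two-dimensional vectors, the fixed four-entry codebooks and the function names are hypothetical and are not taken from Lyra, whose quantizer works on learned codebooks and real feature vectors.

```python
# Toy residual vector quantizer (RVQ). Codebooks, dimensions and names are
# hypothetical; this is not the Lyra V2 quantizer.
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def make_codebooks(num_stages: int) -> List[List[Vector]]:
    """Build toy codebooks: four corner points per stage, halving in scale."""
    corners = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
    return [[(x * 0.5 ** s, y * 0.5 ** s) for x, y in corners]
            for s in range(num_stages)]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((p - q) ** 2 for p, q in zip(a, b))


def rvq_encode(vector: Sequence[float],
               codebooks: List[List[Vector]]) -> List[int]:
    """Quantize stage by stage, passing the leftover residual to the next stage."""
    residual = tuple(vector)
    indices = []
    for codebook in codebooks:
        best = min(range(len(codebook)),
                   key=lambda i: squared_distance(residual, codebook[i]))
        indices.append(best)
        residual = tuple(r - c for r, c in zip(residual, codebook[best]))
    return indices


def rvq_decode(indices: Sequence[int],
               codebooks: List[List[Vector]]) -> Vector:
    """Sum the selected codewords; a shorter index list gives a coarser result."""
    total = [0.0] * len(codebooks[0][0])
    for index, codebook in zip(indices, codebooks):
        for d, component in enumerate(codebook[index]):
            total[d] += component
    return tuple(total)


if __name__ == "__main__":
    books = make_codebooks(3)
    x = (0.8, -0.3)
    idx = rvq_encode(x, books)
    for stages in (1, 2, 3):
        approx = rvq_decode(idx[:stages], books)
        print(stages, "stage(s):", approx, "error:", squared_distance(x, approx))
```

In this toy setup each index costs 2 bits, so decoding from one, two or three stages trades 2, 4 or 6 bits against a steadily smaller reconstruction error, which mirrors the quality-versus-bandwidth trade-off described above.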

The new architecture has reduced signal transmission delay from 100 to 20 milliseconds. For comparison, the Opus codec for WebRTC showed delays of 26.5 ms, 46.5 ms and 66.5 ms at the tested bitrates. Encoder and decoder performance has also improved considerably: up to 5 times faster than the previous version. For example, on a Pixel 6 Pro smartphone the new codec encodes and decodes a 20 ms sample in 0.57 ms, which is 35 times faster than real-time transmission requires.
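The "35 times faster" figure follows directly from the two numbers quoted for the Pixel 6 Pro, as the small Python check below shows. The function names are hypothetical and the calculation uses only the values given in the article.

```python
# Checking the real-time margin from the article's Pixel 6 Pro figures.
# Function names are hypothetical.

def real_time_factor(frame_ms: float, processing_ms: float) -> float:
    """How many times faster than real time a frame is processed."""
    return frame_ms / processing_ms


def cpu_share(frame_ms: float, processing_ms: float) -> float:
    """Fraction of each frame interval spent encoding and decoding."""
    return processing_ms / frame_ms


if __name__ == "__main__":
    factor = real_time_factor(20.0, 0.57)
    share = cpu_share(20.0, 0.57)
    print(f"real-time factor: about {factor:.1f}x")
    print(f"share of each 20 ms interval used: about {share:.2%}")
```

Put differently, encoding and decoding one frame occupies under 3% of the 20 ms it represents on that device.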

In addition to performance, the quality of sound reconstruction has also improved: according to the MUSHRA scale, speech quality at 3.2 kbps, 6 kbps and 9.2 kbps with the Lyra V2 codec corresponds to bitrates of 10 kbps, 13 kbps and 14 kbps with the Opus codec.

Source: opennet.ru
