Code for the Whisper speech recognition and translation system has been opened

The OpenAI project, which develops public projects in the field of artificial intelligence, has published its work on the Whisper speech recognition system. For English speech, the system is claimed to deliver automatic recognition with reliability and accuracy close to human recognition. The code of the reference implementation, based on the PyTorch framework, has been opened together with a set of pre-trained, ready-to-use models. The code is released under the MIT license.

The model was trained on 680 thousand hours of speech data, gathered from several collections covering different languages and subject areas. About 1/3 of the speech data used in training is in languages other than English. The system correctly handles situations such as accented pronunciation, background noise, and technical jargon. In addition to transcribing speech into text, it can translate speech from other languages into English and detect where speech occurs in an audio stream.
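As a rough illustration of the transcription and translation modes described above, the sketch below uses the `openai-whisper` Python package. The calls `whisper.load_model` and `model.transcribe` with the `task` and `language` options belong to that package. The helper names `build_options` and `run_whisper`, the default model size "base", and the file name in the usage comment are assumptions made only for this example.

```python
from typing import Optional


def build_options(translate: bool = False, language: Optional[str] = None) -> dict:
    """Build keyword options for Whisper's transcribe() call (hypothetical helper)."""
    # "transcribe" keeps the spoken language; "translate" outputs English text.
    options = {"task": "translate" if translate else "transcribe"}
    if language is not None:
        # Skip automatic language detection when the spoken language is known.
        options["language"] = language
    return options


def run_whisper(path: str, model_name: str = "base",
                translate: bool = False, language: Optional[str] = None) -> str:
    """Transcribe or translate an audio file and return the recognized text."""
    # Imported lazily: needs `pip install openai-whisper` and ffmpeg on the PATH.
    import whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(path, **build_options(translate, language))
    return result["text"]


# Usage (assumes a local file named speech.mp3 exists):
#   print(run_whisper("speech.mp3", translate=True))
```

Keeping the option building separate from the model call makes it possible to check the chosen mode without downloading a model.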

The models come in two variants: a model for English only and a multilingual model, which also supports Russian, Ukrainian and Belarusian. The models are offered in 5 sizes that differ in the number of parameters they contain; the largest size is provided only as a multilingual model. The larger the model, the higher the accuracy and quality of recognition, but also the higher the GPU video memory requirements and the lower the performance. For example, the smallest option has 39 million parameters and requires about 1 GB of video memory, while the largest has 1550 million parameters and requires about 10 GB of video memory. The smallest option is roughly 32 times faster than the largest.
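The trade-off between size, memory and speed can be sketched as a small lookup. The figures for the smallest and largest models come from the text above. The three intermediate rows and the size names are approximate values taken from the project's README at release time and should be treated as assumptions. The function `largest_model_that_fits` is a hypothetical helper, not part of Whisper.

```python
from typing import List, NamedTuple, Optional


class ModelSize(NamedTuple):
    name: str
    params_millions: int
    vram_gb: int          # approximate GPU memory required
    relative_speed: int   # approximate speed relative to the largest model


# Smallest and largest rows follow the article; the middle rows are assumptions.
MODEL_SIZES: List[ModelSize] = [
    ModelSize("tiny", 39, 1, 32),
    ModelSize("base", 74, 1, 16),
    ModelSize("small", 244, 2, 6),
    ModelSize("medium", 769, 5, 2),
    ModelSize("large", 1550, 10, 1),
]


def largest_model_that_fits(vram_gb: float) -> Optional[ModelSize]:
    """Return the most accurate (largest) model whose memory need fits the GPU."""
    fitting = [m for m in MODEL_SIZES if m.vram_gb <= vram_gb]
    if not fitting:
        return None
    return max(fitting, key=lambda m: m.params_millions)
```

With this table, a GPU with 4 GB of video memory would be matched to the "small" model, trading some accuracy for a model that fits.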


The system uses a Transformer neural network architecture consisting of an encoder and a decoder that interact with each other. Audio is split into 30-second chunks, which are converted into a log-Mel spectrogram and passed to the encoder. The encoder output is sent to the decoder, which predicts a text representation interleaved with special tokens. These tokens allow a single general model to handle tasks such as language identification, phrase-level timestamps, transcription of speech in different languages, and translation into English.
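The first step of that pipeline, cutting audio into 30-second chunks, can be sketched in plain Python. This is a simplified illustration and not the reference implementation: it assumes mono audio already resampled to 16 kHz (the rate the reference code works with), uses fixed non-overlapping windows, and zero-pads the final chunk. The name `split_into_chunks` is hypothetical; in the `openai-whisper` package the padding of a single window is handled by `whisper.pad_or_trim`.

```python
from typing import List, Sequence

SAMPLE_RATE = 16_000                         # samples per second (assumed input rate)
CHUNK_SECONDS = 30                           # window length described in the article
CHUNK_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS  # 480,000 samples per chunk


def split_into_chunks(samples: Sequence[float],
                      chunk_samples: int = CHUNK_SAMPLES) -> List[List[float]]:
    """Split audio samples into fixed-size chunks, zero-padding the last one."""
    chunks: List[List[float]] = []
    for start in range(0, len(samples), chunk_samples):
        chunk = list(samples[start:start + chunk_samples])
        # Pad with silence so every chunk has the same length for the encoder.
        chunk.extend([0.0] * (chunk_samples - len(chunk)))
        chunks.append(chunk)
    return chunks
```

Each resulting chunk would then be converted to a log-Mel spectrogram before being passed to the encoder; that step is omitted here because it needs a numerical library.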

Source: opennet.ru
