Stable Diffusion machine learning system adapted for music synthesis

The Riffusion project is developing a variant of the Stable Diffusion machine learning system adapted to generate music instead of images. Music can be synthesized from a natural-language text description or from a supplied sample. The music synthesis components are written in Python using the PyTorch framework and are available under the MIT license. The interface layer is implemented in TypeScript and is also distributed under the MIT license. The trained models are released under the permissive Creative ML OpenRAIL-M license, which allows commercial use.

The project is interesting because it keeps using the "text-to-image" and "image-to-image" models to generate music, but treats spectrograms as the images. In other words, the classic Stable Diffusion model is trained not on photographs and pictures but on images of spectrograms, which show how the frequency and amplitude of a sound wave change over time. Accordingly, the output is also a spectrogram, which is then converted into audio.

The method can also be used to modify existing compositions and to synthesize music from a sample, by analogy with image modification in Stable Diffusion. For example, generation can be guided by sample spectrograms with a reference style, can combine different styles, can make a smooth transition from one style to another, or can alter an existing sound for tasks such as raising the volume of individual instruments, changing the rhythm, and replacing instruments. Samples are also used to generate long-playing compositions built from a series of closely related passages that vary slightly over time. Separately generated passages are joined into a continuous stream by interpolating the internal parameters of the model.
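
The joining step can be illustrated with a minimal sketch. The article does not say which parameters are interpolated or how, so the following assumes, purely for illustration, that two passages are represented by noise tensors (`latent_a` and `latent_b`, hypothetical names and shapes) and that intermediate points are obtained by spherical linear interpolation, a technique commonly used with diffusion models. It is not the project's own code.

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-7):
    """Spherical linear interpolation between two arrays of equal shape."""
    a = v0.ravel().astype(np.float64)
    b = v1.ravel().astype(np.float64)
    # Angle between the two arrays, treated as flat vectors.
    cos_omega = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    if np.sin(omega) < eps:
        # Nearly parallel vectors: plain linear interpolation is enough.
        return (1.0 - t) * v0 + t * v1
    s0 = np.sin((1.0 - t) * omega) / np.sin(omega)
    s1 = np.sin(t * omega) / np.sin(omega)
    return s0 * v0 + s1 * v1

# Hypothetical latents for two neighbouring passages (shape is an assumption).
rng = np.random.default_rng(0)
latent_a = rng.standard_normal((4, 64, 64))
latent_b = rng.standard_normal((4, 64, 64))

# Five evenly spaced points on the path from passage A to passage B.
steps = [slerp(t, latent_a, latent_b) for t in np.linspace(0.0, 1.0, 5)]
```

Each element of `steps` would then be fed to the generator in turn, so that consecutive passages differ only slightly from one another.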

A windowed (short-time) Fourier transform is used to create a spectrogram from sound. Rebuilding sound from a spectrogram runs into the problem of determining the phase, since a spectrogram carries only frequency and amplitude; the Griffin-Lim approximation algorithm is used to reconstruct it.
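
Both steps can be sketched in plain NumPy. This is a simplified illustration rather than the project's implementation: the frame size, hop length, Hann window, iteration count and the 440 Hz test tone are all assumptions chosen for the example.

```python
import numpy as np

def stft(x, n_fft=512, hop=128):
    """Windowed Fourier transform: one column of complex bins per frame."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1).T  # shape: (n_fft // 2 + 1, n_frames)

def istft(spec, n_fft=512, hop=128):
    """Inverse transform by windowed overlap-add with normalization."""
    window = np.hanning(n_fft)
    frames = np.fft.irfft(spec.T, n=n_fft, axis=1)
    length = n_fft + hop * (frames.shape[0] - 1)
    out = np.zeros(length)
    norm = np.zeros(length)
    for i, frame in enumerate(frames):
        start = i * hop
        out[start : start + n_fft] += frame * window
        norm[start : start + n_fft] += window ** 2
    return out / np.maximum(norm, 1e-8)

def griffin_lim(magnitude, n_iter=50, n_fft=512, hop=128, seed=0):
    """Estimate a phase for a magnitude-only spectrogram and return audio."""
    rng = np.random.default_rng(seed)
    # Start from random phase, since the spectrogram carries none.
    phase = np.exp(2j * np.pi * rng.random(magnitude.shape))
    for _ in range(n_iter):
        audio = istft(magnitude * phase, n_fft, hop)
        # Keep the known magnitude, adopt the phase of the re-analysed audio.
        phase = np.exp(1j * np.angle(stft(audio, n_fft, hop)))
    return istft(magnitude * phase, n_fft, hop)

# Assumed test signal: one second of a 440 Hz tone sampled at 8 kHz.
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)

magnitude = np.abs(stft(tone))     # the "image": phase is discarded here
restored = griffin_lim(magnitude)  # audio rebuilt from magnitude alone
```

The restored signal is not sample-identical to the original, because the phase is only estimated; what the iteration improves is how closely the spectrogram of the result matches the target magnitudes.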



Source: opennet.ru
