Stability AI has published a machine learning model called Stable Video Diffusion that can generate short videos from images. The model extends the capabilities of the Stable Diffusion project, which was previously limited to synthesizing static images. The code for training the neural network and the tools for generating output are written in Python using the PyTorch framework and published under the MIT license. The pre-trained models are released under the permissive Creative ML OpenRAIL-M license, which allows commercial use.
Two model variants are available for download: SVD (Stable Video Diffusion), which generates 14 frames at a resolution of 576x1024 from a given still image, and SVD-XT, which generates 25 frames. The generated video may contain no motion or only very slow camera panning, and it usually lasts no more than 4 seconds. Direct control of the model through a natural-language text description is not yet supported, but you can first prepare the source image with the classic Stable Diffusion 2.1 model and then convert it to video with the SVD model.
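The relationship between frame count and clip length can be sketched with a little arithmetic. This is only an illustration: the frame counts and resolution come from the text above, while the playback rate `ASSUMED_FPS`, the `Variant` class, and the `clip_seconds` helper are hypothetical names and values chosen for the example, and applying the 576x1024 resolution to SVD-XT is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """One downloadable model variant, as described in the article."""
    name: str
    frames: int   # frames generated per clip
    height: int   # output height in pixels
    width: int    # output width in pixels


# Frame counts are taken from the article. The article states the resolution
# for SVD; reusing it for SVD-XT is an assumption made for this sketch.
VARIANTS = {
    "SVD": Variant("SVD", frames=14, height=576, width=1024),
    "SVD-XT": Variant("SVD-XT", frames=25, height=576, width=1024),
}

# Hypothetical playback rate; the article does not state one.
ASSUMED_FPS = 7.0


def clip_seconds(variant: Variant, fps: float) -> float:
    """Return the clip duration in seconds for a given playback rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return variant.frames / fps


if __name__ == "__main__":
    for v in VARIANTS.values():
        seconds = clip_seconds(v, ASSUMED_FPS)
        print(f"{v.name}: {v.frames} frames at {ASSUMED_FPS:g} fps is {seconds:.2f} s")
```

At the assumed rate both variants stay under the 4-second ceiling mentioned above; a lower playback rate would stretch the same frames over a longer, choppier clip.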
Video quality does not yet reach proper photorealism, and accurate rendering of faces and people is not guaranteed. In terms of performance, the open model is ahead of proprietary counterparts from Runway and Pika Labs. The model can be readily adapted to other tasks; for example, it can be used to generate three-dimensional figures.

In addition, we can note the publication of the Video-LLaVA machine learning system, which builds a unified visual representation of an object by training on both photos and video recordings. The system can be used, for example, to recognize the presence of the same objects in images and in videos. The code is written in Python and distributed under the Apache 2.0 license.
Source: opennet.ru
