Stability AI has published Stable Video Diffusion, a machine learning model that can generate short videos from images. The model extends the capabilities of the Stable Diffusion project, which was previously limited to synthesizing static images. The code for training the neural network and the image generation tools is written in Python using the PyTorch framework and is published under the MIT license. The pre-trained models are released under the permissive Creative ML OpenRAIL-M license, which allows commercial use.
Two model variants are available for download: SVD (Stable Video Diffusion), which generates 14 frames at 576x1024 resolution from a given static image, and SVD-XT, which generates 25 frames. The generated video may show no motion at all or only a very slow camera rotation, and it lasts no more than 4 seconds. Direct control of the model through a natural-language text description is not yet supported, but you can first generate the source image with the classic Stable Diffusion 2.1 model and then convert it to video with the SVD model.
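The two-step workflow described above (text to still image, then still image to video) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's own tooling: it uses the Hugging Face diffusers library instead of the published PyTorch code, the model identifiers are assumed Hub names, and the 7 frames-per-second playback rate, the helper names and the output path are hypothetical. The frame counts and the 576x1024 resolution are taken from the text.

```python
"""Sketch: text -> still image -> short video, via Hugging Face diffusers (an assumption)."""

# Frame counts and resolution come from the article; the Hub model IDs are assumptions.
VARIANTS = {
    "svd": {"model_id": "stabilityai/stable-video-diffusion-img2vid", "num_frames": 14},
    "svd-xt": {"model_id": "stabilityai/stable-video-diffusion-img2vid-xt", "num_frames": 25},
}
HEIGHT, WIDTH = 576, 1024
IMAGE_MODEL_ID = "stabilityai/stable-diffusion-2-1"  # assumed Hub name for Stable Diffusion 2.1


def clip_seconds(name: str, fps: float) -> float:
    """Length in seconds of the clip produced by a variant at a given playback rate."""
    return VARIANTS[name]["num_frames"] / fps


def text_to_video(prompt: str, name: str = "svd", out_path: str = "clip.mp4", fps: int = 7) -> str:
    """Hypothetical helper: generate a still image from text, then animate it."""
    # Heavy imports stay local so the helpers above work without a GPU stack installed.
    import torch
    from diffusers import StableDiffusionPipeline, StableVideoDiffusionPipeline
    from diffusers.utils import export_to_video

    cfg = VARIANTS[name]

    # Step 1: text -> still image at the resolution the video model expects.
    image_pipe = StableDiffusionPipeline.from_pretrained(
        IMAGE_MODEL_ID, torch_dtype=torch.float16
    ).to("cuda")
    image = image_pipe(prompt, height=HEIGHT, width=WIDTH).images[0]
    del image_pipe  # drop the reference so GPU memory can be reclaimed
    torch.cuda.empty_cache()

    # Step 2: still image -> list of video frames (PIL images).
    video_pipe = StableVideoDiffusionPipeline.from_pretrained(
        cfg["model_id"], torch_dtype=torch.float16, variant="fp16"
    ).to("cuda")
    frames = video_pipe(
        image,
        height=HEIGHT,
        width=WIDTH,
        num_frames=cfg["num_frames"],
        decode_chunk_size=8,  # decode a few frames at a time to limit memory use
    ).frames[0]

    # Step 3: write the frames to a video file at the assumed playback rate.
    export_to_video(frames, out_path, fps=fps)
    return out_path
```

At the assumed 7 frames per second, clip_seconds("svd", 7) gives 2.0 seconds and clip_seconds("svd-xt", 7) stays under 4 seconds, which is consistent with the duration limit mentioned above. Calling text_to_video requires a CUDA-capable GPU and downloads the model weights on first use.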
Video quality does not yet reach perfect photorealism, and faces and people are not guaranteed to be rendered correctly. In terms of performance, the proposed open model is ahead of proprietary counterparts from Runway and Pika Labs. The model can be easily adapted to various tasks; for example, it can be used to build three-dimensional representations of objects.

In addition, we can note the publication of the Video-LLaVA machine learning model, which builds a unified visual representation of an object by using photographs and video recordings together during training. The system can be used, for example, to detect the presence of the same objects in images and videos. The code is written in Python and distributed under the Apache 2.0 license.
Source: opennet.ru
