Disney's AI creates cartoons based on text descriptions

Neural networks that generate original video from a text description already exist. They cannot yet fully replace filmmakers or animators, but work in that direction is advancing. Disney Research and Rutgers University have developed a neural network that can create a rough storyboard and video from a text script.


According to the developers, the system works with natural language, so it could be used in a number of areas, such as creating training videos. Such systems could also help screenwriters visualize their ideas. The stated goal is not to replace writers and artists but to make their work more efficient and less tedious.

The developers say that translating text into animation is difficult because neither the input nor the output has a fixed structure. For this reason, most such systems cannot process complex sentences. To get around the limitations of earlier programs, the developers built a modular neural network with several components: a natural language processing module, a script parsing module, and an animation generation module.


First, the system analyzes the text and breaks complex sentences into simple ones. It then creates a 3D animation. The system draws on a library of 52 animation blocks, which was expanded to 92 by adding similar elements. The animation is produced in the Unreal Engine game engine, which relies on preloaded objects and models. The system selects the appropriate elements from them and generates a video.
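The steps described above (simplify sentences, then map each action to a block from a fixed library) can be illustrated with a toy sketch. This is not the researchers' code: the block names, the synonym table, and the functions `simplify`, `parse_clause`, and `to_actions` are all hypothetical, and simple string rules stand in for the neural language-processing modules.

```python
import re

# Hypothetical stand-in for the animation library (the real one has 52 blocks).
ANIMATION_BLOCKS = {"walk": "Walk", "sit": "SitDown", "open": "OpenDoor"}

# Hypothetical synonym table standing in for lexical simplification.
SYNONYMS = {"stroll": "walk", "stride": "walk"}

# Clause boundaries: a comma, "and", "then", or a combination of them.
SPLIT = re.compile(r"\s*,\s*(?:and\s+|then\s+)?|\s+and\s+(?:then\s+)?|\s+then\s+")


def simplify(sentence):
    """Break a compound sentence into simple clauses."""
    parts = SPLIT.split(sentence.strip().rstrip("."))
    return [p.strip() for p in parts if p.strip()]


def parse_clause(clause, last_subject):
    """Return (subject, verb); reuse the previous subject if none is given."""
    words = clause.split()
    if len(words) > 1 and words[0][0].isupper():
        subject, verb = words[0], words[1].lower()
    else:
        subject, verb = last_subject, words[0].lower()
    if verb.endswith("s"):  # crude third-person "-s" removal
        verb = verb[:-1]
    return subject, verb


def to_actions(text):
    """Map each simple clause to (subject, animation block or None)."""
    actions, subject = [], None
    for clause in simplify(text):
        subject, verb = parse_clause(clause, subject)
        verb = SYNONYMS.get(verb, verb)
        # None marks a verb that has no matching block in the library.
        actions.append((subject, ANIMATION_BLOCKS.get(verb)))
    return actions


print(to_actions("Anna opens the door and strolls to the table, then sits."))
```

A verb missing from both tables yields `None` for its block, which loosely mirrors the limitation the team itself acknowledges: the list of actions is not exhaustive.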


To train the system, the researchers compiled a set of descriptions of 996 elements taken from more than 1,000 scripts from IMSDb, SimplyScripts, and ScriptORama5. They then ran qualitative tests in which 22 participants evaluated 20 animations. In these tests, 68% said that the system produced reasonably good animation from the input texts.

However, the team acknowledged that the system is not perfect. Its list of actions and objects is not exhaustive, and lexical simplification sometimes fails to match a verb with a similar animation. The researchers intend to address these shortcomings in future work.



Source: 3dnews.ru
