Trained jointly on videos and images, Gen-3 Alpha will power Runway's text-to-video, image-to-video, and text-to-image tools, as well as existing control modes such as Motion Brush, Advanced Camera Controls, and Director Mode, along with upcoming tools for finer-grained control over structure, style, and motion.
The main features of the model include:
Fine-grained temporal control: Gen-3 Alpha has been trained with highly descriptive, temporally dense captions, enabling it to make imaginative transitions and precisely keyframe elements in the scene.
Realistic human portrayal: Gen-3 Alpha excels at generating vivid human characters with a wide range of actions, gestures, and emotions, opening up new storytelling possibilities.
Trained for artists: Gen-3 Alpha was trained collaboratively by a cross-disciplinary team of research scientists, engineers, and artists, and is designed to interpret a wide range of styles and cinematic terminology.
Gen-3 Alpha also supports fine-tuning for more stylistically unified, consistent characters and for features tailored to specific artistic and narrative needs. However, fine-tuning is offered as a B2B capability, aimed at film and television companies.