After the release, the community quickly got involved. Beyond animating AI-generated images, videos made from existing meme images are also lively and funny, such as the famous Oscar group selfie.
There are also many great short films, such as a compilation of beautiful girls and a cute monster animation.
I ran some tests myself and summarized the key findings:
Luma's text-to-video quality is not as good as Kexin's; it is basically unusable.
Image-to-video, however, is surprisingly good, with strong consistency and a wide range of motion.
It can extrapolate content that is not in the picture while keeping the existing style and content consistent.
As with Kexin, if the model does not understand a concept, even the image-to-video output is very poor.
Short prompts work well; it is best to describe only how the content in the picture should move (see the sketch after this list).
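To make the prompting advice concrete, here is a minimal sketch of what an image-to-video request might look like. Luma had no public API to cite at the time, so the endpoint URL, request fields, and polling states below are assumptions for illustration, not a documented interface; the takeaway is the short, motion-only prompt.

```python
import time
import requests

API_URL = "https://api.example.com/v1/generations"  # hypothetical endpoint
API_KEY = "YOUR_API_KEY"  # assumed bearer-token auth


def image_to_video(image_url: str, motion_prompt: str) -> str:
    """Submit an image-to-video job and poll until a video URL is ready.

    Per the findings above, keep the prompt short and describe only
    how the content in the image should move.
    """
    headers = {"Authorization": f"Bearer {API_KEY}"}
    resp = requests.post(
        API_URL,
        headers=headers,
        json={"prompt": motion_prompt, "image_url": image_url},  # assumed fields
        timeout=30,
    )
    resp.raise_for_status()
    job_id = resp.json()["id"]  # assumed response shape

    # Poll for completion; a real client would add backoff and error handling.
    while True:
        status = requests.get(
            f"{API_URL}/{job_id}", headers=headers, timeout=30
        ).json()
        if status["state"] == "completed":
            return status["video_url"]
        if status["state"] == "failed":
            raise RuntimeError("generation failed")
        time.sleep(5)


# A short, motion-only prompt tends to work best:
# video_url = image_to_video(
#     "https://example.com/oscar-selfie.jpg",
#     "the group laughs and waves at the camera",
# )
```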
Luma's official video also introduced some of the model's features and strengths:
Generated video quality is very high, with resolution up to 1024 pixels.
It understands prompts well and generates videos in a matching aesthetic style.
Inference is fast, reducing waiting time and enabling rapid creative iteration.
It generates coherent motion, unlike earlier models whose output was static or slow-motion.
It has a good understanding of physics and human movement.
People and objects remain consistent within the same video.
It can generate interesting camera movements rather than just static shots.
I did some research and found that the Luma AI team members' backgrounds are indeed impressive; this may be a startup dream team:
Alex Yu: Co-founder and CTO of Luma AI; formerly an AI researcher at UC Berkeley focused on real-time neural rendering and on generating 3D models from single images.
Amit Jain: Co-founder and CEO of Luma AI; formerly at Apple, where he worked on the multimedia experience for Vision Pro, focusing on computer vision and product design.
Jiaming Song: Chief Scientist at Luma AI; formerly in NVIDIA's generative AI group, he led research on diffusion models (such as DDIM) that significantly improved the performance of generative AI.
Matthew Tancik: Head of Applied Research at Luma AI; he helped create Neural Radiance Fields (NeRF), one of the most important methods in 3D neural rendering.
Angjoo Kanazawa: Chief Scientific Advisor at Luma AI and Assistant Professor of Electrical Engineering and Computer Science at UC Berkeley; her work spans computer vision, computer graphics, and machine learning, with a particular interest in visual perception of the dynamic 3D world.