Build your next story with Gemini Omni.

By Google DeepMind

Categories: AI, Product

Summary

Google DeepMind launched Gemini Omni Flash, a multimodal AI model that generates realistic videos, images, and simulations from text prompts, with significantly improved physics understanding. The model supports iterative video editing through conversational language, positioning AI-generated content creation as a collaborative, natural workflow rather than a single-shot generation process.

Key Takeaways

Omni demonstrates a step change in simulating physics such as kinetic energy and gravity, concepts that previous systems would have found difficult. This matters for creators who need realistic motion and interaction in generated content.
Conversational video editing is now viable: users can iteratively refine generated videos using natural language prompts (e.g., 'adjust details and style'), making the creative process a collaborative, multi-turn workflow rather than single-shot generation.
Complex ideas translate from simple prompts: a request like 'make a claymation explainer of protein folding' produces a scientifically accurate, multi-step visual explanation, which is valuable for education, marketing, and technical communication.
Omni integrates Gemini's reasoning with generative media models (Veo, Nano Banana, Genie), combining world knowledge with creative generation. This suggests that multimodal integration is becoming a competitive edge in AI tooling.
Available today as Gemini Omni Flash, with Omni Pro coming soon. The tiered release suggests positioning for both consumer adoption and professional creators, and may help establish distribution ahead of competitors.

Transcript Excerpt

I'm excited to announce Gemini Omni, our new model that can create anything from any input. It combines Gemini's intelligence with the best of our generative media models for a new level of world understanding, multimodality, and editing. Models like Veo, Nano Banana, and Genie are able to create extremely realistic videos, images, and interactive simulations. Although not perfect, they already demonstrate some impressive notions of intuitive physics. And with Omni, we've now made even more progress. It's a step change in simulating things like kinetic energy and gravity. Previous systems would have found these concepts difficult. Gemini's world knowledge and reasoning really shine in Omni. It can translate complex ideas into highly accurate videos. So for example, you can give it a simple…
