Gemini Omni | I/O 2026 Keynote

Categories: AI, Product

Summary

Google announces Gemini Omni, a multimodal AI model that generates any output from any input with markedly improved physics simulation. Google presents it as a key step toward AGI, combining world understanding with generative media capabilities such as video, image, and interactive simulation creation.

Key Takeaways

  1. Gemini Omni achieves step-change improvements in simulating physical concepts like kinetic energy and gravity that previous systems struggled with, enabling accurate translation of complex ideas into highly realistic video outputs.
  2. The model enables iterative creative workflows through conversational language editing rather than single-step generation, allowing users to adjust video details, style, and add elements through natural language commands.
  3. Omni combines three previously separate generative models (Veo for video, Nano Banana for images, Genie for interactive simulations) into one unified system, a consolidation Google frames as progress toward AGI.
  4. The model works on real-world inputs: users can supply personal videos (such as selfies) and transform them with text prompts, which puts accessibility and practical usability at the center of the product design.
  5. Gemini Omni Flash launched immediately across Google products as the first model in the Omni family, with Omni Pro coming soon. This points to a tiered release strategy targeting different use cases and performance requirements.

Transcript Excerpt

Hi everyone, it's really great to be here. Over the past year, AI capabilities have leaped forwards. We now have agents that can plan and act on our behalf, and Artificial General Intelligence is just a few years away. Today, I'm excited to share the progress we've made towards building AGI. Last year, I outlined our vision of extending Gemini's incredible multimodal capabilities to become a world model: AI that can understand and simulate the world. This is a crucial aspect of achieving AGI, and will be important for everything from building AI assistants to training robots. Now, we're taking the next big step. I'm excited to announce Gemini Omni, our new model that can create anything from any input. It combines Gemini's intelligence with the best of our generative media models for a new…
