Google's New Gemini Omni Is Too Much.

Categories: AI

Summary

Google's Gemini Omni is a world model that treats video like code—enabling precise frame-by-frame editing and physics-based reasoning rather than just generation. The real breakthrough isn't the videos themselves, but that Gemini 3.5 Flash matches Opus 4.7 performance while being significantly faster and cheaper to run.

Key Takeaways

  1. Gemini Omni enables video-to-video editing at scale, allowing real-time character and scene modifications mid-generation—transforming video from a final output into an editable medium like text documents.
  2. Gemini 3.5 Flash is positioning itself as the lightweight foundational model that matches Claude Opus 4.7 and GPT-4.5 performance while being much faster—indicating a capability/speed inflection point.
  3. Google's world model architecture extends beyond video to theoretically ingest and output audio, text, video, and 3D worlds—signaling a shift from single-modality to truly omnidirectional AI systems.
  4. Character consistency features (similar to Sora's cameo) show limitations in Omni—suggesting generative AI still struggles with maintaining identity continuity across extended sequences.
  5. The physics understanding in Omni enables complex reasoning about spatial relationships and object interactions—moving beyond pattern matching to actual environmental simulation capabilities.

Related topics

Transcript Excerpt

Google's Gemini Omni model has landed. It was unveiled at IO 2026 and it's got better editing, better generation, better everything? Sure, yeah. We've got hands-on experiences with Omni to share. Some astonishing hits and some surprising misses. Last year, I outlined our vision of extending Gemini's incredible multimodal capabilities to become a world model. AI that can understand and simulate the world. But the Google hits kept coming with all sorts of releases and well, teases of releases. Kind of forming a plan. >> Oh, a concept of a plan. Yes. An AI-powered Ask YouTube feature, which is going to streamline learning and discovering. That's on the way. Docs live will deliver a real-time writing assistant. And I guess [music] Well, I think they also announced something else that might hav…

More from AI For Humans