What’s new in Gemma 4?
By Google DeepMind
Categories: AI, Product
Summary
Google DeepMind released Gemma 4 under the open-source Apache 2.0 license, with models ranging from 2B to 31B parameters. The family features native tool use, 250K-token context windows, and support for 140+ languages, so developers can run frontier-grade AI directly on local hardware without data leaving their control.
Key Takeaways
- The 26B mixture-of-experts model activates only 3.8B parameters, delivering frontier-level intelligence at high speed. It is optimized for token efficiency in resource-constrained environments while maintaining reasoning quality.
- Gemma 4 supports a 250K-token context window, enabling analysis of entire codebases and multi-turn agentic workflows in a single prompt, a critical capability for enterprise code review and documentation use cases.
- Native tool-use support built into the model architecture allows developers to build autonomous agents that plan and execute actions without custom prompt engineering or external orchestration frameworks.
- The 2B and 4B effective models combine audio and vision capabilities, enabling real-time multimodal processing on mobile and IoT devices across 140+ languages and expanding edge AI deployments beyond text.
- Apache 2.0 open-source licensing, paired with enterprise-grade security protocols that match those of proprietary models, removes licensing friction for builders and enterprises while maintaining production-ready safety standards.
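To put two of the figures above in perspective, here is a minimal Python sketch of the arithmetic behind them. It uses only the numbers quoted in this summary (26B total parameters, 3.8B activated, 250K-token context window). The function names and the `reserved_output_tokens` parameter are hypothetical helpers for illustration and are not part of any Gemma API.

```python
# Figures quoted in the summary above; everything else here is illustrative.
TOTAL_PARAMS_B = 26.0            # mixture-of-experts model size, in billions
ACTIVE_PARAMS_B = 3.8            # activated parameters, in billions
CONTEXT_WINDOW_TOKENS = 250_000  # advertised context window


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of the model's parameters that are activated."""
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    return active_b / total_b


def fits_in_context(prompt_tokens: int,
                    reserved_output_tokens: int = 0,
                    window: int = CONTEXT_WINDOW_TOKENS) -> bool:
    """Check whether a prompt plus a reserved output budget fits in the window.

    Token counts must come from the model's real tokenizer; this helper
    only does the budget comparison.
    """
    return prompt_tokens + reserved_output_tokens <= window


if __name__ == "__main__":
    share = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    print(f"Activated share of parameters: {share:.1%}")            # 14.6%
    print(fits_in_context(200_000, reserved_output_tokens=8_000))   # True
    print(fits_in_context(248_000, reserved_output_tokens=8_000))   # False
```

Roughly one parameter in seven is activated, which is the basis for the speed claim in the first takeaway. The budget check is a reminder that a single-prompt codebase review still has to leave room in the window for the model's reply.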
Topics
- Open-Source LLM Licensing
- Local Model Inference
- Agentic AI Architecture
- Multimodal Edge Computing
- Mixture of Experts Optimization
Transcript Excerpt
We are thrilled to announce Gemma 4. Built from the same world-class research and technology behind Gemini 3, Gemma 4 is a family of open models designed to run directly on the hardware you own: phones, laptops, and desktops. For the first time ever, we are releasing Gemma under an open-source Apache 2.0 license. Gemma 4 is built for the agentic era. It can handle complex logic, multi-step planning, and agentic workflows, making optimal use of tokens for its intelligence. Th...