From 46% to 90%: Fine-Tuning Tiny LLMs for On-Device Agents — Cormac Brick, Google

Categories: AI, Tools

Summary

Fine-tuning tiny LLMs and pairing them with skill-based agents raised on-device accuracy from 46% to 90%. Developers can now embed Gemma 4 models (under 1B parameters) directly into their apps using the LiteRT runtime, which runs on 2.7B+ devices, so inference stays private and works offline.

Key Takeaways

  1. Tiny LLMs (under 1 billion parameters) can be shipped directly within apps for full customization and offline use, eliminating dependency on cloud APIs and reducing latency.
  2. The LiteRT runtime supports CPU, GPU, or NPU execution, runs on 2.7B devices daily, and is integrated into the Android OS, which makes it proven infrastructure for on-device inference at scale.
  3. System-level GenAI (Gemini Nano via AI Core) provides pre-installed, highly optimized models that add nothing to app size, so consider it first before building custom models.
  4. The agent skills framework enables task-specific optimization on Gemma 4, allowing developers to fine-tune tiny models for boutique use cases with measurable accuracy improvements.
  5. Choose between system GenAI (zero customization work, no app bloat) and App GenAI (full control, more development effort). The right choice depends on how specific the use case is and how much customization it needs.
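
The choice described in takeaways 3 and 5 can be sketched as a small decision rule. This is only an illustration under stated assumptions: the `UseCase` fields, the `Deployment` names, and the `choose_deployment` function are hypothetical and are not part of AI Core, LiteRT, or any Google API.

```python
from dataclasses import dataclass
from enum import Enum


class Deployment(Enum):
    # Hypothetical labels for the two options named in the takeaways.
    SYSTEM_GENAI = "system GenAI (Gemini Nano via AI Core)"
    APP_GENAI = "App GenAI (tiny LLM bundled in the app)"


@dataclass
class UseCase:
    # Hypothetical inputs chosen for illustration only.
    covered_by_system_model: bool   # the pre-installed model already handles the task
    needs_custom_fine_tuning: bool  # the task needs task-specific fine-tuning


def choose_deployment(use_case: UseCase) -> Deployment:
    """Prefer the system model; fall back to a bundled tiny LLM."""
    # Takeaway 3: the system model adds nothing to app size, so try it first.
    if use_case.covered_by_system_model and not use_case.needs_custom_fine_tuning:
        return Deployment.SYSTEM_GENAI
    # Takeaway 5: full control costs more development effort and app size.
    return Deployment.APP_GENAI


if __name__ == "__main__":
    generic = UseCase(covered_by_system_model=True, needs_custom_fine_tuning=False)
    boutique = UseCase(covered_by_system_model=False, needs_custom_fine_tuning=True)
    print(choose_deployment(generic).value)
    print(choose_deployment(boutique).value)
```

A real decision would likely weigh more factors (device coverage, latency, app size budget), which this sketch leaves out.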

Transcript Excerpt

Yeah, so while we wait for it to come up, cuz I know we're short of time, uh, I'm going to talk about, um, agents on device. So I know whoever asked the question about skills and AI Core, we have an answer to that. We've built a simple skill harness on top of AI Core that you can run skills on. We'll be able to show that. Also going to talk about tiny LLMs, uh, which is what we would call LLMs that are, like, smaller than a billion parameters. They're small enough to build into your app if you want to have more customization or you want to do something that isn't already available for you in AI Core. So that's the gist. So, uh, quick overview of AI Edge, well, how we think about, uh, small language models, uh, tiny LLMs, and system GenAI. Then we're going to take a quick look at agent skills, which is someth…