one year on
Google releases Gemma 3, claiming it's the most capable open model for a single GPU or TPU
The Gemma 3 family, ranging from 1B to 27B parameters, supports 140+ languages, 128k token context, and multimodal input, positioning it as a direct competitor in the open-weights race.
Google today announced Gemma 3, a collection of open-weight models ranging from 1B to 27B parameters, built on the same research and technology as Gemini 2.0. The company claims the 27B variant outperforms Llama-3-405B, DeepSeek-V3, and o3-mini in human preference evaluations on the LMArena leaderboard, all while fitting on a single GPU or TPU — a claim that, if true, reshapes expectations for what can be done with consumer hardware.
The 4B, 12B, and 27B variants handle both text and visual reasoning, while the 1B model is text-only. The family supports a 128k-token context window (32k for the 1B model), function calling, and structured output. The models are pretrained on more than 140 languages and ship with official quantized versions. Google also released ShieldGemma 2, a 4B image safety checker built on the Gemma 3 foundation.
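To put the single-GPU claim and the quantized releases in perspective, here is a rough back-of-the-envelope sketch of how much memory the weights alone would occupy. It assumes the nominal parameter counts from the announcement and uses 16-bit and 4-bit precision purely as illustrative choices; the announcement does not say which precisions the official quantized versions use. The estimate ignores the KV cache, activations, and runtime overhead, so real usage will be higher.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


# Nominal sizes from the announcement; actual parameter counts may differ.
SIZES = {"1B": 1e9, "4B": 4e9, "12B": 12e9, "27B": 27e9}

if __name__ == "__main__":
    for name, num_params in SIZES.items():
        at_16_bit = weight_memory_gb(num_params, 16)
        at_4_bit = weight_memory_gb(num_params, 4)  # assumed precision
        print(f"{name}: ~{at_16_bit:.1f} GB at 16-bit, ~{at_4_bit:.1f} GB at 4-bit")
```

By this estimate the 27B model needs roughly 54 GB for 16-bit weights, which fits only on a high-memory data-center accelerator, and roughly 13.5 GB at 4-bit, which is within reach of a high-end consumer card.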
The model weights are now available on Kaggle and Hugging Face. With over 100 million downloads of the previous Gemma models and more than 60,000 community variants, the Gemmaverse is growing fast — and the open-weights race just got a new frontrunner.
The record
A VP of Research at Google DeepMind co-authored the blog post, emphasizing Gemma 3's state-of-the-art performance and accessibility.
A Director at Google DeepMind co-authored the blog post, highlighting the model's multilingual support and safety features.
One year later — open only if you can handle spoilers
Gemma 3 quickly saw wide adoption in edge and mobile applications. A year later, its single-GPU claim remains a benchmark, though later open models like Llama 4 and the Qwen3 series matched or exceeded its performance on consumer hardware.