Google unveils open source Gemma 3 model with 128k context window

Mar 12, 2025 | Technology


Even as large language and reasoning models remain popular, organizations increasingly turn to smaller models to run AI processes with fewer energy and cost concerns. 

While some organizations distill larger models into smaller versions, model providers like Google continue to release small language models (SLMs) as an alternative to large language models (LLMs), offering lower running costs without sacrificing performance or accuracy. 

With that in mind, Google has released the latest version of its small model, Gemma, which features expanded context windows, larger parameters and more multimodal reasoning capabilities. 

Gemma 3, which Google says approaches the processing power of its larger Gemini 2.0 models, is designed to run on smaller devices like phones and laptops. The new model comes in four sizes: 1B, 4B, 12B and 27B parameters. 

With a larger context window of 128K tokens, up from Gemma 2's 8K, Gemma 3 can process longer documents and more complicated requests. Google also updated Gemma 3 to work in 140 languages, analyze images, text and short videos, and support function calling to automate tasks and agentic workflows. 
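Function calling in this style typically has the model emit a structured request, often JSON, naming a tool and its arguments, which the application then executes. The sketch below illustrates the dispatch side of that loop; the `get_weather` tool and the exact JSON shape are illustrative assumptions, not part of the Gemma 3 API.

```python
import json

# Hypothetical tool: a stand-in for any function the application exposes.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

# Registry mapping tool names the model may call to real functions.
TOOLS = {"get_weather": get_weather}

def dispatch(model_output: str) -> str:
    """Parse a JSON function call emitted by the model and run the named tool."""
    call = json.loads(model_output)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# A function-calling model might emit something like this:
reply = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
print(dispatch(reply))  # Sunny in Paris
```

In an agentic workflow, the tool's return value would be fed back to the model so it can decide the next step.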

Gemma gives a strong performance

To reduce computing costs even further, Google has introduced quantized versions of Gemma. Think of quantized models as compressed models: quantization works by “reducing the precision of the numerical values in a model’s weights” while preserving accuracy. 
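As a rough illustration of what reducing weight precision means, here is a minimal sketch of symmetric int8 quantization, one common scheme; it is a toy example, not Google's actual quantization recipe.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Map float32 weights onto the int8 range [-127, 127] with one scale factor."""
    scale = np.abs(weights).max() / 127.0
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the int8 codes."""
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
# Each int8 weight uses 1 byte instead of 4, at the cost of a small
# rounding error bounded by half the scale factor.
print(np.abs(w - w_hat).max() <= s / 2 + 1e-6)  # True
```

The storage saving is what lets a quantized model fit in far less accelerator memory.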

Google said Gemma 3 “delivers state-of-the-art performance for its size” and outperforms leading LLMs like Llama-405B, DeepSeek-V3 and o3-mini. Gemma 3 27B, specifically, came in second to DeepSeek-R1 in Chatbot Arena Elo score tests, topping DeepSeek’s smaller DeepSeek-V3, OpenAI’s o3-mini, Meta’s Llama-405B and Mistral Large. 

By quantizing Gemma 3, users can improve performance and build applications “that can fit on a single GPU and tensor processing unit (TPU) host.” 
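A back-of-envelope calculation shows why quantization matters for single-accelerator deployment. The figures below count weights only, ignoring activations and KV cache, so they are a lower bound on real memory use.

```python
def footprint_gb(params_billions: float, bits: int) -> float:
    """Approximate weight-only memory in GB for a model of the given size."""
    return params_billions * 1e9 * bits / 8 / 1e9

# The largest Gemma 3 variant has 27B parameters.
for bits in (16, 8, 4):
    print(f"27B at {bits}-bit: ~{footprint_gb(27, bits):.1f} GB")
# 16-bit: ~54.0 GB; 8-bit: ~27.0 GB; 4-bit: ~13.5 GB
```

At 4-bit precision the 27B model's weights drop to roughly a quarter of their 16-bit size, comfortably within the memory of a single modern GPU or TPU host.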

Gemma 3 integrates with developer tools like Hugging Face Transformers, Ollama, JAX, Keras, PyTorch and others. Users can also access Gemma 3 through Google AI Studio, Hugging Face or Kaggle. Companies and developers can request access to the Gemma 3 API through AI Studio. 

ShieldGemma for security

Google said it has built safety protocols into Gemma 3, including a safety checker for images called ShieldGemma 2. 

“Gemma 3’s development included extensive data governance, alignment with our safety policies via fine-tuning and robust benchmark evaluations,” Google writes i …
