Google Launches Gemma 4: Native Multimodality and "Thinking" for the Open Ecosystem
On April 2, 2026, Google DeepMind released Gemma 4, the newest version of its open-model family. Built on the same research as Gemini 3, the release adds native multimodality and advanced reasoning, and the models are released under the Apache 2.0 license.
The Lineup: Edge to Workstation
Gemma 4 is available in four sizes for different hardware, from phones to high-end workstations.
- Gemma 4 31B Dense: This model offers high performance, bridging the gap between server-grade capability and local execution. It is currently ranked 3rd on the LMSYS Arena text leaderboard.
- Gemma 4 26B A4B (MoE): This Mixture-of-Experts (MoE) variant routes each token through a pool of 128 experts but activates only 3.8 billion parameters per token, delivering 26B-class intelligence with the inference cost of a much smaller model.
- Gemma 4 E4B: This "Effective" 4B model is optimized for on-device reasoning and complex tasks.
- Gemma 4 E2B: Designed for speed and low latency on edge devices like smartphones.
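The efficiency claim for the MoE variant comes down to simple arithmetic: only a fraction of the total weights participate in each forward pass. A minimal sketch using the figures quoted above (26B total parameters, 3.8B active per token); the helper is generic, not a Gemma-specific API:

```python
def active_fraction(total_params: float, active_params: float) -> float:
    """Fraction of weights touched per token in an MoE forward pass."""
    return active_params / total_params

# Using the numbers stated for Gemma 4 26B A4B:
frac = active_fraction(26e9, 3.8e9)
print(f"Active per token: {frac:.1%}")  # roughly 14.6% of total weights
```

In other words, per-token compute scales with the 3.8B active parameters, not the full 26B, which is what makes local execution of a 26B-class model practical.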
Core Technical Advancements
Gemma 4 introduces changes designed for "agentic" workflows—AI that can plan and act autonomously.
- Native Thinking Mode: All models have a built-in reasoning mode, which allows them to process logic before giving a final answer.
- Expanded Multimodality: All models process text and images. The E2B and E4B variants also include audio processing, removing the need for separate speech-to-text models.
- Extended Context: The family supports large context windows—128K tokens for edge models and up to 256K tokens for the workstation variants.
- Dynamic Vision Resolution: Developers can configure the "visual token budget" (from 70 to 1,120 tokens) to balance detail against compute costs for tasks like OCR or video understanding.
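The visual token budget interacts directly with the context window: every image consumes part of the same token budget as text. A back-of-the-envelope sketch of that tradeoff; the function name, signature, and the 2K-prompt assumption are illustrative, not an official Gemma 4 API:

```python
# Per-image visual token budget range quoted in the article.
MIN_BUDGET, MAX_BUDGET = 70, 1120

def max_images(context_window: int, text_tokens: int, visual_budget: int) -> int:
    """Estimate how many images fit in the window alongside a text prompt."""
    if not MIN_BUDGET <= visual_budget <= MAX_BUDGET:
        raise ValueError("visual budget out of range")
    return (context_window - text_tokens) // visual_budget

# Full detail (1,120 tokens/image) in a 128K window with a 2K text prompt:
print(max_images(131072, 2048, 1120))  # → 115 images
# Dropping to the minimum budget (70 tokens/image) fits far more frames,
# which is the tradeoff the article describes for video understanding.
print(max_images(131072, 2048, 70))    # → 1843 images
```

Lower budgets trade per-image detail for capacity, which is why OCR-style tasks favor the high end and long-video tasks the low end.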
Performance Benchmarks
According to Google AI for Developers, the 31B Dense model achieves notable results on reasoning, coding, and multimodal benchmarks:
- AIME 2026: 89.2%
- LiveCodeBench v6: 80.0%
- MATH-Vision: 85.6%
Availability
Gemma 4 models are available for download on Hugging Face and Kaggle, and can be deployed via Google Cloud Vertex AI and AICore for Android developers.