Google has launched Gemma 4, its most capable open model family to date. The four new models are designed to run on a wide range of devices, from smartphones to workstations.
Built on the same underlying technology as Google's proprietary Gemini 3, the models are published for the first time under the commercially permissive Apache 2.0 license, giving developers full control over their data, infrastructure, and models. Previous Gemma versions shipped under a more restrictive Google proprietary license.
According to Google, all Gemma 4 models deliver significant improvements in multi-step reasoning and mathematical tasks. For agentic workflows, they natively support function calling, structured JSON output, and system instructions, enabling autonomous agents to integrate with various tools and APIs.
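The structured-output side of this can be illustrated with a small sketch. The tool schema below and the simulated model response are hypothetical (the exact format Gemma 4 expects may differ); the point is that a model emitting strict JSON lets an agent validate and dispatch tool calls mechanically.

```python
import json

# Hypothetical tool schema in the JSON-schema style commonly used for
# function calling; not Gemma 4's exact wire format.
GET_WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def parse_tool_call(model_output: str) -> dict:
    """Parse a structured JSON tool call emitted by the model and
    check that it names a known tool with the required arguments."""
    call = json.loads(model_output)
    if call["name"] != GET_WEATHER_TOOL["name"]:
        raise ValueError(f"unknown tool: {call['name']}")
    for arg in GET_WEATHER_TOOL["parameters"]["required"]:
        if arg not in call["arguments"]:
            raise ValueError(f"missing argument: {arg}")
    return call

# Simulated model response (what a structured-output model would emit).
response = '{"name": "get_weather", "arguments": {"city": "Berlin"}}'
call = parse_tool_call(response)
print(call["arguments"]["city"])  # Berlin
```

Because the output is guaranteed JSON rather than free-form text, this parsing step never needs fragile regex extraction.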
Gemma 4 comes in four sizes, covering everything from edge devices to high-end workstations: Effective 2B (E2B), Effective 4B (E4B), a 26B Mixture-of-Experts (MoE) model, and a 31B Dense model. All four move beyond simple chat and can handle complex logic and sophisticated agentic workflows.
| Model | Parameters | Architecture | Context window | Target hardware | Offline operation | Vision (images/video) | Audio input | Quantized on consumer GPU | Arena AI ranking (open) | Special feature |
|---|---|---|---|---|---|---|---|---|---|---|
| E2B | “effective” 2 billion | - | 128K tokens | Smartphones, Raspberry Pi, Jetson Orin Nano | ✅ | ✅ | ✅ | - | - | Compute and memory efficiency on edge devices |
| E4B | “effective” 4 billion | - | 128K tokens | Smartphones, Raspberry Pi, Jetson Orin Nano | ✅ | ✅ | ✅ | - | - | Compute and memory efficiency on edge devices |
| 26B MoE | 3.8 billion active (26 billion total) | MoE | up to 256K tokens | Personal computers, consumer GPUs (quantized), workstations, accelerators | ✅ | ✅ | - | ✅ | #6 | Optimized for latency, fast token generation |
| 31B Dense | 31 billion (all active) | Dense | up to 256K tokens | Personal computers, consumer GPUs (quantized), workstations, accelerators | ✅ | ✅ | - | ✅ | #3 | Maximum quality, base for fine-tuning |
The 31B model currently holds the 3rd position among all open models worldwide on the Arena AI Text Leaderboard, while the 26B MoE model ranks 6th. Google states that Gemma 4 outperforms models 20 times its size. For developers, this translates to high-performance results with significantly reduced hardware requirements.
| Benchmark | Gemma 4 31B IT Thinking | Gemma 4 26B A4B IT Thinking | Gemma 4 E4B IT Thinking | Gemma 4 E2B IT Thinking | Gemma 3 27B IT |
|---|---|---|---|---|---|
| Arena AI (text) (As of 2/6/24) | 1452 | 1441 | - | - | 1365 |
| MMLU (Multilingual Q&A) (No tools) | 85.2% | 82.6% | 69.4% | 60.0% | 67.6% |
| MMMU Pro (Multimodal reasoning) | 76.9% | 73.8% | 52.6% | 44.2% | 49.7% |
| AIME 2026 (Mathematics) (No tools) | 89.2% | 88.3% | 42.5% | 37.5% | 20.8% |
| LiveCodeBench v6 (Competitive coding problems) | 80.0% | 77.1% | 52.0% | 44.0% | 29.1% |
| GPQA Diamond (Scientific knowledge) (No tools) | 84.3% | 82.3% | 58.6% | 43.4% | 42.4% |
| τ2-bench (Agentic tool use) (Retail) | 86.4% | 85.5% | 57.5% | 29.4% | 6.6% |
The two larger models are designed for workstations and servers. The unquantized bfloat16 weights of the 31B model can fit on a single 80 GB NVIDIA H100 GPU, offering powerful local deployment capabilities.
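The hardware claims follow from back-of-the-envelope arithmetic: at 2 bytes per bfloat16 parameter, the 31B model's weights alone take about 62 GB, leaving headroom on an 80 GB H100 for the KV cache and activations (which this sketch deliberately ignores). The 4-bit figure below is an illustrative assumption, since the exact footprint depends on the quantization format.

```python
def weight_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    # billions of parameters * bytes each = gigabytes (1 GB = 1e9 bytes)
    return params_billion * bytes_per_param

# bfloat16 (2 bytes/param): the 31B dense model's weights alone
print(weight_memory_gb(31, 2))    # 62 GB -> fits in an 80 GB H100
# 4-bit quantization (0.5 bytes/param, format-dependent)
print(weight_memory_gb(31, 0.5))  # 15.5 GB -> consumer-GPU territory
```

Note that KV-cache memory grows with context length, so long-context serving needs additional headroom beyond the weights.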