
Google's Gemma 4 Model Family Now Available Under Apache 2.0 License, Boosting Agentic AI Capabilities

Google has launched Gemma 4, its most capable open model family to date. The four new models are designed to run on a wide range of devices, from smartphones to workstations, and are being released for the first time under a fully open Apache 2.0 license.

These models are built on the same underlying technology as Google's proprietary Gemini 3. The commercially permissive Apache 2.0 license grants developers complete control over their data, infrastructure, and models; previous Gemma versions were subject to a more restrictive Google proprietary license.

According to Google, all Gemma 4 models deliver significant improvements in multi-step reasoning and mathematical tasks. For agentic workflows, they natively support function calling, structured JSON output, and system instructions, enabling autonomous agents to integrate with various tools and APIs.
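In practice, open-model servers expose function calling and structured output through the OpenAI-compatible "tools" request schema. The sketch below shows what such a request could look like; the model identifier `gemma-4-27b` is a placeholder, not a confirmed name, and the article does not specify a particular API.

```python
import json

# Hypothetical function-calling request in the OpenAI-compatible "tools"
# format accepted by most open-model servers (vLLM, Ollama, llama.cpp).
# The model name below is a placeholder, not a confirmed identifier.
payload = {
    "model": "gemma-4-27b",
    "messages": [
        # System instructions and user turn, as supported natively.
        {"role": "system", "content": "You are a retail shopping assistant."},
        {"role": "user", "content": "What is the price of SKU 1234?"},
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "lookup_price",
                "description": "Return the current price for a product SKU.",
                "parameters": {
                    "type": "object",
                    "properties": {"sku": {"type": "string"}},
                    "required": ["sku"],
                },
            },
        }
    ],
    # Ask the server to constrain the reply to valid JSON.
    "response_format": {"type": "json_object"},
}

body = json.dumps(payload)  # ready to POST to a chat-completions endpoint
```

The model is expected to answer with a structured tool call (here, `lookup_price` with a `sku` argument) that the agent framework executes before feeding the result back.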

Gemma 4 comes in four sizes, catering to everything from edge devices to high-end workstations: Effective 2B (E2B), Effective 4B (E4B), a 26B Mixture-of-Experts (MoE) model, and a 31B Dense model. All four models move beyond simple chat, capable of handling complex logic and sophisticated agentic workflows.

| Model | Active parameters | Architecture | Context window | Target hardware | Arena AI ranking (open) | Special feature |
|---|---|---|---|---|---|---|
| E2B | "effective" 2 billion | — | 128K tokens | Smartphones, Raspberry Pi, Jetson Orin Nano | — | Compute and memory efficiency on edge devices |
| E4B | "effective" 4 billion | — | 128K tokens | Smartphones, Raspberry Pi, Jetson Orin Nano | — | Compute and memory efficiency on edge devices |
| 26B MoE | 3.8 billion active | MoE | up to 256K tokens | Personal computers, consumer GPUs (quantized), workstations, accelerators | #6 | Optimized for latency, 3.8 billion active parameters, fast token generation |
| 31B Dense | — | Dense | up to 256K tokens | Personal computers, consumer GPUs (quantized), workstations, accelerators | #3 | Maximum quality, base for fine-tuning |

The 31B model currently holds the 3rd position among all open models worldwide on the Arena AI Text Leaderboard, while the 26B MoE model ranks 6th. Google states that Gemma 4 outperforms models 20 times its size. For developers, this translates to high-performance results with significantly reduced hardware requirements.

| Benchmark | Gemma 4 31B IT Thinking | Gemma 4 26B A4B IT Thinking | Gemma 4 E4B IT Thinking | Gemma 4 E2B IT Thinking | Gemma 3 27B IT |
|---|---|---|---|---|---|
| Arena AI (text) (as of 2/6/24) | 1452 | 1441 | — | — | 1365 |
| MMLU (multilingual Q&A, no tools) | 85.2% | 82.6% | 69.4% | 60.0% | 67.6% |
| MMMU Pro (multimodal reasoning) | 76.9% | 73.8% | 52.6% | 44.2% | 49.7% |
| AIME 2026 (mathematics, no tools) | 89.2% | 88.3% | 42.5% | 37.5% | 20.8% |
| LiveCodeBench v6 (competitive coding problems) | 80.0% | 77.1% | 52.0% | 44.0% | 29.1% |
| GPQA Diamond (scientific knowledge, no tools) | 84.3% | 82.3% | 58.6% | 43.4% | 42.4% |
| τ2-bench (agentic tool use, retail) | 86.4% | 85.5% | 57.5% | 29.4% | 6.6% |

The two larger models are designed for workstations and servers. The unquantized bfloat16 weights of the 31B model can fit on a single 80 GB NVIDIA H100 GPU, offering powerful local deployment capabilities.
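The single-GPU claim is easy to sanity-check: bfloat16 stores each parameter in 2 bytes, so 31 billion parameters occupy roughly 62 GB of weights, leaving headroom on an 80 GB H100 for activations and the KV cache. A quick back-of-the-envelope check:

```python
# Back-of-the-envelope weight-memory estimate for the 31B dense model.
PARAMS = 31e9          # 31 billion parameters
BYTES_PER_PARAM = 2    # bfloat16 = 16 bits = 2 bytes

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9  # decimal GB, as GPU specs use
print(f"bfloat16 weights: {weights_gb:.0f} GB")  # prints "bfloat16 weights: 62 GB"
assert weights_gb < 80  # fits on a single 80 GB H100
```

Note this covers the weights only; real deployments also budget memory for activations and the KV cache, which grows with context length and batch size.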
