
Google Releases Gemma 4 12B

3 June 2026 · By Pulse24 desk

What happened

Google DeepMind has released Gemma 4 12B, a 12-billion-parameter multimodal model aimed at agentic workloads on laptops. Its encoder-free architecture feeds vision and audio inputs directly into the LLM backbone instead of routing them through separate encoders. The model runs locally in 16GB of VRAM or unified memory and, according to Google, performs comparably to the company's larger 26B MoE model. It is released under an Apache 2.0 licence and includes Multi-Token Prediction drafters to reduce latency.
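
For a rough sense of what the 16GB figure implies, the sketch below estimates the memory taken by the weights alone at a few common precisions. The parameter count and the 16GB budget come from the report. The precisions are illustrative assumptions, since the report does not say which weight format the 16GB figure refers to, and the function name `weight_memory_gb` is hypothetical. Activations, the KV cache and runtime overhead are ignored, so real usage would be higher.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


PARAMS = 12e9    # 12 billion parameters, as stated in the report
BUDGET_GB = 16   # memory figure quoted in the report

# Assumed precisions for illustration only; the report does not specify one.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    gb = weight_memory_gb(PARAMS, bits)
    verdict = "fits within" if gb < BUDGET_GB else "exceeds"
    print(f"{label}: about {gb:.0f} GB of weights, {verdict} {BUDGET_GB} GB")
```

By this arithmetic, 16-bit weights alone would need about 24 GB, so a 16GB machine would presumably depend on a lower-precision format, though the report does not confirm which.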

Why it matters

This release lowers the hardware barrier for advanced multimodal and agentic workflows by letting them run locally on consumer laptops. Platform engineers and developers get a capable, open-licence model for on-device AI applications, reducing reliance on cloud inference for complex tasks. Dropping the separate encoders cuts latency and memory usage, which improves performance and lowers cost for edge deployments. The release follows Google's earlier Gemma 4 models and further expands its open-weight ecosystem.

Source · blog.google
AI-processed content may differ from the original.