Google Ships Gemini Flash-Lite

Google Ships Gemini Flash-Lite

3 March 2026

What happened

Google released Gemini 3.1 Flash-Lite, its newest, most cost-efficient Gemini 3 series model, now in preview via Gemini API and Vertex AI. Priced at $0.25/1M input tokens and $1.50/1M output tokens, it outperforms Gemini 2.5 Flash with 2.5X faster Time to First Answer Token and 45% increased output speed. The model achieves an Elo score of 1432 on Arena.ai and benchmark results, including 86.9% on GPQA Diamond and 76.8% on MMMU Pro.

Why it matters

Operational expenditure for platform engineers and procurement teams managing high-frequency AI workloads will decrease due to Gemini 3.1 Flash-Lite's reduced pricing and increased speed. The model's enhanced performance, including its Elo score and benchmark results, architects design and deploy more complex, high-volume applications for tasks like translation, content moderation, and UI generation at scale. This release follows Google's recent detailing of Gemini 3.1 Pro, indicating a quick iteration cycle for its model offerings.

AI generated content may differ from the original.

Published on 3 March 2026

Subscribe for Weekly Updates

Stay ahead with our weekly AI and tech briefings, delivered every Tuesday.

Google Ships Gemini Flash-Lite