What happened
Qwen released its Qwen3-Next-80B-A3B model on September 9, 2025, featuring 80 billion total parameters with only 3 billion active during inference. This efficiency-focused release introduces a hybrid attention design that interleaves Gated DeltaNet with Gated Attention layers, and employs an ultra-sparse Mixture-of-Experts (MoE) architecture (512 experts in total, with 10 routed experts plus 1 shared expert active per token). The model also supports a native 262K-token context window.
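To make the sparsity concrete, here is a minimal PyTorch sketch of top-k expert routing with an always-on shared expert, the general technique the announcement describes. The layer sizes, expert counts, and module names below are illustrative placeholders, not Qwen3-Next's actual configuration or code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    """Sketch of a top-k MoE layer with a shared expert.

    Dimensions are hypothetical and far smaller than Qwen3-Next's;
    this illustrates the routing pattern, not Qwen's implementation.
    """

    def __init__(self, d_model: int = 256, d_ff: int = 512,
                 n_experts: int = 16, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        # The shared expert runs on every token, independent of routing.
        self.shared = nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                                    nn.Linear(d_ff, d_model))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (n_tokens, d_model)
        scores = self.router(x)                         # (n_tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # top-k experts per token
        weights = F.softmax(weights, dim=-1)            # normalize over chosen k
        out = self.shared(x)                            # always-on shared expert
        for e, expert in enumerate(self.experts):
            for slot in range(self.top_k):
                mask = idx[:, slot] == e                # tokens routed to expert e
                if mask.any():
                    out[mask] = out[mask] + weights[mask, slot, None] * expert(x[mask])
        return out

moe = SparseMoE()
tokens = torch.randn(8, 256)
print(moe(tokens).shape)  # torch.Size([8, 256])
```

Only the router and the handful of selected experts run per token, which is why the parameter count that matters for per-token compute is the active count, not the total.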
Why it matters
For platform engineers, the new architecture promises lower inference costs and higher throughput: the ultra-sparse MoE design and hybrid attention deliver strong performance while activating only a small fraction of the model's parameters, making large models cheaper to serve. Data architects can process far longer documents and datasets in a single pass thanks to the 262K native context. Procurement teams should weigh these efficiency gains when evaluating models for long-context applications.
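As a back-of-envelope illustration of the cost claim (a rule of thumb, not a vendor benchmark): forward-pass compute per token is commonly approximated as about 2 FLOPs per active parameter, so the 80B/3B split from the announcement implies a large per-token compute reduction versus a dense model of the same size.

```python
# Rough estimate only: ~2 FLOPs per active parameter per token is a
# common approximation, not a measured figure. 80B/3B are from the
# announcement.
total_params = 80e9   # all parameters in the model
active_params = 3e9   # parameters actually used per token

dense_flops = 2 * total_params   # hypothetical dense 80B model
moe_flops = 2 * active_params    # sparse model with 3B active

print(f"Active fraction: {active_params / total_params:.1%}")              # -> 3.8%
print(f"Approx. per-token compute reduction: {dense_flops / moe_flops:.0f}x")  # -> 27x
```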