DeepSeek has launched its latest AI model, V3.1, featuring advancements in reasoning, coding, and tool use. The model supports both 'thinking' (chain-of-thought reasoning) and 'non-thinking' (direct) generation modes, switchable via its chat template. This hybrid approach marks a shift in DeepSeek's research focus towards the 'agent era'.
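Switching between the two modes is handled at the prompt-template level rather than by loading a different model. Below is a minimal sketch using the Hugging Face `transformers` tokenizer; the `thinking` keyword argument is an assumption about how the toggle is exposed, so check the model card for the exact flag name.

```python
# Minimal sketch: rendering the same conversation in 'thinking' and
# 'non-thinking' modes via the chat template. The `thinking` kwarg is an
# assumption; the model card documents the actual flag.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-V3.1")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain mixture-of-experts in one sentence."},
]

# 'Thinking' mode: the template inserts tokens that elicit chain-of-thought
# reasoning before the final answer.
thinking_prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, thinking=True
)

# 'Non-thinking' mode: the template prompts a direct answer instead.
direct_prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, thinking=False
)

print(thinking_prompt)
print(direct_prompt)
```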
V3.1 has 671B total parameters, of which 37B are activated per token, utilising a Mixture-of-Experts (MoE) design to lower inference costs. It offers a 128K-token context window and was trained using FP8 microscaling for efficient arithmetic on next-generation hardware. DeepSeek claims the model delivers answers faster than its previous R1 model, and the company will adjust its API pricing from September 6.
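The gap between total and activated parameters comes from MoE routing: a small router scores all expert sub-networks for each token, but only the top-scoring few actually run. The toy sketch below illustrates the idea with made-up sizes; it is not DeepSeek's implementation.

```python
# Toy illustration of top-k MoE routing: each token only runs through a
# handful of experts, so most parameters stay idle on any given token.
import torch

num_experts, top_k, d_model = 16, 2, 64  # illustrative sizes, not V3.1's

experts = [torch.nn.Linear(d_model, d_model) for _ in range(num_experts)]
router = torch.nn.Linear(d_model, num_experts)

def moe_forward(token: torch.Tensor) -> torch.Tensor:
    scores = router(token)                          # one score per expert
    weights, idx = torch.topk(scores.softmax(-1), top_k)
    # Only the top_k experts execute for this token; the rest are skipped,
    # which is how a large total parameter count yields a much smaller
    # activated count per token.
    return sum(w * experts[i](token) for w, i in zip(weights, idx.tolist()))

token = torch.randn(d_model)
print(moe_forward(token).shape)  # torch.Size([64])
```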
DeepSeek's V3.1 is available on Hugging Face and supports multi-turn conversations with explicit tokens for system prompts, user queries, and assistant responses. Benchmarks show strong performance across general knowledge, coding, maths, and tool use.
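Because each turn is wrapped in role-specific tokens, a multi-turn exchange can be rendered and inspected directly. A brief sketch, again assuming the Hugging Face tokenizer and the repository name shown on the model's Hugging Face page:

```python
# Sketch: rendering a multi-turn conversation so the role markers for
# system, user, and assistant turns are visible in the prompt string.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-V3.1")

conversation = [
    {"role": "system", "content": "You are a concise coding assistant."},
    {"role": "user", "content": "Write a function that reverses a string."},
    {"role": "assistant", "content": "def reverse(s): return s[::-1]"},
    {"role": "user", "content": "Now make it handle None input."},
]

# tokenize=False returns the raw prompt string, showing how each turn is
# delimited before the next assistant response is generated.
prompt = tokenizer.apply_chat_template(
    conversation, tokenize=False, add_generation_prompt=True
)
print(prompt)
```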