OpenAI has launched two new open-weight large language models (LLMs), gpt-oss-20b and gpt-oss-120b, designed to enhance AI accessibility for developers and enterprises. These models, optimised for NVIDIA's Blackwell platform, support a range of generative and agent-based AI applications. The gpt-oss models mark OpenAI's first open-weight release in over four years.
The models utilise a mixture-of-experts architecture and support chain-of-thought reasoning, instruction following and tool use. The larger gpt-oss-120b is designed for data-centre-class NVIDIA GPUs, while gpt-oss-20b is suited to consumer devices with at least 16GB of memory, including GeForce RTX GPUs, where it can reach up to 256 tokens per second on the GeForce RTX 5090. NVIDIA has optimised both models for its GPUs, enabling efficient inference from cloud to PC.
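For a sense of what running the smaller model on such a device looks like, here is a minimal sketch that sends a prompt to a locally served gpt-oss-20b through the Ollama Python client. The model tag gpt-oss:20b is an assumption for illustration, and the sketch presumes the model has already been downloaded and the Ollama service is running.

```python
# Minimal local-inference sketch. Assumes the Ollama service is running
# and that a model tagged "gpt-oss:20b" is available locally (the tag is
# an assumption for this example, not a detail from the announcement).
import ollama

response = ollama.chat(
    model="gpt-oss:20b",
    messages=[
        {
            "role": "user",
            "content": "Summarise mixture-of-experts routing in two sentences.",
        },
    ],
)

# Print the assistant's reply text.
print(response["message"]["content"])
```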
The collaboration between NVIDIA and OpenAI aims to make advanced AI more accessible by integrating NVIDIA's hardware and software into the open-source AI landscape. Developers can deploy the models through common AI tools such as Ollama, llama.cpp, and Microsoft AI Foundry Local. The models were trained on NVIDIA H100 GPUs and support context lengths of up to 131,072 tokens.
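To illustrate the tool-use capability alongside one of these deployment paths, the sketch below calls a locally served model through Ollama's OpenAI-compatible endpoint using the openai Python client. The endpoint URL, the gpt-oss:20b model tag and the get_weather tool definition are assumptions made for the example rather than details from the announcement.

```python
# Tool-use sketch against a locally served model via an OpenAI-compatible
# endpoint (Ollama's default local address is assumed; the model tag and
# the get_weather tool are hypothetical, defined only for this example).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Return the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

resp = client.chat.completions.create(
    model="gpt-oss:20b",
    messages=[{"role": "user", "content": "What's the weather in Austin right now?"}],
    tools=tools,
)

msg = resp.choices[0].message
if msg.tool_calls:
    # The model chose to call a tool; print the requested function and arguments.
    for call in msg.tool_calls:
        print(call.function.name, call.function.arguments)
else:
    # The model answered directly without using a tool.
    print(msg.content)
```

The same request shape works against any OpenAI-compatible server, so the example would apply equally to a llama.cpp server exposing that API, with only the base URL and model name changed.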