Nvidia's Rubin CPX GPU Unveiled

9 September 2025

Nvidia has unveiled the Rubin CPX GPU, designed for massive-context AI processing, enabling systems to handle million-token software coding and generative video workloads. The Rubin CPX works alongside Vera CPUs and Rubin GPUs in the Vera Rubin NVL144 CPX platform. This integrated MGX system delivers 8 exaflops of AI compute, 7.5x the AI performance of GB300 NVL72 systems, along with 100TB of fast memory and 1.7 petabytes per second of memory bandwidth per rack. A dedicated Rubin CPX compute tray will be available for customers reusing existing Vera Rubin 144 systems.

The Rubin CPX GPU delivers up to 30 petaflops of compute at NVFP4 precision and features 128GB of GDDR7 memory, offering 3x the attention performance of GB300 NVL72 systems. The GPU integrates video decoders, video encoders, and long-context inference processing on a single chip. Built on the Rubin architecture, it uses a cost-efficient monolithic die design optimised for high performance and energy efficiency in AI inference tasks. It is designed to enhance long-context performance, maximise ROI, and deliver scalable efficiency in context-aware inference deployments.
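The NVFP4 figure refers to a 4-bit floating-point format, and low-bit formats like it are central to how inference throughput numbers of this scale are reached. As an illustration only, here is a minimal sketch of 4-bit (E2M1-style) quantization with a per-block scale; the value grid and scaling scheme are simplifications assumed for the example, not a specification of Nvidia's actual format:

```python
# Illustrative sketch of 4-bit floating-point (E2M1-style) quantization.
# The value grid and per-block scaling below are simplifications, used
# only to show the idea behind low-precision inference formats.

# Positive magnitudes representable by an E2M1 mini-float
# (the sign is handled separately).
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_block(values):
    """Scale a block so its largest magnitude maps to 6.0, then snap
    each value to the nearest representable grid point."""
    amax = max(abs(v) for v in values) or 1.0
    scale = amax / 6.0
    quantized = []
    for v in values:
        mag = min(FP4_GRID, key=lambda g: abs(abs(v) / scale - g))
        quantized.append((mag if v >= 0 else -mag) * scale)
    return quantized, scale

weights = [0.1, -0.7, 2.5, 6.0, -3.2]
q, s = quantize_block(weights)
```

Storing each value in 4 bits (plus one shared scale per block) is what lets a fixed silicon and memory budget move several times more operands per second than 16-bit formats, which is where precision-qualified petaflop figures come from.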

Nvidia's SMART framework optimises inference across scale, performance, architecture and ROI using a full-stack disaggregated infrastructure. The Rubin CPX is purpose-built to handle million-token coding and generative video applications. It enables AI coding assistants to evolve into systems that comprehend and optimise large-scale software projects. The Rubin CPX is expected to be generally available in 2026.
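Disaggregated inference of this kind splits a request's compute-bound context (prefill) phase from its memory-bandwidth-bound generation (decode) phase and serves each on hardware suited to it. The routing idea can be sketched as follows; the class, pool names, and cost model here are hypothetical illustrations, not Nvidia's API:

```python
# Hypothetical sketch of disaggregated inference routing: compute-heavy
# prefill work goes to a context pool (CPX-style GPUs), while
# bandwidth-heavy decode work goes to a generation pool.
from dataclasses import dataclass

@dataclass
class Request:
    prompt_tokens: int   # length of the input context
    max_new_tokens: int  # number of tokens to generate

def route(request: Request, phase: str) -> str:
    """Pick a worker pool by phase: prefill is compute-bound,
    decode is memory-bandwidth-bound."""
    if phase == "prefill":
        return "context_pool"     # high-FLOPs GPUs (CPX-style)
    return "generation_pool"      # high-bandwidth GPUs

def phase_cost(request: Request, phase: str) -> int:
    """Rough cost model: prefill attention work grows with the square of
    the context length, while decode work scales with the KV cache that
    must be re-read for each generated token."""
    if phase == "prefill":
        return request.prompt_tokens ** 2
    return request.prompt_tokens * request.max_new_tokens

req = Request(prompt_tokens=1_000_000, max_new_tokens=512)
prefill_pool = route(req, "prefill")
decode_pool = route(req, "decode")
```

For a million-token prompt the quadratic prefill term dominates, which is why routing that phase to a compute-dense part like the Rubin CPX pays off.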


