Nvidia unveils Rubin to power next-generation AI supercomputers

Nvidia launches its Rubin AI platform at CES, promising major efficiency gains as demand for training and inference accelerates.
Nvidia has unveiled its Rubin platform, a new generation of AI computing architecture built around six custom-designed chips, as the company seeks to extend its lead in infrastructure powering large-scale artificial intelligence systems.
The announcement was made at CES, where Nvidia said Rubin is designed to function as a single, tightly integrated AI supercomputer, combining processing, networking and storage to reduce the cost and complexity of deploying advanced models.
The Rubin platform succeeds Nvidia’s Blackwell architecture and is positioned as a response to surging global demand for AI training and inference. Reuters has previously reported that cloud providers and AI developers are racing to expand compute capacity as models grow larger, more complex and more expensive to run.
According to Nvidia, Rubin delivers up to a tenfold reduction in inference token costs and allows mixture-of-experts (MoE) models to be trained with a quarter of the GPUs that Blackwell requires. The company said these gains come from deep co-design across hardware and software rather than incremental chip upgrades.
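Nvidia has not published the workload assumptions behind those figures, but a back-of-the-envelope sketch shows what they would mean at fleet scale. The baseline cost, GPU count and traffic numbers below are hypothetical placeholders chosen for illustration, not vendor data.

```python
# Back-of-the-envelope view of Nvidia's claimed Rubin gains.
# All baseline numbers are hypothetical placeholders, not vendor figures.

BLACKWELL_COST_PER_M_TOKENS = 2.00   # assumed $ per million inference tokens
RUBIN_TOKEN_COST_REDUCTION = 10      # "up to tenfold" reduction claim
BLACKWELL_GPUS_FOR_MOE_RUN = 4096    # assumed GPUs for one MoE training run
RUBIN_GPU_REDUCTION = 4              # "a quarter of the GPUs" claim

rubin_cost_per_m = BLACKWELL_COST_PER_M_TOKENS / RUBIN_TOKEN_COST_REDUCTION
rubin_gpus = BLACKWELL_GPUS_FOR_MOE_RUN // RUBIN_GPU_REDUCTION

monthly_tokens_m = 500_000  # assumed 500 billion tokens served per month
print(f"Inference bill: ${BLACKWELL_COST_PER_M_TOKENS * monthly_tokens_m:,.0f}/mo "
      f"-> ${rubin_cost_per_m * monthly_tokens_m:,.0f}/mo")
print(f"MoE training run: {BLACKWELL_GPUS_FOR_MOE_RUN} GPUs -> {rubin_gpus} GPUs")
```

Under those assumed inputs, the claimed ratios translate into an order-of-magnitude drop in serving costs and a training cluster a quarter of the size; the real savings would depend on the models and workloads Nvidia used for its comparison.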
The platform is named after Vera Florence Cooper Rubin, the American astronomer whose work reshaped understanding of dark matter. At its core are six components: the Vera CPU, Rubin GPU, NVLink 6 switch, ConnectX-9 SuperNIC, BlueField-4 data processing unit, and Spectrum-6 Ethernet switch.
Speaking at the launch, Jensen Huang, Nvidia’s founder and chief executive, said the platform arrives as computing needs for AI workloads “are going through the roof”, adding that the company’s strategy is to deliver a new generation of AI supercomputers on an annual cadence.
Rubin introduces rack-scale systems such as the Vera Rubin NVL72, which combines dozens of CPUs and GPUs into a single unit, as well as the HGX Rubin NVL8, designed for more traditional server deployments. Nvidia said the systems are optimised for long-context reasoning, agentic AI and large-scale inference.
The company also highlighted advances in networking and storage. Its Spectrum-X Ethernet Photonics systems are designed to improve power efficiency and uptime in data centres, while a new Inference Context Memory Storage Platform, built around BlueField-4, aims to speed up AI reasoning by sharing memory more efficiently across systems.
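Nvidia has not detailed how the Inference Context Memory Storage Platform works internally, but the general idea of "sharing memory across systems" is to reuse previously computed context state for long prompts rather than recomputing it for every request. The toy cache below is a conceptual illustration of that pattern only, with placeholder names; it does not reflect Nvidia's actual design or APIs.

```python
# Conceptual illustration only: reusing precomputed context state across
# requests, so repeated long prefixes skip the expensive "prefill" step.
import hashlib


class ContextMemoryStore:
    """Maps a prompt prefix to its (placeholder) precomputed context state."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(prefix: str) -> str:
        return hashlib.sha256(prefix.encode()).hexdigest()

    def get_or_compute(self, prefix: str, compute_fn):
        key = self._key(prefix)
        if key not in self._store:
            # First request with this prefix pays the full computation cost.
            self._store[key] = compute_fn(prefix)
        return self._store[key]


store = ContextMemoryStore()
# Later requests that share the same prefix reuse the stored state.
state = store.get_or_compute("shared system prompt...",
                             lambda p: f"state({len(p)} chars)")
print(state)
```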
Major cloud providers including Microsoft, Amazon Web Services, Google Cloud and Oracle Cloud Infrastructure are expected to deploy Rubin-based systems in 2026. Microsoft said it plans to use the platform in future AI data centres, including its Fairwater “AI superfactory” projects.
Specialist AI cloud provider CoreWeave said it will be among the first to offer Rubin-based capacity, integrating the platform into its managed infrastructure. Nvidia also said it is expanding its collaboration with Red Hat to deliver an enterprise-ready software stack optimised for Rubin.
Rubin-based products are expected to become available from partners in the second half of 2026, with server makers such as Dell, HPE, Lenovo and Supermicro planning systems built around the new architecture.
The launch underlines Nvidia’s push towards ever-larger, rack-scale systems as AI workloads shift from experimentation to production. As models demand longer context, higher reliability and lower operating costs, Rubin signals where the next phase of AI infrastructure is heading.