NVIDIA Unveils Rubin Platform at CES — A Six-Chip Architecture Built for Agentic AI
NVIDIA isn't waiting for the industry to catch up to Blackwell. At CES 2025, the company unveiled Rubin — a six-chip platform architected from the ground up for agentic AI, mixture-of-experts models, and long-context reasoning. It's the clearest signal yet of where Jensen Huang thinks AI infrastructure needs to go.
The Rubin platform comprises the NVIDIA Vera CPU, Rubin GPU, NVLink 6 Switch, ConnectX-9 SuperNIC, BlueField-4 DPU, and Spectrum-6 Ethernet Switch. Each component was codesigned to work as a unified system — NVIDIA's bet that the era of bolting together commodity parts is over for serious AI workloads.
Hardware Purpose-Built for MoE and Agentic Workloads
The numbers are staggering. The Rubin GPU delivers 50 petaflops of NVFP4 compute powered by a third-generation Transformer Engine. NVLink 6 pushes 3.6TB/s per GPU and 260TB/s per rack — bandwidth that matters enormously for mixture-of-experts architectures where routing between specialized model components happens constantly.
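Those two figures hint at the scale of the NVLink domain. A back-of-envelope check — assuming the per-rack number is simply the per-GPU links summed, which the announcement doesn't spell out:

```python
# NVLink 6 figures from the announcement, in TB/s.
per_gpu_tb_s = 3.6
per_rack_tb_s = 260

# If the rack figure aggregates the per-GPU links, the implied
# number of GPUs in one NVLink domain is:
implied_gpus = per_rack_tb_s / per_gpu_tb_s
print(round(implied_gpus))  # roughly 72
```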
This is NVIDIA reading the tea leaves on model architecture. MoE models like Mixtral and rumored GPT-5 variants don't use all their parameters for every token. They route inputs to specialized "expert" subnetworks. That routing requires massive interconnect bandwidth to avoid bottlenecks. NVLink 6's specs suggest NVIDIA expects MoE to become the dominant paradigm.
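To see why routing stresses the interconnect, here is a toy top-k router in plain NumPy — a sketch of the general MoE dispatch pattern, not NVIDIA's or any particular model's implementation; all sizes and the random router weights are made up:

```python
import numpy as np

rng = np.random.default_rng(0)
num_tokens, d_model = 8, 16   # toy batch
num_experts, top_k = 4, 2     # each token visits 2 of 4 experts

tokens = rng.standard_normal((num_tokens, d_model))
router_w = rng.standard_normal((d_model, num_experts))  # stand-in for a learned router

# Score every token against every expert, keep the top-k per token.
scores = tokens @ router_w
top_experts = np.argsort(-scores, axis=1)[:, :top_k]

# In a multi-GPU deployment, each dispatch counted below is a
# cross-device transfer: a token's activations travel to k experts
# and the results travel back.
tokens_per_expert = np.bincount(top_experts.ravel(), minlength=num_experts)
print(tokens_per_expert.sum())  # num_tokens * top_k = 16 dispatches per layer
```

With dozens of layers and production-sized batches, those dispatches repeat on every forward pass — the traffic the 3.6TB/s per-GPU NVLink figure is meant to absorb.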
The Vera CPU introduces 88 custom Olympus cores with Armv9.2 compatibility. This is NVIDIA's answer to the Arm-based data center push — and a hedge against relying on AMD or Intel for the CPU side of AI infrastructure. Vertical integration continues.
The DGX SuperPOD Foundation
NVIDIA confirmed that DGX SuperPOD remains the foundational design for deploying Rubin-based systems across enterprise and research environments. This matters for procurement: organizations that have standardized on SuperPOD architecture can plan Rubin upgrades without rearchitecting their data centers.
The DGX platform's value proposition has always been integration. NVIDIA handles compute, networking, and software as a single system, eliminating the integration burden that plagues custom builds. With Rubin, that integration extends deeper — six chips designed together rather than adapted to work together.
Why Agentic AI Demands New Hardware
The explicit focus on "agentic AI" is telling. Current LLM deployments are largely stateless — prompt in, response out. Agentic systems maintain state, execute multi-step plans, use tools, and operate over extended time horizons. The computational profile is fundamentally different.
Long-context reasoning — explicitly called out by NVIDIA — means models processing hundreds of thousands or millions of tokens. That requires massive memory bandwidth and the ability to keep enormous context windows in fast-access memory. Rubin's architecture suggests NVIDIA anticipates context windows growing by another order of magnitude.
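The memory pressure is easy to quantify with a standard KV-cache estimate. The formula is conventional; the model dimensions below are hypothetical, chosen to resemble a 70B-class model with grouped-query attention:

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, context_len, bytes_per_elem=2):
    """Bytes needed to hold one sequence's attention keys and values.

    The leading factor of 2 covers K and V; bytes_per_elem=2 assumes FP16/BF16.
    """
    return 2 * num_layers * num_kv_heads * head_dim * context_len * bytes_per_elem

# Hypothetical 70B-class configuration at a one-million-token context:
size = kv_cache_bytes(num_layers=80, num_kv_heads=8, head_dim=128, context_len=1_000_000)
print(f"{size / 2**30:.0f} GiB per sequence")  # ~305 GiB
```

At that scale a single sequence's cache exceeds any current GPU's HBM capacity, which is why million-token contexts force architectural answers — more memory bandwidth, larger NVLink domains, or both.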
The inference economics matter too. NVIDIA claims Rubin is engineered to "reduce the cost of inference token generation." As AI shifts from training-dominated to inference-dominated workloads, the cost per token becomes the critical metric. Agentic systems that run continuously are especially sensitive to inference costs.
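The sensitivity is simple arithmetic. A minimal sketch, with entirely made-up prices and throughput, showing how directly throughput gains translate into per-token cost:

```python
def usd_per_million_tokens(gpu_hour_usd, tokens_per_sec):
    # A GPU sustaining tokens_per_sec produces 3600x that many tokens per hour.
    return gpu_hour_usd / (tokens_per_sec * 3600) * 1_000_000

# Illustrative numbers only: a $4/hour accelerator serving 2,500 tokens/sec,
# versus the same accelerator at double the throughput.
baseline = usd_per_million_tokens(4.0, 2_500)
doubled = usd_per_million_tokens(4.0, 5_000)
print(f"${baseline:.2f} vs ${doubled:.2f} per million tokens")  # $0.44 vs $0.22
```

An agentic system running for hours generates orders of magnitude more tokens than a single chat turn, so any per-token improvement compounds directly into operating cost.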
Timeline and Competitive Context
NVIDIA didn't announce specific availability dates for Rubin at CES, but the Blackwell-to-Rubin cadence follows the company's established two-year GPU architecture cycle. Blackwell systems are shipping now; Rubin likely enters production in late 2026 or early 2027.
The competitive landscape is intensifying. AMD's MI300 series has gained traction with hyperscalers looking to diversify supply chains. Intel's Gaudi accelerators continue finding niche deployments. Custom silicon from Google (TPUs), Amazon (Trainium), and Microsoft (Maia) gives cloud providers alternatives to NVIDIA lock-in.
But NVIDIA's advantage isn't just silicon — it's the full-stack approach. CUDA's ecosystem moat, combined with integrated networking (NVLink, Spectrum), software libraries (cuDNN, TensorRT), and turnkey systems (DGX), creates switching costs that pure chip competitors can't match. Rubin extends this strategy.
The Six-Chip Codesign Bet
The most significant aspect of Rubin may be what it signals about system architecture philosophy. NVIDIA is treating the entire data center rack as a single design problem. CPU, GPU, network switches, NICs, DPUs — all purpose-built and codesigned.
This approach has precedent. Apple's M-series chips succeeded partly because Apple controls the entire stack from silicon to software. NVIDIA is applying similar logic to AI infrastructure: optimize globally, not locally.
The risk is complexity. Shipping six new chips in lockstep is an enormous engineering undertaking, and a delay or bug in any one component cascades through the whole system. NVIDIA's track record suggests they can execute, but this is ambitious even by their standards.
What This Means for AI Infrastructure Buyers
For enterprises and research institutions planning AI infrastructure investments, Rubin creates a timeline question. Blackwell systems are available now and represent a massive leap over Hopper. But Rubin's MoE and agentic AI optimizations could deliver meaningfully better economics for next-generation workloads.
The safe bet: Blackwell for current-generation model deployments, with architectural planning that assumes Rubin migration. NVIDIA's SuperPOD continuity helps here — the physical infrastructure patterns should transfer.
For hyperscalers, Rubin reinforces NVIDIA's position as the safest high-performance choice while creating planning headaches for multi-vendor strategies. The codesigned stack works best when you buy all of it.
The Takeaway
Rubin isn't just a faster GPU. It's NVIDIA's architectural thesis about where AI workloads are headed: sparse MoE models routing between experts, agentic systems maintaining state over extended operations, and inference costs mattering more than training throughput. If that thesis is correct — and the trajectory of model development suggests it is — Rubin positions NVIDIA to dominate the next AI infrastructure cycle the way they've dominated this one.