Las Vegas, Jan 5, 2026, 15:39 (PST)
Nvidia CEO Jensen Huang said at CES in Las Vegas on Monday that the company’s next-generation Vera Rubin AI platform is in full production and can deliver five times the artificial-intelligence computing of its previous chips when running chatbots and other applications. He said Rubin uses a proprietary kind of data to reach that gain, adding: “This is how we were able to deliver such a gigantic step up in performance.” The push comes as rivals such as Advanced Micro Devices, along with in-house chips from customers like Alphabet’s Google, compete more aggressively in the market for running trained AI models at scale. Reuters
Rubin is Nvidia’s next major data-center platform after Blackwell, and the CES launch is an early marker for cloud and enterprise buyers lining up 2026 capacity. Demand is shifting from training large models to inference — the stage where companies deploy AI to answer user queries in real time — and that is where latency and cost per response become central. The Verge
Nvidia has said Rubin-based products will be available through partners in the second half of 2026, giving customers a timetable as they plan data-center builds. Much of the attention around the launch has centered on Nvidia’s pledge to lower the cost per token, the unit of text that AI systems generate and consume. Tom’s Hardware
Nvidia said the Rubin platform combines six chips (the Vera CPU, Rubin GPU, NVLink 6 switch, ConnectX-9 networking chip, BlueField-4 data processor and Spectrum-6 Ethernet switch) designed as a single system rather than standalone parts. Its flagship NVL72 configuration links 72 graphics processing units, or GPUs, with 36 central processing units, or CPUs, in a rack-scale server. Nvidia said Rubin can cut inference token costs by as much as a factor of 10 and train mixture-of-experts models, systems that route tasks to specialized sub-models, with one-quarter as many GPUs as its Blackwell platform. NVIDIA Newsroom
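To put those ratios in concrete terms, the short sketch below works through the arithmetic. The baseline figures (a cost of $1.00 per million tokens and a 1,024-GPU training cluster) are hypothetical placeholders chosen for illustration, not numbers from Nvidia; only the factors of 10 and 4 and the 72-GPU, 36-CPU rack come from the company's claims, and both factors are stated as upper bounds.

```python
import math

# Figures taken from Nvidia's stated claims (best case, "up to").
TOKEN_COST_REDUCTION = 10   # inference token cost cut by up to 10x
GPU_REDUCTION = 4           # one-quarter as many GPUs for MoE training
NVL72_GPUS = 72
NVL72_CPUS = 36


def scaled_cost(baseline_cost: float, reduction_factor: float) -> float:
    """Cost after dividing a baseline by the claimed reduction factor."""
    return baseline_cost / reduction_factor


def gpus_needed(baseline_gpus: int, reduction_factor: float) -> int:
    """GPU count after the claimed reduction, rounded up to a whole GPU."""
    return math.ceil(baseline_gpus / reduction_factor)


# Hypothetical baselines, for illustration only.
baseline_cost_per_million_tokens = 1.00   # dollars (assumed)
baseline_training_gpus = 1024             # assumed cluster size

best_case_cost = scaled_cost(baseline_cost_per_million_tokens, TOKEN_COST_REDUCTION)
best_case_gpus = gpus_needed(baseline_training_gpus, GPU_REDUCTION)
gpus_per_cpu = NVL72_GPUS / NVL72_CPUS

print(f"Best-case cost per million tokens: ${best_case_cost:.2f}")
print(f"Best-case GPUs for the same training job: {best_case_gpus}")
print(f"GPUs per CPU in an NVL72 rack: {gpus_per_cpu:.0f}")
```

Under those assumed baselines, the best case works out to 10 cents per million tokens and 256 GPUs, with two GPUs for every CPU in the rack. Real-world savings would depend on the workload and could fall well short of the headline factors.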
In a technical briefing, Nvidia described Rubin as an “AI factory” design that treats the whole rack, not a single server, as the unit of computing, with the aim of keeping performance steady when systems are fully loaded. The company said it is building end-to-end encryption and other security features into the rack to protect proprietary data used for training and inference. NVIDIA Developer
Nvidia also unveiled a BlueField-4-based storage platform aimed at “context memory” — the key-value cache that helps chatbots keep track of long conversations. The company said the system can boost tokens-per-second throughput and power efficiency by up to five times versus traditional storage by sharing that context across clusters of AI servers. GlobeNewswire
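For readers unfamiliar with the term, the sketch below shows in miniature what a key-value cache does in a generic transformer-style model: it stores one key vector and one value vector per past token so that each new token can attend over the stored context instead of reprocessing the whole conversation. This is a toy illustration in plain Python, not a description of Nvidia's storage platform; the class name `KVCache` and its methods are hypothetical.

```python
import math


class KVCache:
    """Toy key-value cache: one (key, value) vector pair per past token."""

    def __init__(self):
        self.keys = []
        self.values = []

    def append(self, key, value):
        """Store the key and value vectors computed for a new token."""
        self.keys.append(key)
        self.values.append(value)

    def attend(self, query):
        """Scaled dot-product attention of one query over the cached context."""
        scale = math.sqrt(len(query))
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale
            for key in self.keys
        ]
        # Softmax over the scores, shifted by the max for numerical stability.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        dim = len(self.values[0])
        return [
            sum(w * v[i] for w, v in zip(weights, self.values)) / total
            for i in range(dim)
        ]


cache = KVCache()
cache.append(key=[1.0, 0.0], value=[1.0, 0.0])   # first token of the context
cache.append(key=[1.0, 0.0], value=[0.0, 1.0])   # second token, same key
output = cache.attend(query=[1.0, 0.0])
print(output)
```

Because the cache grows with every token, long conversations make it large, which is the pressure that shared context storage of the kind Nvidia describes is meant to relieve.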
The company said its DGX SuperPOD reference architecture will serve as a blueprint for deploying Rubin-based systems across enterprise and research customers. Nvidia said DGX Rubin systems are designed to reduce the cost of inference token generation while supporting long-context reasoning workloads. NVIDIA Blog
In autonomous vehicles, Nvidia said it is releasing the Alpamayo family of open AI models, simulation tools and datasets to tackle rare “long-tail” driving scenarios that are hard to cover with standard training data. The company said the package is aimed at helping developers build reasoning-based systems that can be tested in simulation before road deployment. NVIDIA Newsroom
Mercedes-Benz said it will launch MB.DRIVE ASSIST PRO in the United States later this year, letting vehicles drive on city streets under driver supervision and challenging Tesla’s Full Self-Driving feature set. Mercedes put the price at $3,950 for three years and said the system uses about 30 sensors feeding a computer capable of 508 trillion operations per second. Nvidia said the new Mercedes-Benz CLA will use its DRIVE AV software and support over-the-air updates. Reuters
The next test for Rubin will be whether customers adopt Nvidia’s proprietary data approach and pay for tightly integrated racks, rather than leaning further into in-house chips or cheaper alternatives. In cars, the technology is still constrained by the requirement that drivers stay alert and ready to intervene, limiting how quickly city-street automation becomes a mass-market feature.