Summary

Nvidia CEO Jensen Huang presented a series of new chips and platforms for AI inference, the phase in which trained models are put to practical use, at Nvidia's developer conference in San Jose. The company doubled its revenue forecast for AI chips to at least one trillion dollars within two years. With its new ultrafast inference rack, built from Groq and Vera Rubin chips, Nvidia aims to answer criticism of its high energy consumption. Huang singled out the Austrian AI platform Open Claw as a breakthrough and announced extensive partnerships in the robotaxi and robotics sectors.

People

  • Jensen Huang (CEO, Nvidia)

Topics

  • AI inference and practical applications
  • Semiconductor technology and GPU design
  • Energy efficiency and cooling systems
  • AI agents and robotics partnerships

Clarus Lead

Nvidia is repositioning itself as an infrastructure leader for the AI inference phase, in which trained models are run in production. CEO Huang announced an inference rack that processes 700 million tokens per second, 350 times faster than the prior Hopper generation. The revenue forecast, doubled to at least one trillion dollars, signals massive demand for specialized computing power, while partnerships with robotaxi providers and robotics manufacturers dramatically broaden the range of applications.

Detailed Summary

Huang frames current AI development as a fundamental paradigm shift: from the training phase to autonomous application through AI agents. Platforms such as Anthropic's Claude Code let AI not just answer questions but execute concrete tasks, from shopping to email management to autonomous driving. This era, Huang argues, calls for chips with low latency and low power consumption rather than maximum raw computing power.
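To make concrete why agentic workloads stress latency rather than raw throughput, here is a minimal, purely illustrative agent loop in Python. It reflects neither Claude Code's nor Nvidia's actual implementation; call_model and its 50 ms delay are hypothetical placeholders for a real inference endpoint.

    import time

    def call_model(prompt: str) -> str:
        # Hypothetical stand-in for an inference endpoint; in a real
        # agent, this round trip to the model dominates wall-clock time.
        time.sleep(0.05)  # simulate 50 ms of inference latency
        return "use_tool" if prompt.count("|") < 3 else "done"

    def run_agent(goal: str, max_steps: int = 10) -> int:
        # An agent decomposes a goal into sequential model calls, so
        # total time is roughly steps x per-call latency. That is why
        # the pitch centers on latency, not peak training throughput.
        context = goal
        for step in range(1, max_steps + 1):
            action = call_model(context)
            if action == "done":
                return step
            context += " | " + action  # feed the tool result back in
        return max_steps

    start = time.time()
    steps = run_agent("book a table, then email the confirmation")
    print(f"{steps} sequential model calls took {time.time() - start:.2f}s")

Four sequential calls at 50 ms each already cost 0.2 seconds; real agents chain far more, which is why per-call latency rather than aggregate throughput sets the user-perceived pace.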

Nvidia's solution combines two components: the inference-optimized Groq chip (Nvidia acquired Groq for 20 billion dollars in 2025) and the newly designed Vera Rubin chip with a revolutionary water-cooling system. The high-performance rack integrates 256 Groq and 72 Rubin chips and is intended to cut energy costs significantly, a direct response to months of criticism of Nvidia's energy inefficiency in inference workloads.
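The headline figures invite a quick back-of-envelope check. The sketch below assumes, though the source does not state this, that the 700 million tokens per second apply to the rack as a whole and that the 350× speedup is measured against a comparable Hopper-generation setup:

    # Back-of-envelope arithmetic on the claimed rack numbers.
    # Assumptions (not confirmed in the source): the 700M tokens/s
    # figure covers the whole rack, and the 350x speedup is relative
    # to an equivalent Hopper-generation rack.
    rack_tokens_per_s = 700_000_000
    groq_chips = 256
    speedup_vs_hopper = 350

    per_groq_chip = rack_tokens_per_s / groq_chips          # ~2.7M tokens/s
    implied_hopper = rack_tokens_per_s / speedup_vs_hopper  # ~2.0M tokens/s

    print(f"~{per_groq_chip / 1e6:.1f}M tokens/s per Groq chip")
    print(f"implied Hopper-era baseline: ~{implied_hopper / 1e6:.1f}M tokens/s")

If those assumptions hold, a single one of the rack's 256 Groq chips would slightly outrun the entire implied prior-generation baseline, which underlines how much of the claim rides on the benchmark's definition (see Critical Question 1).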

Particularly noteworthy: Huang lauded the Austrian Open Claw platform as a development whose historical significance he compared to that of Linux or HTTPS. This free, open-source solution lets AI agents be controlled without proprietary dependencies. Nvidia is building on it, developing specialized variants for applications such as climate forecasting and autonomous driving. New robotaxi partnerships with BYD, Geely Auto, Hyundai, and Nissan, along with Disney collaborations on theme-park robotics, signal an aggressive expansion into hardware applications.

Key Findings

  • Inference Pivot: Nvidia shifts focus from model training to the application phase with specialized hardware architecture
  • Forecast Doubling: Revenue to at least 1 trillion dollars in two years (previously 500 billion by end of 2026)
  • Technological Leaps: Vera Rubin rack 350× faster at token processing, 500× higher memory bandwidth vs. Hopper
  • Ecosystem Strategy: Close collaboration with Open Claw platform and partnerships in robotics/mobility

Critical Questions

  1. Evidence (Data Quality): Huang cites performance figures (700 million tokens/second, 350× faster), but these refer to a specific comparison with Hopper. How valid is this benchmark for real inference workloads across diverse applications?

  2. Conflicts of Interest: Huang forecasts a trillion-dollar opportunity for Nvidia—this statement is not independent of Nvidia's commercial interests. What independent market research supports this magnitude?

  3. Causality: Huang argues that the energy problem is solved through water cooling. Are latency and memory bandwidth—the core inference requirements—really addressable through hardware alone, or do algorithmic optimizations also play a role?

  4. Implementation Risks: The announced robotaxi infrastructure with BYD and other manufacturers is based on vehicles not yet in production. How likely is mass production within the stated timeframes?

  5. Competitive Context: Huang highlights Groq as a specialized provider, while AMD, Cerebras, and others are also expanding into inference hardware. What structural advantages would guarantee Nvidia's market dominance in a fragmenting market?

  6. Open Claw and Dependency: While Huang celebrates Open Claw as an independent ecosystem, Nvidia is developing proprietary variants on top of it. How sustainable is that openness if the market leader's proprietary versions come to dominate?

  7. Energy Cost Reality: Statements about energy savings through water cooling lack concrete benchmarks. What percentage reduction is realistically expected, and under what conditions?


Source Directory

Primary Source: Ultrafast new chips, more robots and AI agents: Nvidia celebrates the new era of inference – Neue Zürcher Zeitung – https://www.nzz.ch/technologie/ultraschnelle-neue-chips-mehr-roboter-und-ki-agenten-nvidia-feiert-das-neue-zeitalter-der-inferenz-ld.1929584

Verification Status: ✓ 17.03.2026


This text was created with the support of an AI model. Editorial Responsibility: clarus.news | Fact-Check: 17.03.2026