Overview: Analyzing a rapidly growing company like Nvidia is a significant challenge: it operates across diverse business areas, makes a wide array of product announcements, and pursues an ambitious overarching strategy. CEO Jensen Huang’s keynote speech at the recent GTC Conference only amplified the complexity of the task. As usual, Huang covered an enormous number of topics during his lengthy presentation, leaving many in the audience somewhat bewildered.
However, during a thought-provoking Q&A session with industry analysts shortly thereafter, Huang provided several insights that clarified the reasoning behind the various product and partnership announcements discussed in his keynote.
Essentially, Huang stated that Nvidia is transitioning into an AI infrastructure provider, creating a platform of hardware and software for large cloud computing entities, tech vendors, and enterprise IT departments to develop AI-driven applications.
This is a dramatic shift from Nvidia’s historical role as a supplier of gaming graphics chips and its previous initiatives in machine learning algorithm development. Nonetheless, this new direction unifies a number of recent announcements and offers a clear perspective on the company’s future trajectory.
Nvidia is advancing beyond its semiconductor design roots, taking on a critical role as an infrastructure enabler for the burgeoning world of AI capabilities – or, as Huang called it, an “intelligence manufacturer.”
In his GTC keynote, Huang elaborated on Nvidia’s commitment to enhance the efficient generation of tokens for modern foundation models, connecting these tokens to the intelligence that organizations will leverage for future income streams. He referred to these undertakings as establishing an AI factory, relevant to a broad spectrum of industries.
While ambitious, the emergence of an information-focused economy – and the efficiencies that AI introduces to traditional manufacturing processes – is becoming increasingly evident. From businesses centered entirely around AI services like ChatGPT to robotic manufacturing and the distribution of conventional goods, we are undeniably entering a new economic landscape.
Against this backdrop, Huang elaborated on how Nvidia’s latest innovations promote faster and more efficient token creation. He initially spoke about AI inference, which is typically seen as less complex than the extensive AI training processes that initially propelled Nvidia to prominence. However, Huang contended that inference, especially when coupled with new chain-of-thought reasoning models like DeepSeek R1 and OpenAI’s o1, will demand roughly 100 times more computing resources than current one-shot inference methods. Therefore, there’s minimal concern that more efficient language models will diminish the need for computing infrastructure. We remain in the nascent stages of developing AI factory infrastructure.
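To put that 100x claim in perspective, here is a rough back-of-envelope sketch in Python. The token counts and model size are illustrative assumptions, not figures from Nvidia or the model vendors; the point is simply that a reasoning model emitting tens of thousands of intermediate “thinking” tokens consumes on the order of 100 times the compute of a direct, few-hundred-token answer.

```python
# Back-of-envelope comparison of one-shot vs. chain-of-thought inference cost.
# All numbers are illustrative assumptions, not vendor figures.

ONE_SHOT_TOKENS = 300        # assumed length of a direct, single-pass answer
REASONING_TOKENS = 30_000    # assumed intermediate "thinking" tokens
FLOPS_PER_TOKEN = 2 * 70e9   # rough rule of thumb: ~2 x parameter count in
                             # FLOPs per generated token, for a hypothetical
                             # 70B-parameter model

def inference_flops(tokens: int) -> float:
    """Approximate total FLOPs to generate the given number of tokens."""
    return tokens * FLOPS_PER_TOKEN

one_shot = inference_flops(ONE_SHOT_TOKENS)
reasoning = inference_flops(REASONING_TOKENS + ONE_SHOT_TOKENS)

print(f"One-shot:  {one_shot:.2e} FLOPs")
print(f"Reasoning: {reasoning:.2e} FLOPs")
print(f"Ratio:     {reasoning / one_shot:.0f}x")  # ~100x with these assumptions
```

Under these assumptions the ratio works out to roughly 100x, which is why Huang argues that more efficient models will not shrink the demand for compute infrastructure.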
A key yet underappreciated announcement from Huang was a new software tool called Nvidia Dynamo, designed to optimize the inference process for advanced models. Dynamo, an upgraded version of Nvidia’s Triton Inference Server software, dynamically allocates GPU resources for different inference tasks, such as prefill and decode, each of which has its own distinct computing needs. It also establishes dynamic data caches, managing data efficiently across different memory types.
Employing a management approach similar to the way Docker manages containers in cloud computing, Dynamo intelligently oversees the resources and data essential for token generation within AI factory environments. Nvidia has referred to Dynamo as the “Operating System of AI factories.” In practical terms, Dynamo lets organizations handle up to 30 times more inference requests with the same hardware resources.
Naturally, it wouldn’t be GTC without updates on chips and hardware, and this year was no exception. Huang outlined a roadmap for upcoming GPUs, introducing an upgrade to the existing Blackwell series known as Blackwell Ultra (the GB300 series), which features more onboard high-bandwidth memory (HBM) for improved performance.
He also revealed the new Vera Rubin architecture, showcasing a new Arm-based CPU named Vera and a next-generation GPU called Rubin, both of which include significantly more cores and improved capabilities. Huang even hinted at a subsequent generation, named after physicist Richard Feynman, projecting Nvidia’s roadmap into 2028 and beyond.
During the following Q&A session, Huang clarified that disclosing future products far in advance is essential for ecosystem partners, as it allows them to adequately prepare for forthcoming technological advancements.
Huang emphasized several partnerships announced at this year’s GTC, noting the strong presence of other technology vendors eager to participate in this expanding ecosystem. He observed that fully realizing AI infrastructure requires advances across all layers of the traditional computing stack, including networking and storage.
In this regard, Nvidia introduced new silicon photonics technology for optical networking between GPU-accelerated server racks and discussed a collaboration with Cisco. The partnership will integrate Cisco silicon into routers and switches designed to bring GPU-accelerated AI factories into enterprise environments, along with a shared software management layer.
For storage solutions, Nvidia has teamed up with leading hardware providers and data platform companies, ensuring compatibility with GPU acceleration, thereby broadening Nvidia’s market presence.
Lastly, expanding on its diversification strategy, Huang introduced additional endeavors in autonomous vehicles (notably a deal with GM) and robotics, both of which he described as components of the next significant stage in AI evolution: physical AI.
Nvidia has supplied components to automakers for many years and has offered robotics platforms for some time as well. What’s different now is their integration with Nvidia’s AI infrastructure, which can enable better training of the models deployed in those devices and provide the real-time inferencing they need to operate in real-world scenarios. Although this tie-in with infrastructure is arguably a modest advancement, in the larger context of the company’s overall AI infrastructure vision it aligns Nvidia’s numerous initiatives into a cohesive strategy.
Understanding the myriad elements unveiled by Huang and Nvidia at this year’s GTC is no simple feat, especially given the overwhelming flow of announcements and the expansive reach of the company’s aspirations. Once the various pieces fit together, however, Nvidia’s strategic vision emerges clearly: the company is preparing to assume a significantly larger role than in the past and is well-positioned to realize its ambitious goals.
Ultimately, Nvidia understands that serving as an infrastructure and ecosystem provider enables it to gain advantages both directly and indirectly as the overall AI computing tide rises, even amidst inevitable increases in competition. This strategic approach has the potential to foster even greater growth moving forward.
Bob O’Donnell is the founder and chief analyst of TECHnalysis Research, LLC, a technology consulting firm that provides strategic consulting and market research services to the technology industry and professional financial community. Follow him on Twitter @bobodtech.