
From GTC to the $3 Trillion Club: NVIDIA's Ascent

This article looks back at NVIDIA's GTC conferences from 2016 to 2024, tracing the company's continuous innovation in artificial intelligence alongside the sharp rise in its market valuation. In 2016, NVIDIA introduced the DGX-1 and the P100 GPU with NVLink support, opening a new era in AI. In 2017, the V100 GPU with Tensor Cores further solidified its leadership. In 2020, with the launch of the A100 and Megatron, NVIDIA turned to optimizing for large language models (LLMs). In 2021, it announced the Arm-based Grace CPU, laying the groundwork for future data center solutions. In 2022, it launched the H100 GPU optimized for LLMs, along with a fully upgraded Omniverse and Digital Twin technology. These innovations not only drove a substantial increase in NVIDIA's stock price and market value but also made the company one of the creators of the AI wave.



A New Innovation and a Deeper Moat Every Year: How NVIDIA Joined the $3 Trillion Club Step by Step

I. Overview of the GTC Conference

A. Basic Information

  GTC, short for GPU Technology Conference, is hosted by NVIDIA. The inaugural event took place in San Jose, California, in 2009. The conference is held once or twice a year, typically in March and occasionally also in the fall. Besides the United States, it has been hosted in Europe, China, Japan, and other locations. The keynote generally lasts about two hours and is packed with technical content; it is presented primarily by Jen-Hsun Huang, with occasional guests.

B. Significance

  GTC conveys the latest developments in NVIDIA's hardware, software, and partnerships, which makes it essential for understanding the company's trajectory. Many investment institutions misread NVIDIA's development because they have not studied GTC in depth. The author has followed GTC since 2016, and this article sorts through the AI-related highlights from 2016 to 2024 to trace NVIDIA's legendary journey.

II. Key Content of Past GTC Conferences

A. [GTC 2016]

  Key Product Launches

  P100 with NVLink support: Built on CUDA cores, it supports mixed-precision computing and 3D-stacked memory, with FP16 performance of 21.2 TFLOPS. It is the first training GPU to carry the "100" suffix, and it marks the entry of NVIDIA GPUs into the multi-node, multi-GPU era.

  DGX-1: Equipped with 8 Tesla P100 GPUs, it reaches 170 TFLOPS. The initial partner was SAP, and a unit was later delivered to OpenAI, sowing the seeds of the AI revolution.
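
  One common reading of "mixed precision" is to store values in FP16 while keeping the running total in a wider format such as FP32. The sketch below is only an illustration of why that matters, written with NumPy on the CPU; the helper name `accumulate` and the test values are assumptions made for this example, not anything NVIDIA published.

```python
import numpy as np

def accumulate(values, acc_dtype):
    """Sum values one at a time, rounding the running total to acc_dtype."""
    acc = acc_dtype(0)
    for v in values:
        acc = acc_dtype(acc + acc_dtype(v))
    return float(acc)

# 10,000 small values stored in FP16; their true sum is close to 10.
values = np.full(10_000, 0.001, dtype=np.float16)

fp16_total = accumulate(values, np.float16)   # FP16 storage, FP16 accumulator
mixed_total = accumulate(values, np.float32)  # FP16 storage, FP32 accumulator

print(f"FP16 accumulator: {fp16_total:.3f}")
print(f"FP32 accumulator: {mixed_total:.3f}")
```

  With a pure FP16 accumulator the total stalls well short of 10, because each new addend eventually falls below half the spacing between representable FP16 values. Keeping only the accumulator in FP32 recovers the expected sum while the data itself stays in the cheaper format.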

  Conference Significance

  That year, AlphaGo defeated Lee Sedol, and NVIDIA officially positioned GTC as a conference on deep learning and artificial intelligence, shifting its focus entirely to AI.

B. [GTC 2017]

  Key Product Launches

  V100 with Tensor Cores: The core design is optimized for tensor operations while maintaining CUDA compatibility. Its headline deep learning performance reaches 120 TFLOPS, nearly six times that of the P100, and NVLink bandwidth reaches 300 GB/s, pioneering a new era for AI.

  DGX-1V: Built on 8 V100s, it raises system performance from 170 TFLOPS to 960 TFLOPS.

  V100 with SXM interface: Uses the higher-bandwidth NVLink 2.0 interconnect, which is better suited to server deployments, while a PCIe version is retained to meet other needs.
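
  The system-level figures quoted for the two DGX-1 generations follow from the per-GPU numbers by simple multiplication. The short check below is a sketch that assumes perfect scaling across the 8 GPUs and uses only the peak figures cited in this article; the function name `system_peak` is made up for illustration.

```python
# Per-GPU peak figures quoted in this article (TFLOPS).
P100_FP16_TFLOPS = 21.2
V100_TENSOR_TFLOPS = 120.0
GPUS_PER_DGX1 = 8

def system_peak(per_gpu_tflops, gpu_count=GPUS_PER_DGX1):
    """Aggregate peak throughput, assuming perfect scaling across GPUs."""
    return per_gpu_tflops * gpu_count

dgx1_peak = system_peak(P100_FP16_TFLOPS)      # P100-based DGX-1
dgx1v_peak = system_peak(V100_TENSOR_TFLOPS)   # V100-based DGX-1V
speedup = V100_TENSOR_TFLOPS / P100_FP16_TFLOPS

print(f"DGX-1  peak: {dgx1_peak:.1f} TFLOPS")
print(f"DGX-1V peak: {dgx1v_peak:.1f} TFLOPS")
print(f"V100 vs. P100 per GPU: {speedup:.2f}x")
```

  These are peak numbers measured under different precision modes, so the ratio describes marketing headline performance rather than a guaranteed speedup on any given workload.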

  Conference Features

  Introduced the "I am AI" opening video. Its music was composed by AIVA and has been performed by different orchestras over the years, and its content is updated as the technology advances.

  Emphasized the slogan "The More You Buy, The More You Save," a display of commercial salesmanship.

C. [GTC 2018]

  Key Product Launches

  T4 Inference Card: Building on the innovations of the V100 and the Turing architecture, it supports mixed-precision computing and carries 16 GB of GDDR6 memory at a power draw of 70 W. It is an excellent inference card that has helped bring deep learning models into online services at scale.

  TensorRT: A high-performance, general-purpose inference engine that optimizes models in a variety of ways and accepts models from the mainstream training frameworks, which suits users who do not want to be tied to a specific framework.

  Triton: Splits the serving functions out of TensorRT and focuses on managing and deploying inference services. Together with TensorRT, it forms an open-source distributed inference solution that is friendly to small and medium-sized enterprises.

  GPUs on K8s: Kubernetes (K8s) works well with GPUs, providing virtualization and user isolation for NVIDIA GPUs and enabling GPU Cloud to offer more flexible services.

  Conference Significance

  The focus of the conference shifted from training to inference, with both hardware and software optimized for inference.

D. [GTC 2019]

  Key Events and Product Launches

  Acquisition of Mellanox: NVIDIA spent $7 billion to acquire Mellanox, the leader in InfiniBand, rounding out the NVLink interconnect ecosystem. The deal is crucial for building large-scale GPU data centers and for the development of LLMs.

  Autonomous driving demonstrations: Although NVIDIA had promoted autonomous driving technology since 2017, its in-vehicle solutions showed no clear advantage in 2019. The stage time devoted to the topic shrank at subsequent GTC conferences, and NVIDIA began investing in robotics startups.

  General Situation of the Conference

  AI had not yet achieved its breakthrough, and no new server GPUs were launched. The launches focused on gaming-side RTX graphics cards and ray tracing, and the conference was somewhat dull.

E. [GTC 2020]

  Key Product Launches

  A100 for LLMs: The architecture keeps Tensor Cores at its center and improves on them in many respects, including upgraded memory, enhanced Tensor Cores, and higher bandwidth, for a significant increase in performance. A single card can be physically partitioned into as many as seven instances, which makes it cloud-service friendly.

  Megatron: By combining data parallelism, model parallelism, and other techniques in a modified PyTorch, NVIDIA successfully trained an 8.3B-parameter model on 512 GPUs. The focus later shifted to distributed multi-node inference.
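
  The idea of combining data parallelism with model parallelism can be illustrated on a single linear layer. The following is a minimal NumPy sketch of the forward pass only, not Megatron's actual code: the parallel degrees, tensor shapes, and variable names are assumptions chosen for illustration, and plain concatenation stands in for the collective communication between GPUs.

```python
import numpy as np

rng = np.random.default_rng(0)

TENSOR_PARALLEL = 2  # hypothetical: ways each weight matrix is split
DATA_PARALLEL = 2    # hypothetical: replicas, each seeing a slice of the batch

x = rng.standard_normal((8, 16))   # global batch: 8 samples, 16 features
w = rng.standard_normal((16, 32))  # weights of one linear layer

# Reference: the whole layer computed on a single device.
reference = x @ w

# Model parallelism: each "device" holds one column slice of the weights.
w_shards = np.split(w, TENSOR_PARALLEL, axis=1)
# Data parallelism: each replica processes one slice of the batch.
x_shards = np.split(x, DATA_PARALLEL, axis=0)

rows = []
for x_part in x_shards:
    # Every model-parallel device computes its slice of the output columns.
    partial_outputs = [x_part @ w_part for w_part in w_shards]
    # Concatenating the slices plays the role of an all-gather.
    rows.append(np.concatenate(partial_outputs, axis=1))
combined = np.concatenate(rows, axis=0)

print("simulated devices:", TENSOR_PARALLEL * DATA_PARALLEL)
print("matches single-device result:", np.allclose(combined, reference))
```

  The number of devices is the product of the two parallel degrees, which is how a model too large for one GPU can still be trained across hundreds of them. Real training also needs gradient synchronization across data-parallel replicas, which this sketch leaves out.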

  Conference Features

  Held online because of the pandemic and hosted from Jen-Hsun Huang's kitchen, which gave rise to kitchen memes.

F. [GTC 2021]

  Key Product Launches and Events

  Grace CPU: Built on the ARM architecture, it integrates the CPU with the GPU through NVLink chip-to-chip interconnect technology to improve overall performance. The attempt to acquire ARM, however, failed because of antitrust investigations.

  Virtual Digital Person Jen-Hsun Huang: For 15 seconds of the keynote, Huang was replaced by a virtual digital double, demonstrating what Omniverse can do.

  General Situation of the Conference

  Held online, with no new GPU launched. On the AI front the year was dull, but the groundwork for the Grace CPU was laid.

G. [GTC 2022]

  Key Product Launches

  H100 with Transformer Engine: In addition to the Transformer Engine, it carries 80 GB of HBM3 memory, delivers FP16 performance of up to 2000 TFLOPS, and introduces NVLink 4.0, making it the best solution for LLMs. Its application scenarios were still immature in 2022, however.

  Fully upgraded Omniverse and Digital Twin: Omniverse evolved from a 3D design collaboration platform into a virtual world simulation platform and introduced the concept of the Digital Twin.

  General Situation of the Conference

  AI was in a period of uncertainty, the overall tone of the conference was dark, and there were fewer smiles.

H. [GTC 2023]

  Key Product Launches and Events

  GH200 on the Grace Hopper architecture: An H100 GPU and a Grace CPU are placed on the same board, forming a stronger combination and showing NVIDIA's ability to plan its technology years ahead.

  Chat with Ilya Sutskever: A fireside chat between Jen-Hsun Huang and the Chief Scientist of OpenAI about the technology behind ChatGPT. It revealed few technical details but attracted a great deal of attention.

  General Situation of the Conference

  The haze of COVID-19 had lifted and ChatGPT had succeeded, so the conference was bright in tone. This was also the last year GTC was held online.

I. [GTC 2024]

  Key Product Launches and Events

  B200 and GB200: Both deliver powerful performance. GB200 extends the advantages of the Grace-based design by pairing one Grace CPU with two B200 GPUs ("one drives two") to reach 40 PFLOPS.

  NIMs: A complete set of model inference microservices built from technologies such as Triton, TensorRT, and GPUs on K8s, simplifying and accelerating the deployment and inference of generative AI models.

  Robotics & Isaac: A showcase of robotics technology. The Isaac platform provides development tools and simulation environments, the Jetson platform handles on-device computing, and in combination with Omniverse, robots can be trained in a virtual environment. The technology is not yet mature enough, however.