NVIDIA's $26 Billion Open-Source AI Push: Nemotron 3 Super Challenges Chinese Models

MarketWhisper

NVIDIA releases Nemotron 3 Super

NVIDIA officially launched Nemotron 3 Super on Thursday: an open-weight AI model with 120 billion total parameters, optimized specifically for autonomous AI agents and ultra-long-context tasks. Alongside the release, NVIDIA announced a strategic plan to invest $26 billion over five years in open-source AI model development, a direct response to the rapid global rise of Chinese open-source models.

Nemotron 3 Super's Technical Architecture: Three Rarely Combined Components

The core design of Nemotron 3 Super addresses a fundamental challenge in multi-agent systems: every tool call, reasoning step, and context fragment forces large amounts of data to be re-transmitted from scratch, driving up costs and inducing model drift. NVIDIA's answer combines three components that are rarely seen together in a single architecture:

  • Mamba-2 state space layers serve as an alternative to attention, offering faster processing and higher memory efficiency on long token streams;
  • Transformer attention layers are retained to ensure precise information recall;
  • the new "Latent Mixture of Experts" (Latent MoE) design compresses tokens before routing, enabling the model to activate four times as many expert modules at the same computational cost (a minimal sketch follows below).
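
To make the Latent MoE idea concrete, here is a minimal, illustrative sketch in PyTorch: tokens are compressed into a smaller latent space, the router picks experts on that cheap latent view, and the result is decompressed back into the model dimension. All dimensions, names, and routing details below are assumptions for illustration, not NVIDIA's actual implementation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class LatentMoE(nn.Module):
        # Toy latent-routing MoE: route and compute experts in a compressed
        # space so more experts fit into the same per-token compute budget.
        def __init__(self, d_model=1024, d_latent=256, n_experts=16, top_k=4):
            super().__init__()
            self.compress = nn.Linear(d_model, d_latent)    # token -> latent
            self.router = nn.Linear(d_latent, n_experts)    # route on latent view
            self.experts = nn.ModuleList(
                nn.Linear(d_latent, d_latent) for _ in range(n_experts)
            )
            self.decompress = nn.Linear(d_latent, d_model)  # latent -> token
            self.top_k = top_k

        def forward(self, x):                       # x: (batch, seq, d_model)
            z = self.compress(x)
            gate = F.softmax(self.router(z), dim=-1)
            weights, idx = gate.topk(self.top_k, dim=-1)
            out = torch.zeros_like(z)
            for e, expert in enumerate(self.experts):
                ez = expert(z)                      # naive; a real kernel batches this
                for k in range(self.top_k):
                    mask = (idx[..., k] == e).unsqueeze(-1)
                    out = out + mask * weights[..., k:k + 1] * ez
            return x + self.decompress(out)         # residual connection

    x = torch.randn(2, 8, 1024)
    print(LatentMoE()(x).shape)                     # torch.Size([2, 8, 1024])

Because expert computation happens at d_latent rather than d_model, more experts can be active within the same FLOP budget, which is the trade-off the Latent MoE description above points at.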

The model is trained natively in NVIDIA's proprietary NVFP4 format, learning at 4-bit precision from the very first gradient update and thereby avoiding the accuracy loss that comes from training at high precision and compressing afterward. The context window reaches 1 million tokens, large enough to hold entire codebases or roughly 750,000 English words.
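
As a rough illustration of why native 4-bit weights matter at this scale, the back-of-envelope arithmetic below compares weight-only memory footprints. It ignores the KV cache, activations, and the per-block scale factors a real FP4 format carries, so treat the figures as lower bounds:

    # Weight-only memory for a 120B-parameter model at various precisions.
    params = 120e9
    for fmt, bits in [("BF16", 16), ("FP8", 8), ("NVFP4", 4)]:
        gb = params * bits / 8 / 1e9
        print(f"{fmt:6s} ~{gb:.0f} GB of weights")
    # BF16   ~240 GB
    # FP8    ~120 GB
    # NVFP4   ~60 GB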

Performance Benchmarks and Enterprise Applications

Below are key comparative data points for Nemotron 3 Super’s inference throughput:

  • Compared to OpenAI GPT-OSS 120B: 2.2 times faster
  • Compared to Alibaba Qwen3.5-122B: 7.5 times faster
  • Compared to its predecessor: Overall throughput increased by over 5 times

NVIDIA has fully disclosed the training pipeline: model weights on Hugging Face, a carefully curated 10-trillion-token pre-training dataset (out of more than 25 trillion tokens trained on in total), 40 million post-training samples, and reinforcement learning recipes covering 21 environment configurations. Perplexity, Palantir, Cadence, and Siemens have already integrated the model into their workflows.
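
With the weights on Hugging Face, a standard transformers loading flow should apply. The sketch below is a hypothetical usage example: the repository id is a guess rather than a confirmed name, and a 120B model realistically needs several GPUs or further quantization to run:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo = "nvidia/Nemotron-3-Super-120B"  # hypothetical repo id; check NVIDIA's HF org
    tok = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(
        repo,
        device_map="auto",       # shard across available GPUs
        trust_remote_code=True,  # hybrid Mamba/Transformer blocks may ship custom code
    )

    inputs = tok("Summarize this codebase:", return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=128)
    print(tok.decode(out[0], skip_special_tokens=True))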

The $26 Billion Strategic Intent: Countering the Global Rise of Chinese Open-Source Models

The release of Nemotron 3 Super is part of a broader strategic push. Bryan Catanzaro, NVIDIA's Vice President of Applied Deep Learning Research, told Wired that the company recently completed pre-training of a 550-billion-parameter model, and that the five-year, $26 billion open-source AI investment plan was announced at the same time.

The strategic urgency is clear: according to research from OpenRouter and Andreessen Horowitz, the global usage share of Chinese open-source models surged from 1.2% at the end of 2024 to about 30% by the end of 2025, and Alibaba's Qwen has overtaken Meta's Llama as the most widely used self-hosted open-source model (per Runpod data). Reports suggest that DeepSeek's next-generation model was trained entirely on Huawei chips; if true, that would give developers worldwide a strong incentive to adopt Chinese hardware, precisely the scenario NVIDIA's open-source strategy is meant to counter.

Frequently Asked Questions

Q: How does Nemotron 3 Super compare to Qwen and GPT-OSS?
On inference throughput, Nemotron 3 Super is 2.2 times faster than OpenAI's GPT-OSS 120B and 7.5 times faster than Alibaba's Qwen3.5-122B. Its core innovations are the hybrid Mamba-Transformer MoE architecture and native NVFP4 4-bit training, which let it activate more expert modules at the same computational cost and deliver more than a fivefold throughput improvement over its predecessor.

Q: Why is NVIDIA investing $26 billion now to develop open-source AI models?
The main motivations are twofold: first, to prevent China's open-source model ecosystem from forming a closed loop with Chinese chips, which would weaken NVIDIA's central position in global AI infrastructure; second, to create stronger pull toward its own hardware through open-source models optimized for NVIDIA chips. The jump in Chinese open-source models' market share from 1.2% to about 30% underscores the urgency of the move.

Q: Are the training data and model weights of Nemotron 3 Super fully public?
Yes. NVIDIA has published the complete training pipeline on Hugging Face, including the model weights, the curated 10-trillion-token pre-training dataset, 40 million post-training samples, and reinforcement learning schemes across 21 environment configurations. This level of transparency exceeds that of most comparable commercial models.
