Nvidia is portrayed as the dominant AI-chip supplier for both training and inference, a position tied to its GPUs and a market capitalization of about $4.5 trillion, but founders and investors are increasingly exploring alternatives as hedges, experiments, or simply to secure supply in a tight market. Jump Trading’s CTO Alex Davies says the expectation is shifting away from a single-winner dynamic, highlighted by Jump co-leading a $230 million funding round into inference-focused startup Positron and also becoming a customer. The opportunity investors see is concentrated in inference, the high-volume, user-facing phase of running trained models, where latency and throughput directly shape product experience and cost.
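To make concrete why throughput directly shapes serving cost, here is a back-of-the-envelope sketch; all figures (hourly accelerator cost, tokens per second) are hypothetical illustrations, not numbers from the article:

```python
# Back-of-the-envelope inference serving-cost model.
# All inputs are illustrative assumptions, not figures from the article.

def cost_per_million_tokens(gpu_hourly_cost: float, tokens_per_second: float) -> float:
    """Serving cost per 1M generated tokens for one accelerator."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hourly_cost / tokens_per_hour * 1_000_000

# At the same hourly cost, doubling throughput halves cost per token --
# which is why inference-specialized chips compete on tokens/second.
baseline = cost_per_million_tokens(gpu_hourly_cost=4.0, tokens_per_second=100)
faster = cost_per_million_tokens(gpu_hourly_cost=4.0, tokens_per_second=200)
print(f"baseline: ${baseline:.2f}/M tokens, 2x throughput: ${faster:.2f}/M tokens")
```

The same arithmetic explains why buyers of high-volume, user-facing inference capacity are sensitive to per-chip throughput in a way that training buyers, who amortize cost over a one-time run, are not.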
The piece frames recent deal flow as evidence that inference-specialized hardware is gaining credibility: Jensen Huang reportedly orchestrated an acquihire of Groq’s chip team alongside a reported $20 billion payout to license Groq’s inference technology, while Cerebras signed a $10 billion deal to provide fast inference chips to OpenAI. Anthropic is described as signing agreements to use non-Nvidia chips, including Amazon’s Trainium and Google’s TPUs, and Microsoft released a second version of its Maia chip and retains access to OpenAI chip intellectual property. Startup fundraising is cited as accelerating with D-Matrix raising $275 million in November and Etched raising about $500 million last month, while SambaNova reportedly stopped sale talks at a much lower valuation than its prior round in favor of new cash.
A key technical driver is that “reasoning” models are increasing inference-time computation, making the boundary between training and inference less clear and increasing demand for faster inference; interest reportedly rose after the debut of Chinese open-source reasoning model DeepSeek early last year. Even so, Nvidia is described as unusually formidable, with multiple product lines and a commitment to a complete chip redesign once a year, and the Groq deal could further extend its reach; Huang also would not rule out a specialized inference chip, and D-Matrix’s CEO expects a response at Nvidia’s flagship conference in March. The caveat is scale: despite big firms pursuing in-house silicon (Amazon, Google, Microsoft, OpenAI), the article argues they still rely on massive numbers of Nvidia GPUs today, so the open question is whether these “cracks” become large enough to support a sizeable, durable market for competing inference chips.
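The claim that "reasoning" models raise inference-time computation can be illustrated with simple token arithmetic; the token counts and per-token cost below are hypothetical, chosen only to show the scaling, not measurements from the article:

```python
# Illustrative only: why reasoning models inflate inference-time compute.
# A reasoning model emits chain-of-thought tokens before its final answer,
# and autoregressive decoding cost scales with total generated tokens.
answer_tokens = 200        # hypothetical length of the visible answer
reasoning_tokens = 2000    # hypothetical hidden chain-of-thought length
flops_per_token = 2e12     # hypothetical forward-pass cost per token

standard = answer_tokens * flops_per_token
reasoning = (answer_tokens + reasoning_tokens) * flops_per_token
print(f"compute multiplier per query: {reasoning / standard:.0f}x")
```

Under these assumptions a single reasoning query costs roughly an order of magnitude more compute than a plain completion, which is the mechanism behind the article's point that reasoning models blur the training/inference boundary and raise demand for faster inference hardware.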