Alphabet's Google Cloud unveiled its latest TPU generation at Google Cloud Next, in two versions: TPU 8t for training and TPU 8i for inference. Alongside the chips it announced a $750 million fund and new AI-agent development tools. Alphabet shares were up 1.7% in New York premarket trading. The launch reinforces Google's in-house AI chip strategy, positioning the company more strongly against an Nvidia-led AI-compute landscape.

TPU 8t is intended for building AI software, while TPU 8i is for serving models after they are deployed. Mark Lohmeyer said the key is delivering the lowest possible latency at the lowest possible cost per transaction; with transaction volume set to keep rising, per-transaction cost has to keep falling. The design places more data directly on the chip, cutting cross-component memory fetches, which particularly benefits multi-step reasoning. TPU 8t can be assembled into systems of 9,600 chips, a scale at which data-center power has become the main bottleneck. Performance per watt improves by 124% for TPU 8t and 117% for TPU 8i over the prior generation, and optimized internal networking raises chip-to-chip communication efficiency. Google said the related AI systems are expected to be generally available later this year.
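Lohmeyer's framing (fixed power, rising transaction volume, falling per-transaction cost) can be made concrete with a rough energy model. The sketch below is a minimal illustration, not anything Google has published: the 124% and 117% performance-per-watt gains are from the announcement, while the power budget, baseline throughput, and electricity price are placeholder assumptions.

```python
# Rough model: at a fixed data-center power budget, performance per watt
# directly sets both total throughput and the energy cost per transaction.
# Only the +124% / +117% gains come from the announcement; every other
# number below is an illustrative assumption.

POWER_BUDGET_KW = 10_000        # assumed fixed facility power budget (kW)
BASELINE_TPS_PER_KW = 1_000.0   # assumed prior-gen transactions/s per kW
PRICE_PER_KWH = 0.08            # assumed electricity price (USD per kWh)

def energy_cost_per_million_tx(tps_per_kw: float) -> float:
    """USD of electricity consumed per million transactions."""
    kwh_per_tx = 1.0 / (tps_per_kw * 3600.0)  # one kW for one hour = 1 kWh
    return kwh_per_tx * PRICE_PER_KWH * 1e6

for name, gain in [("prior gen", 0.00),
                   ("TPU 8t (+124%)", 1.24),
                   ("TPU 8i (+117%)", 1.17)]:
    tps_per_kw = BASELINE_TPS_PER_KW * (1.0 + gain)
    total_tps = tps_per_kw * POWER_BUDGET_KW
    print(f"{name:15s}  {total_tps:>12,.0f} tx/s  "
          f"${energy_cost_per_million_tx(tps_per_kw):.4f} per 1M tx")
```

Under this model, a 124% performance-per-watt gain at the same power envelope multiplies deliverable throughput by 2.24 and divides the energy component of per-transaction cost by the same factor, which is the mechanism behind the claim that per-transaction cost must fall as volume grows.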

On the competitive front, Google will keep offering Nvidia-based services to existing customers and says it will be among the first to deploy the new Nvidia design arriving in the second half of the year. Nvidia is also pushing into dedicated inference chips: Jensen Huang said more than 20% of AI workloads may be better suited to such specialized silicon, and last December the company reportedly paid $20 billion to license Groq's technology and absorb its core team. Google also launched enterprise tools for creating and tracking AI agents, backed by the $750 million fund, which will be deployed over the next 12 months to help consulting firms train engineers and build agents. DeepMind will give selected companies early access to some of its models, along with hands-on feedback mechanisms, through Gemini Enterprise.

2026-04-23 (Thursday)