Google is restructuring the team behind Project Mariner as industry attention shifts away from standalone browser agents toward coding and terminal-based agents. Usage numbers are part of the signal: Perplexity’s Comet reached only 2.8 million weekly active users in December 2025, while OpenAI’s ChatGPT Agent has reportedly fallen below 1 million weekly active users in recent months. Against ChatGPT’s weekly audience in the hundreds of millions, browser-agent usage is marginal, pointing to a wide gap between industry hype and consumer adoption.

The performance trend also favors terminal-first agents. Browser agents rely on repeated screenshots and visual interpretation, which raises compute cost, slows execution, and reduces reliability. By contrast, terminal-based systems such as Claude Code and OpenClaw operate on text, and one industry estimate cited in the article suggests they need roughly 10x to 100x fewer steps to reach the same result. This efficiency gap helps explain why Google is folding Mariner capabilities into broader agent products rather than treating browser navigation as a primary product category.

The market is not moving to a pure terminal model, however. One startup claims its video-based computer-use model is 50x more efficient than earlier screenshot-based approaches, suggesting GUI automation is still improving. Another estimate in the article proposes an 80/20 split: many tasks can be solved through the terminal, but a persistent minority still requires GUI access, especially for legacy software and websites without APIs. The broader pattern is therefore strategic reallocation rather than full abandonment: AI labs are prioritizing coding agents while preserving computer-use capabilities as a secondary but necessary layer.

 

2026-03-20 (Friday) · adce8a91ab20e107d2c3f3e5010387c40258d3af