Anthropic CEO Dario Amodei revisits a long-form interview about AI scaling and timelines, noting that a prediction he made 3 years ago, that an hour of AI conversation would become hard to distinguish from talking with a well-educated person, has largely held up. He now assigns a 50% chance that a “genius country in a data center” emerges within 1-3 years and says he is 90% confident it arrives within 10 years, while also stressing that “we are obviously not there yet.” The discussion is framed by unusually large business numbers: Anthropic’s revenue is described as growing roughly 10x per year for 3 consecutive years, and the company reportedly closed a $30B funding round on 2026-02-12 at a post-money valuation of about $380B. That framing sets up the questions of what scaling is buying, how much of the productivity gain is real, and whether economics and regulation can keep up.
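As a quick consistency check, the "roughly 10x per year" claim can be verified against the revenue milestones quoted later in the interview ($100M in 2023 to about $10B in 2025). The figures below are the interview's round numbers, not audited financials:

```python
# Sketch: implied annual growth multiple from two revenue milestones.
# Figures are the interview's round numbers, not audited financials.

def implied_growth(start: float, end: float, years: int) -> float:
    """Average annual growth multiple implied by start/end revenue."""
    return (end / start) ** (1 / years)

# 2023: ~$100M  ->  2025: ~$10B  (two year-over-year steps)
multiple = implied_growth(100e6, 10e9, 2)
print(f"implied annual multiple: {multiple:.1f}x")  # ~10x, matching the claim
```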
On the technical side, Amodei argues that scaling reinforcement learning is not fundamentally different from scaling pretraining: both show approximately log-linear gains with more compute, consistent with a 2017 internal “Big Blob of Compute” view that emphasizes 7 ingredients (compute, data quantity, data quality and breadth, training time, scalable objectives for pretraining and RL, and numerical-stability/normalization methods).

For AI coding, he estimates the current end-to-end productivity lift at about 15%-20%, up from roughly 5% about 6 months earlier, and expects further acceleration, though constrained by Amdahl’s law as bottlenecks shift. He contrasts this with a METR randomized trial (2025-07) in which 16 experienced open-source developers completed 246 real tasks and the AI-tool group was 19% slower despite feeling 20% faster.

On economics, he sketches a simple industry model in which about 50% of compute goes to training and 50% to inference, inference gross margins exceed 50%, and losses stem mainly from demand-forecasting errors: overbuying compute forces the excess into non-revenue training, while underbuying yields high inference revenue but squeezes training. He argues diffusion is “extremely fast but not instantaneous,” citing revenue steps of $0 to $100M in 2023, $100M to $1B in 2024, $1B to about $9B-$10B in 2025, and several billion more added by 2026-01. Even with a “genius country,” he suggests macro diffusion could look like 3x-5x per year, or at most about 10x per year at very large scales.
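The Amdahl's-law point can be made concrete: if AI accelerates only a fraction of a developer's work, the end-to-end speedup is capped no matter how fast the accelerated part gets. The values of `p` and `s` below are illustrative assumptions, not figures from the interview:

```python
# Sketch: Amdahl's law as the cap on end-to-end coding productivity.
# p = fraction of the job the AI tool can accelerate (assumption)
# s = acceleration factor on that fraction (assumption)

def amdahl_speedup(p: float, s: float) -> float:
    """Overall speedup when fraction p of the work is sped up by factor s."""
    return 1.0 / ((1.0 - p) + p / s)

# Even with an arbitrarily fast tool, speedup is capped at 1 / (1 - p):
p = 0.5  # assume half the job (e.g., writing code) is accelerable
for s in (2, 10, 1000):
    print(f"s={s:>4}: end-to-end speedup {amdahl_speedup(p, s):.2f}x")
```

Under these assumptions the cap is 2x, which illustrates why end-to-end gains can climb from ~5% to ~15%-20% and keep improving while staying far below the raw per-task acceleration, as the unaccelerated parts of the job become the bottleneck.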
The interview highlights a central strategic tradeoff: even if breakthroughs arrive soon, buying compute still requires committing 1-2 years ahead, and being off by even 1-2 years can be fatal at trillion-dollar scale. He gives a stylized example: forecasting $1T of revenue but buying $5T of compute leaves no hedge if reality turns out to be $800B. This is contrasted with competitors’ infrastructure bets described in trillion-scale terms (e.g., a $500B “Stargate” plan, around 7 GW of planned capacity, and cited commitments totaling about $1.4T), alongside financial-risk claims such as a projected $14B loss in 2026, operating losses reaching about 75% of revenue by 2028, and cumulative cash burn of about $115B by 2029. By comparison, Anthropic is described as targeting cash burn declining to about 1/3 of revenue in 2026 and about 9% in 2027, with breakeven around 2028.

Policy implications include opposing a 10-year freeze on state AI regulation in the absence of a federal plan; criticizing a Tennessee proposal, introduced 2025-12-18, that would make certain “emotional support” training a Class A felony punishable by 15-25 years; and advocating restrictions on selling chips and data centers to China while still allowing medical treatments. He also warns about unequal access and about bottlenecks such as FDA throughput.

Caveats center on uncertainty and measurement: internal claims of “unambiguous” gains sit against external results like the -19% speed finding; “on-the-job learning” may be displaced by large-context use (e.g., 1,000,000 tokens, approximated as 1,000,000 words, or days to weeks of reading); and he acknowledges that even confident timelines do not translate cleanly into all-in behavior, because timing error, not direction, dominates survival risk.
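The stylized overbuy example can be put in arithmetic form. The sketch below uses the interview's round figures ($1T forecast, $5T committed, $800B realized); the "unfunded commitment" framing is an illustrative reading, not the interview's own accounting:

```python
# Sketch of the stylized timing-risk example: commit compute against a
# revenue forecast, then see what a shortfall leaves unfunded.
# Figures are the interview's round numbers; framing is illustrative.

def commitment_gap(forecast: float, committed: float, actual: float) -> float:
    """Compute commitment that realized revenue cannot support, assuming
    the same compute-per-dollar-of-revenue ratio as the original plan."""
    multiple = committed / forecast   # compute bought per $ of forecast revenue
    supportable = actual * multiple   # commitment the actual revenue justifies
    return committed - supportable    # unfunded remainder

gap = commitment_gap(forecast=1e12, committed=5e12, actual=0.8e12)
print(f"unfunded compute commitment: ${gap / 1e12:.0f}T")
```

At 5x leverage on the forecast, a 20% revenue miss leaves a $1T unfunded commitment, which is the sense in which there is "no hedge": the error is amplified by however aggressively compute was bought ahead of demand.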