4月7日，Anthropic 宣布其…

4月7日，Anthropic 宣布其新模型 Mythos 不向公众发布，而是进入“Project Glasswing”受控访问计划，创始成员共 12 家，包括 Apple、Google 与 Nvidia；Anthropic 认为 Mythos 在发现并利用安全漏洞方面已超越“除最熟练人士外的所有人类”，且仅需最少人工协助。该消息引发了明显担忧，几天后 OpenAI 又发布了自身受限的黑客导向模型 GPT 5.4 Cyber。

英国测试显示，Mythos 在较简单的安全任务中与其他模型仅持平，但在更高难度基准中领先——该测试需经过数十步才可接管目标机器，并且 Anthropic 称其已发现“数千个”高危或严重级别的零日漏洞。尽管如此，许多漏洞仍是普通且可理解的，AISLE 的 Fort 报道他用更小更旧的模型也复现出 FreeBSD 的同类漏洞，这支持了“前沿呈锯齿状、并非单一模型遥遥领先”的判断。

Anthropic 正将 Glasswing 扩展到另外 40 家数字基础设施机构，目标是在更强模型普及前让防守方尽可能修补漏洞，但安全竞争仍在短期内加速：Anthropic 称单个漏洞发现耗费近 2 万美元的 token。另一面是大量代码无人维护——包括 Linux 的部分代码、家用路由器、智能电视、冰箱和工业机械中的代码——加大了攻击风险，尽管质量有改善信号：1 月 OpenSSL 修复了 12 个由 AI 发现的缺陷，2025 年旧版 Claude 也修复了 Firefox 高危漏洞中近五分之一。

How AI hackers will shake up cyber-security image

On April 7, Anthropic announced that its new model Mythos would not be released to the public, but run through a controlled-access program called Project Glasswing whose 12 founding members include Apple, Google, and Nvidia, saying Mythos had surpassed all but the most skilled humans in finding and exploiting security flaws with minimal human help. The announcement raised major concern, and a few days later OpenAI released a closed, hacking-oriented model of its own, GPT 5.4 Cyber.

British testing found Mythos only on par with other models on simpler security tasks, but ahead on a harder benchmark requiring dozens of steps to take over a target machine, and Anthropic says it found “thousands” of high- or critical-severity zero-day vulnerabilities. Still, many flaws were ordinary enough that humans could find them, and Fort at AISLE reported reproducing a FreeBSD bug with smaller older models, supporting his view that the frontier is jagged rather than dominated by one model.

Anthropic is expanding Glasswing to 40 additional digital-infrastructure organisations so defenders can patch bugs before stronger models spread, yet the race is still a near-term one: Anthropic says one bug-finding run cost nearly $20,000 in tokens. Meanwhile much code in the wild has weak maintenance—such as parts of Linux, routers, smart TVs, fridges, and industrial machinery—so risk remains, even as quality indicators improve (OpenSSL fixed 12 AI-found bugs in January, and an older Claude version fixed almost one-fifth of Firefox’s 2025 high-severity fixes).

Source: How AI hackers will shake up cyber-security

Subtitle: The technology could eventually favour the defenders—but expect a bumpy ride

Dateline: 4月 16, 2026 05:30 上午

Attachments