Anthropic's non-public Mythos model has already solved about one third of the hardest biological data-computation tasks selected by biology experts, including one that required reverse-engineering a cell type from raw DNA data, where it outperformed the humans tested. This signals a shift from "paper-style" competence toward AI that could give its users superhuman capabilities in biology. The risk scenarios now range from virus synthesis and novel neurotoxins to extreme uses such as "mirror life," with the potential for society-wide, even extinction-scale, catastrophe.

The article argues that biosecurity risk now exceeds cybersecurity risk: a single engineered pathogen could kill billions of people, leaving no room for trial and error, and unlike software, which can be patched quickly, biological systems are fragile and hard to correct. Publicly available models still mainly show exam-style ability, but Anthropic warns that they may soon be able to guide laboratory novices. In one jailbreak-resistance test, 90% of novice participants still extracted virology information from models that should have refused to answer, which shows how fragile current defenses are; analogous content-removal efforts have also failed.

The article groups the available controls into three categories: refusing dangerous requests, removing sensitive training data, and monitoring dual-use supply chains. It judges all three insufficient, because a capable model may reconstruct the missing knowledge from first principles, and because pathogen production can rely on offline tools and off-the-shelf technology, which makes monitoring in the physical world very difficult. It therefore calls for foundational advances in AI science, such as post-training interventions, biasing models toward erroneous answers, and locating and disabling the neurons associated with synthetic biology. Until such advances arrive, access to high-risk models should be strictly limited, especially for open-source models, which cannot be recalled once released, while legitimate research (for example, DeepMind's Isomorphic Labs) continues under stricter security protocols.

Source: The world must stop AI from empowering bioterrorists

Subtitle: The threat from new pathogens is an even graver danger than AI-backed hackers

Dateline: May 07, 2026 06:28 AM