In March 2026, the Wall Street Journal reported that US Central Command had used Anthropic’s Claude during strikes on Iran for “intelligence assessment, target identification, and battlefield scenario simulation.” That use came only hours after Donald Trump ordered federal agencies to stop using Claude; the system, however, was already so deeply embedded in Pentagon workflows that replacing it was expected to take months. Claude had also been used in the January operation that captured Nicolás Maduro. The key operational question remains unanswered: whether Claude actually flagged strike locations or estimated casualties, and there is currently no mandatory disclosure requirement.

Last November, Anthropic partnered with Palantir to place Claude inside US government intelligence and defense decision-support systems. In January, Anthropic submitted a $100 million proposal to the Pentagon centered on translating commander intent, given by voice, into coordinated drone-swarm instructions. Although the bid was rejected, the scope of requirements points to functions well beyond summarization, including “target-related situational awareness sharing” and control of potentially lethal swarms from launch to termination. These developments are unfolding in a regulatory vacuum, while hallucination remains a structural, persistent reliability problem for large language models.

A prior warning comes from the Lavender system used in Gaza: it scored individuals from 1 to 100, and anyone whose score crossed a threshold was flagged as a target. Investigative reporting cited an error rate of about 10%, implying roughly 3,600 people wrongly flagged. Scholars argue that AI’s “acceleration” effect in war scales up decisions while reducing human scrutiny, and that the visibility of military AI has steadily declined over roughly the past 15 years. Article 36 of Additional Protocol I to the Geneva Conventions requires new weapons to be reviewed before fielding, but the rule is hard to apply to AI systems that keep updating as they learn. US armed-drone transparency itself took nearly 15 years of leaks, press pressure, and lawsuits before casualty figures were first published in 2016; governance of AI on the battlefield needs that transparency pressure earlier and in greater force, before a catastrophic error rather than after it.
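To make the scale of that error arithmetic concrete, here is a minimal sketch of threshold flagging under a flat error rate. The cutoff value, the toy scores, and the names are illustrative assumptions, not documented parameters of Lavender; the only grounded figures are the article’s own, where 3,600 errors at roughly 10% implies on the order of 36,000 flagged individuals.

```python
# Illustrative sketch only: the cutoff, scores, and names below are invented
# assumptions, not documented parameters of the Lavender system.

def flag_targets(scores: dict[str, int], cutoff: int = 80) -> list[str]:
    """Flag everyone whose 1-100 score meets or exceeds the cutoff."""
    return [name for name, score in scores.items() if score >= cutoff]

def expected_wrong_flags(flagged_count: int, error_rate: float = 0.10) -> float:
    """Expected number of wrongly flagged people under a flat error rate."""
    return flagged_count * error_rate

# Toy scoring example (hypothetical people and scores).
scores = {"person_a": 91, "person_b": 42, "person_c": 80}
print(flag_targets(scores))               # ['person_a', 'person_c']

# The article's figures: ~10% of an implied ~36,000 flagged individuals.
print(expected_wrong_flags(36_000, 0.10))  # 3600.0
```

The point of the sketch is only that a single scalar threshold plus a seemingly small error rate, applied at population scale, produces thousands of wrong targets with no human review built into the arithmetic.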

2026-03-05 (Thursday) · e56f180c01d48cd0782acd283c6b5189da5aec60