← 返回 Avalaches

GPT‑5.2 被定位为在长上下文推理、工具使用、视觉与程式设计上皆为最先进。它在 OpenAI 的 MRCRv2 长上下文评测中居首,在 Tool Decat

GPT‑5.2 被定位为在长上下文推理、工具使用、视觉与程式设计上皆为最先进。它在 OpenAI 的 MRCRv2 长上下文评测中居首,在 Tool Decathlon 与 τ²‑Bench Telecom 等长程工具呼叫基准上领先,并在衡量复杂编码任务的 SWE‑Bench Pro 上排名第一。在视觉任务中,它将图表推理与 UI 理解错误降低超过 50%,显示在结构化视觉资料上的准确度有大幅提升。

多个客户——Notion、Box、Databricks、Hex、Triple Whale 与 Zoom——回报在模糊、资料量大的工作流程上表现更强、代理执行更可靠,这与基准成绩的提升一致。GPT‑5.2 已在 Responses 与 Chat Completions API 中提供,并会依任务难度自动调整推理强度。使用者也可明确将推理努力设定为 none、low、medium、high,或新增的 “xhigh” 等级以处理最困难的问题。

能力提升伴随更高成本:GPT‑5.2 比 GPT‑5/5.1 贵 40%,价格为每 100 万输入 token 1.75 美元、每 100 万输出 token 14 美元,且对快取输入提供 90% 折扣。它支援 Priority Processing、Flex Processing 与 Batch API 使用。OpenAI 也发布新的提示词指引并更新 Prompt Optimizer,帮助使用者更有效地取得这些增益。

GPT-5.2 is positioned as state of the art across long-context reasoning, tool use, vision, and coding. It tops OpenAI’s MRCRv2 long‑context eval, leads Tool Decathlon and τ²‑Bench Telecom for long‑horizon tool calling, and ranks first on SWE‑Bench Pro for complex coding. In vision tasks, it reduces chart‑reasoning and UI‑understanding errors by over 50%, indicating a large accuracy jump on structured visual data.

Multiple customers—Notion, Box, Databricks, Hex, Triple Whale, and Zoom—report stronger performance on ambiguous, data‑heavy workflows and more reliable agent execution, consistent with the benchmark gains. GPT‑5.2 is available in both the Responses and Chat Completions APIs, and automatically scales its reasoning to task difficulty. Users can explicitly set reasoning effort to none, low, medium, high, or the new “xhigh” tier for the hardest problems.

The model’s improved capability comes with higher cost: GPT‑5.2 is 40% more expensive than GPT‑5/5.1, priced at $1.75 per 1M input tokens and $14 per 1M output tokens, with a 90% discount on cached inputs. It supports Priority Processing, Flex Processing, and Batch API usage. OpenAI also released new prompting guidance and updated the Prompt Optimizer to help users capture these gains efficiently.

2025-12-12 (Friday) · 20073f17a2cb0ab17af28579bb93f0e53dd186db