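The article argues that AI is accelerating mathematical research and changing how mathematicians work. AlphaEvolve had earlier turned up hypercubes unexpectedly in the Bruhat intervals of a permutation group: the researchers had set out to find a different structure, and their January 3, 2026 preprint reports a large structure that had been there all along, unnoticed. Williamson says results like this used to require heavy engineering effort, whereas with LLMs he can now compress an experiment that once took two weeks into 20 minutes. Most attempts still fail, but AI has become a powerful tool for exploring new mathematical territory.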
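In algebraic geometry, Ravi Vakil, Balázs Elek, and Jim Bryan used a specialized version of Gemini together with DeepMind tools to study the problem of embedding spheres into flag varieties. They found that certain properties appear earlier than expected, apparently holding well before the limit at infinity is approached. They first had the AI solve a simpler case, obtaining a proof that was clear, correct, and checkable line by line; humans working with the AI then completed the general case, and the proof appeared in a January 12, 2026 preprint. The episode is viewed as a model case: collaboration between experts and large technology companies moves mathematics forward faster, and step-by-step checking secures correctness.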
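The article also stresses risks and limits: erroneous AI-generated content is polluting the scholarly ecosystem, and Hamkins goes so far as to say the journal system is being flooded with junk. Mathematicians are therefore turning to formal proof and autoformalization, hoping to handle proofs in machine-verifiable formal languages. Tao holds that AI is reliable enough for serious applications only with verification, but the process remains time-consuming and demands deep mathematical expertise.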
The article argues that AI is rapidly reshaping mathematical research. AlphaEvolve had already uncovered a surprisingly large hypercube structure inside Bruhat intervals in a permutation-group setting, reported in a January 3, 2026 preprint. Williamson noted that what once demanded major engineering effort can now be explored far more quickly: with LLMs, he can run in 20 minutes an experiment that used to take two weeks, even if most attempts fail.
A second example comes from algebraic geometry, where Ravi Vakil, Balázs Elek, and Jim Bryan used Gemini-based tools from DeepMind to study sphere embeddings in flag varieties. Their AI-assisted work suggested that a stability phenomenon appears much earlier than expected, and after the model produced a clear proof of a simpler case, they used it to complete the general argument, publishing the result in a January 12, 2026 preprint. The episode highlights both speed and verifiability: the proof could be checked line by line.
The piece also stresses the downsides. Mathematicians warn that AI-generated nonsense is polluting journals, which is why formal proof checking and autoformalization are gaining traction; Tao says AI without validation is too unreliable for serious use. At the same time, educators fear that students will lose training opportunities because AI can solve many assigned problems instantly, even as researchers believe the technology may someday surpass humans in measurable ways while still lacking the long-term planning needed for the hardest mathematical “Everests.”