Astronomers have used an AI-assisted method, developed by European Space Agency (ESA) researchers David O’Ryan and Pablo Gómez, to systematically search the Hubble Legacy Archive for rare objects. A neural network called AnomalyMatch sifted through nearly 100 million image cutouts in two and a half days (about 60 hours) and flagged nearly 1,400 anomalous sources. Expert inspection confirmed more than 1,300 of them as true anomalies, more than 800 of which had not previously been documented in the scientific literature. The candidate rate is therefore roughly 1,400/100,000,000 = 0.0014% (about 14 per million), which shows that rare targets can still be recovered at scale from 35 years of Hubble Space Telescope archives.
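The rates implied by these figures can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the rounded numbers quoted above; the constant and function names are illustrative, and the throughput assumes the full 60 hours were spent screening.

```python
# Figures as quoted in the text; all are approximate ("nearly", "more than").
CUTOUTS = 100_000_000   # image cutouts screened (nearly 100 million)
CANDIDATES = 1_400      # sources flagged as anomalous (nearly 1,400)
CONFIRMED = 1_300       # confirmed by expert inspection (more than 1,300)
NEW = 800               # previously undocumented (more than 800)
HOURS = 2.5 * 24        # two and a half days of runtime


def summarize():
    """Return the headline ratios derived from the quoted figures."""
    return {
        "candidate_rate_percent": 100 * CANDIDATES / CUTOUTS,
        "candidates_per_million": 1e6 * CANDIDATES / CUTOUTS,
        "confirmed_fraction": CONFIRMED / CANDIDATES,
        "new_fraction_of_confirmed": NEW / CONFIRMED,
        "cutouts_per_second": CUTOUTS / (HOURS * 3600),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.4g}")
```

Because the inputs are rounded bounds rather than exact counts, the derived fractions (about 93% of candidates confirmed, about 62% of confirmed anomalies new to the literature, and a few hundred cutouts screened per second) should be read as rough estimates.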
Most of the anomalies were merging or interacting galaxies with unusual shapes or long trailing tails of stars and gas. Many others were gravitational lenses, in which the gravity of a foreground galaxy warps spacetime and bends the light of a background galaxy into a ring or arc. The search also recovered other rare classes, including galaxies with huge clumps of stars, jellyfish galaxies with gaseous “tentacles”, and edge-on planet-forming disks that resemble hamburgers or butterflies, plus several dozen objects that defied classification altogether. Compared with manual searches by experts, or with citizen-science projects that scale up human inspection, the pipeline compresses the first pass over a vast archive into a high-scoring shortlist that specialists can review.
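The idea of reducing a huge archive to a short, high-scoring list for human review can be illustrated generically. The sketch below is not AnomalyMatch's code or API: the function names are hypothetical, and random numbers stand in for the anomaly scores a real model would assign. It only shows how the top k items can be kept from a stream of scored cutouts without holding the whole archive in memory.

```python
import heapq
import random


def shortlist(scored_cutouts, k):
    """Return the k highest-scoring (score, cutout_id) pairs, best first."""
    # heapq.nlargest consumes any iterable and keeps only k items at a time.
    return heapq.nlargest(k, scored_cutouts)


def fake_scores(n, seed=0):
    """Yield (score, cutout_id) pairs; random scores stand in for a model."""
    rng = random.Random(seed)
    for cutout_id in range(n):
        yield (rng.random(), cutout_id)


if __name__ == "__main__":
    # Hypothetical run: keep the 5 most anomalous of 100,000 scored cutouts.
    for score, cutout_id in shortlist(fake_scores(100_000), k=5):
        print(f"cutout {cutout_id}: score {score:.4f}")
```

In a setup like this, the experts would inspect only the shortlist, so the human effort scales with k rather than with the size of the archive.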
The result arrives as data volumes grow rapidly. Euclid began surveying billions of galaxies across about one third of the night sky in 2023; the NSF–DOE Vera C. Rubin Observatory is expected to begin its 10-year Legacy Survey of Space and Time, which will gather more than 50 petabytes of images; and NASA’s Nancy Grace Roman Space Telescope is scheduled to launch no later than May 2027. Tools like AnomalyMatch turn the “needle in a Universe-sized haystack” problem from exhaustive human review into scalable machine triage followed by expert confirmation, which should increase the scientific yield of both legacy archives and upcoming surveys.