In January 2026, UpGuard researchers discovered a publicly accessible database large enough to stand out even after years of routine breach findings: roughly 3 billion email-password entries and about 2.7 billion records containing Social Security numbers (SSNs). The dataset's owner is unclear. Researchers believe it may have been aggregated from multiple older leaks, potentially including the 2024 National Public Data breach, which shows how recycled breach data can accumulate into a far larger identity-risk pool over time.

The exposed data was hosted by Hetzner; UpGuard reported it on January 16, 2026, and the data was removed on January 21, 2026. Because of the data's sensitivity and scale, researchers analyzed a sample of 2.8 million records rather than downloading the full trove. From pattern signals in that sample, including cultural references in passwords, they estimated that much of the material likely belongs to US users and dates from around 2015. Among the sampled SSN records, about 1 in 4 appeared valid. The sample is not statistically sufficient to project with certainty, but applying a simple 25% ratio to the 2.7 billion SSN-containing records would imply about 675 million potentially valid SSNs.
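The extrapolation above is simple arithmetic, and it can be sketched in a few lines. The summary does not say how the researchers judged an SSN to be "valid", so the check below is an assumption: it applies only the publicly documented structural rules (area number not 000, 666, or 900-999; group number not 00; serial number not 0000) and cannot tell whether a number was ever issued. The function names and the four sample strings are hypothetical, chosen so that one in four passes.

```python
import re

# Assumption: "valid" here means structurally plausible only.
# Accepts 9 digits with or without dashes (AAA-GG-SSSS).
SSN_PATTERN = re.compile(r"^(\d{3})-?(\d{2})-?(\d{4})$")


def looks_structurally_valid(ssn: str) -> bool:
    """Return True if the string passes the basic SSN structure rules."""
    match = SSN_PATTERN.match(ssn.strip())
    if not match:
        return False
    area, group, serial = match.groups()
    # Area numbers 000, 666, and 900-999 are never assigned.
    if area in ("000", "666") or area.startswith("9"):
        return False
    # Group 00 and serial 0000 are never assigned.
    if group == "00" or serial == "0000":
        return False
    return True


def extrapolate_valid(total_records: int, valid_fraction: float) -> int:
    """Naive projection: scale the sampled valid fraction to the full set."""
    return round(total_records * valid_fraction)


# Hypothetical sample values, not data from the exposed database.
sample = ["123-45-6789", "000-12-3456", "666-12-3456", "912-34-5678"]
valid_fraction = sum(looks_structurally_valid(s) for s in sample) / len(sample)

# 2.7 billion SSN-containing records, as reported in the text above.
estimate = extrapolate_valid(2_700_000_000, valid_fraction)
print(f"{valid_fraction:.0%} of sample valid -> about {estimate:,} records")
```

A projection like this inherits every weakness of the sample: if the 2.8 million records were not drawn uniformly, or if the aggregated dataset repeats the same SSN across sources, the figure counts records rather than distinct people.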

The key risk is not only past compromise but delayed exploitation: some individuals who were contacted had exposed data yet no known identity theft, implying a latent attack surface. Even older credentials retain value because password reuse persists, while SSNs are high-value identifiers that rarely change over a lifetime, so long-tail harm remains plausible for years or decades. The article frames this as part of a broader trend running from the major breaches of 2015 and 2017 onward: incomplete visibility, uncertain data quality, and large absolute numbers combine into persistent population-scale risk, even after a specific exposed database is taken down.