By May 2026, Chad Markey, a 33-year-old Dartmouth medical student, had spent six months coding, emailing, and filing information requests after failing to land a single interview, despite 82 applications, a strong publication record, and strong letters of recommendation. His Medical Student Performance Evaluation stated that he had taken three "voluntary" leaves of absence totaling about 22 months and had extended his third year by a year for "personal reasons," but he says the absences were medically necessary because of ankylosing spondylitis. This unfolded after the post-pandemic shift to virtual interviews, when residency application volumes surged; in the 2025–26 cycle, about 1,500 programs, roughly 30%, were said to use Thalamus's Cortex. Several states regulate AI screening (laws in Illinois, New Jersey, and Colorado are not yet in effect), and California requires periodic bias testing, yet none gives applicants a clear, individualized explanation of their case.
Cortex showed grade-display errors soon after the September 2025 deadline. Thalamus later said it had verified only 10 inaccuracies across more than 4,000 customer inquiries, then found no additional errors across more than 12,000 inquiries, and cited a 99.3% accuracy rate. Yet clinicians continued to report "wildly inaccurate" grades, and a study in The Laryngoscope documented persistent errors. Markey's stress tests with four NLP models showed that wording alone could change outcomes: replacing "personal reasons" with medically specific language raised the sentiment score, and in a synthetic test of 6,000 equally qualified applicants, with logistic regression selecting the top 12%, applicants whose leave was framed in medical terms were 66% more likely to be selected.
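The synthetic experiment described above can be sketched in a few lines. What follows is a hypothetical reconstruction, not Markey's actual code: the feature construction, effect sizes, and random seed are all assumptions chosen only to illustrate how a sentiment feature that correlates with leave wording can shift who lands in a top-12% cut among otherwise identically qualified applicants.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# 6,000 synthetic applicants whose underlying quality is drawn from the
# same distribution regardless of how their leave of absence is worded.
n = 6000
medical_framing = rng.integers(0, 2, n)   # 1 = medically specific wording (assumed feature)
base_quality = rng.normal(0.0, 1.0, n)    # identical for both groups by construction

# Assumption: a sentiment score extracted from the application text is
# nudged upward by medical framing, as in the stress tests described above.
sentiment = 0.5 * medical_framing + rng.normal(0.0, 1.0, n)

# A screener fit (hypothetically) to past decisions that partly tracked
# sentiment will learn to weight it alongside genuine quality.
past_outcome = (0.8 * sentiment + base_quality + rng.normal(0.0, 1.0, n)) > 1.2
X = np.column_stack([sentiment, base_quality])
model = LogisticRegression().fit(X, past_outcome)

# Select the top 12% by predicted probability, mirroring the synthetic test.
scores = model.predict_proba(X)[:, 1]
cutoff = np.quantile(scores, 0.88)
selected = scores >= cutoff

rate_med = selected[medical_framing == 1].mean()
rate_per = selected[medical_framing == 0].mean()
print(f"selection rate, medical framing:   {rate_med:.3f}")
print(f"selection rate, 'personal reasons': {rate_per:.3f}")
print(f"relative lift: {rate_med / rate_per - 1:.0%}")
```

Because both groups are equally qualified by construction, any gap in selection rates is driven entirely by the wording-linked sentiment feature; the exact lift depends on the assumed effect sizes, not on any real Cortex behavior.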
Markey then moved from theory to evidence: he built a reverse-engineered model from Medicratic's patents, contacted individual programs, and filed a data-access request under New Hampshire's privacy law, which allows a 45-day response window. Before Match Day on March 20, cold emails produced interview offers from 10 programs, and he ultimately matched into Columbia University's psychiatry residency, starting in July. Thalamus later stated that none of the programs he applied to used Medicratic components and that Cortex does not score or rank applicants; its AI is used mainly for grade normalization. Thalamus also planned a future opt-in AI screener, but due process and recourse remain unclear for applicants without technical resources.