

In the report on Eka, the robot claw demonstrated near-natural manipulation: it nudged objects gently before gripping, chased and repositioned a light bulb, then screwed it into a socket, and handled varied items such as earbuds, brushes, and keychains. The author, who has covered robots for more than a decade, calls the performance unusually natural compared with the few dozen commercial robot arms on the market, most of which remain clumsy even when teleoperated, and describes it as a possible “ChatGPT moment” for physical tasks. Eka, based in Cambridge, Massachusetts, the same city as MIT, was founded by Pulkit Agrawal and Tuomas Haarnoja. Their ambition is to push robotic dexterity beyond merely “workable” manipulation toward generalizable, human-level or better skill. Agrawal argues that trillions of dollars of human labor flow through hand tasks, so robots that could take over those motions would have enormous economic implications.

Historically, OpenAI introduced Dactyl around October 2018, roughly four years before ChatGPT: a Shadow Robot hand trained with reinforcement learning in simulation to manipulate a Rubik’s Cube, learning through many thousands of simulated finger motions. A human speed-solver may take around three seconds, while the simulated approach could evaluate thousands of variations in the same time; yet Dactyl remained brittle. It depended on precise placement, could not recover when the cube slipped, and handled effectively only one instrumented cube. Agrawal and Haarnoja pointed to the sim-to-real gap as the key obstacle. Haarnoja subsequently worked at DeepMind on small humanoid football agents in simulation, while Agrawal at MIT focused on top-down grasping with gravity-aware control. By late 2021, Agrawal had trained a virtual hand to manipulate 2,000 objects upside down, strengthening the case that simulation alone could scale to high-dimensional manipulation tasks.

Eka now favors extensive in-simulator self-practice over the large-scale human-demonstration datasets common in vision-language-action (VLA) pipelines. Instead of outsourcing many hours of human hand-motion capture, its robots “invent” strategies through reinforcement learning, much as AlphaZero discovered its own game strategies, and then transfer them to reality. Eka reports custom tactile grippers and a vision-force-action model that incorporates physical quantities such as mass and inertia, learning both visual changes and force interactions during grasping. A chicken-nugget sorting task, which must be fast without damaging the food, showed throughput plus context-sensitive improvisation, including short-distance throws when a container moved out of reach. Observers also noted human-like behaviors such as fingertip probing before contact and recovery from fumbles. The team keeps training details private, but suggests this path could enable manipulation beyond household chores, including fine tasks like iPhone assembly, while the field remains divided on whether simulation-only or mixed human-demonstration approaches will win out.
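The simulation-first recipe described above commonly relies on domain randomization: physical parameters such as mass are re-sampled every episode, so a policy that scores well must work across the whole parameter range rather than memorize one object. The toy sketch below illustrates the idea only; Eka's actual training setup is not public, and the environment, policy, and update rule here are simplified assumptions for illustration.

```python
import random

def required_force(mass, g=9.81, friction=0.6):
    """Grip force needed to hold an object of the given mass (simplified model)."""
    return mass * g / friction

def rollout(gain, mass):
    """One episode's reward: negative squared error between applied and required force."""
    applied = gain * mass  # linear policy: force = gain * observed mass
    return -(applied - required_force(mass)) ** 2

def train(episodes=2000, lr=1e-3, sigma=0.1, seed=0):
    """Finite-difference policy search under domain randomization."""
    rng = random.Random(seed)
    gain = 0.0
    for _ in range(episodes):
        mass = rng.uniform(0.1, 2.0)   # domain randomization: new object mass each episode
        eps = rng.gauss(0.0, sigma)    # explore by perturbing the policy parameter
        if eps == 0.0:
            continue
        r_plus = rollout(gain + eps, mass)
        r_minus = rollout(gain - eps, mass)
        # Step along the finite-difference estimate of the reward gradient.
        gain += lr * (r_plus - r_minus) / (2 * eps)
    return gain

gain = train()
# Because mass varies every episode, the learned gain must approximate
# g / friction, which holds objects across the entire randomized range.
```

Because the reward is quadratic in the policy parameter, the two-sided finite difference here recovers the exact gradient; in realistic high-dimensional settings one would use a proper policy-gradient method instead, but the randomization principle is the same.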

2026-04-30 (Thursday) · 35edcecf866431be02f0fd4e29f5b2f89d729985