
Chinese AI groups have moved ahead of US rivals in video generation, a fast-growing corner of generative AI used in advertising, ecommerce, and entertainment. Beijing-based ByteDance and Kuaishou have built advantages by training models on huge libraries of short-form video, while US leaders such as OpenAI, Google, and Anthropic still dominate large language models and coding but appear weaker in video quality and usability. Developers say the gap reflects access to training data, especially the large proprietary video pools created by Chinese short-video platforms.

The article highlights several leading systems, including Kuaishou's Kling, ByteDance's Seedance 2.0, and MiniMax's Hailuo, all of which rank highly on Arena's user-voted leaderboard; Google's Veo 3 remains competitive, helped by access to YouTube footage. Users and creators say Chinese tools are smoother, better at following prompts, and more realistic, with stronger audio synchronization and voice stability. Director AI founder Ben Chiang said US models are often poor at video generation and frequently hit content-control errors, while filmmaker George Won said Seedance 2.0 preserved face detail and lighting even during fast camera moves.

The business implications are large because video generation needs far more tokens and computing power than text or audio, which makes scaling expensive; OpenAI discontinued Sora in March partly because of computing costs. Chinese models are also cheaper and more flexible for many creators, although strong demand for Seedance 2.0 since February has led to restricted access and long wait times, and ByteDance has asked some US enterprise clients to pay about $2 million upfront. Legal and policy risks remain significant, with ByteDance facing copyright threats over character-based videos. Even so, the commercial upside is growing: one retailer reportedly wanted 100,000 product-page videos, and companies now see AI video as good enough to support brand advertising at scale.
2026-05-18 (Monday)