
OmniAvatar’s Metrics & Training: Under the Hood of Next-Gen Avatars

TL;DR for operators: OmniAvatar is best read as a shift from “make the mouth move” to “make the person perform.” The paper introduces an audio-driven avatar video generation system that takes a reference image, an audio clip, and a text prompt, then generates facial and semi-body video with synchronised speech, adaptive body motion, and prompt-controlled scene elements.1 ...

June 24, 2025 · 16 min · Zelina