AI System Can Generate Lifelike Videos from Single Photo

ByteDance has unveiled OmniHuman-1, an AI system that can create lifelike videos of people talking, singing, or playing instruments from a single photo. In a paper posted to the preprint server arXiv, the researchers report that OmniHuman significantly outperforms existing methods and delivers high-quality results across various scenarios.

The tool supports image inputs of any aspect ratio and can generate realistic human videos based on weak signal inputs, especially audio. Researchers have shared sample videos showcasing the tool’s capabilities, including realistic hand and body movements as well as animated characters, animals, and historical figures brought back to life.

Experts see both promise and risk in OmniHuman. Clinical associate professor Freddy Tran Nager calls the tool “very impressive” and says it could be used in educational settings or by content creators looking for a respite. However, Samantha G. Wolfe, an adjunct professor at NYU’s Steinhardt School of Culture, Education and Human Development, notes that the risks associated with AI-generated videos grow as the videos become more sophisticated.

The ByteDance team trained OmniHuman on over 18,700 hours of human video data, combining multiple types of inputs such as text, audio, and physical poses. OmniHuman is not the first AI tool to generate videos from a single photo, but in Nager’s view its access to extensive training data distinguishes it from other tools. The implications of this technology for businesses, governments, and individuals are still being debated, but one thing is clear: AI-generated humans are becoming increasingly sophisticated.

Source: https://www.forbes.com/sites/lesliekatz/2025/02/05/tiktok-owners-new-ai-tool-makes-lifelike-videos-from-a-single-photo