Google has launched Veo 3, a new AI video synthesis model that can create high-definition videos with synchronized audio. That is a significant step beyond previous models, which typically produced silent or very short clips. The new model can produce eight-second clips with voices, dialog, and sound effects.
The launch has drawn particular interest from those who have tracked the progress of AI video synthesis. In March 2023, a video of Will Smith eating spaghetti, generated with an open-source model called ModelScope, went viral, and Will Smith himself parodied it in February 2024. Developer Javi Lopez has now used Veo 3 to recreate the scene.
The outcome is striking yet imperfect: the eating is accompanied by crunching sound effects rather than the softer sounds one would expect from spaghetti. This likely stems from Veo 3’s training data featuring numerous examples of chewing mouths paired with crunchy sounds. The mismatch highlights the limitations and complexities of generating convincing AI outputs.
While Veo 3 is not yet a perfect solution, its capabilities demonstrate significant strides in AI video synthesis. As generative models continue to evolve, it will be essential to assess their strengths and weaknesses to unlock their full potential for various applications.
Source: https://arstechnica.com/ai/2025/05/googles-will-smith-double-is-better-at-eating-ai-spaghetti-but-its-crunchy