Google Introduces Veo 2.0 To Compete With OpenAI's Sora

Dec 17, 2024

Just days after OpenAI rolled out Sora to the public, Google responded by dropping its latest and most advanced AI video model yet, Veo 2.0. This new version of Veo adds 4K resolution, improved camera controls, and much higher overall quality than its predecessor.

The timing of Veo 2.0’s release makes everyone wonder: Is Veo 2.0 better than Sora?

If this is the first time you’ve heard of Veo, it is Google’s AI video model that generates videos from text descriptions. The first version of Veo was introduced in May 2024 but was never made publicly available. Now, Google has unveiled Veo 2.0 with significant enhancements and broader functionality.

What’s New in Veo 2.0?

Google highlights three main improvements in Veo 2.0:

Enhanced realism and fidelity
Advanced motion capabilities
Greater camera control options

To demonstrate the capabilities of Veo 2.0, Google conducted human evaluations against other leading video generation models like Meta’s Movie Gen, Kling v1.5, Minimax, and Sora Turbo.

Evaluators viewed videos generated from 1,003 prompts in Meta’s MovieGenBench dataset. Videos were compared at 720p resolution, but durations varied: Veo’s samples were 8 seconds long, VideoGen’s samples were 10 seconds, and the other models produced 5-second outputs.

Human evaluation results for Veo 2.0. Participants viewed 1,003 prompts and respective videos on MovieGenBench, a benchmark dataset released by Meta. All comparisons were done at 720p resolution. Veo sample duration is 8s, VideoGen’s sample duration is 10s, and other models’ durations are 5s. The full video duration was shown to raters. (Image credit: Google)

Looking at the tables above, you can see that Veo 2 performs best both on overall preference and on how accurately it follows prompts.

Of course, given Google’s not-so-pretty track record when it comes to product announcements, you need to take these benchmarks with a grain of salt. It’s always important to get your hands on these AI video generators before drawing any conclusions.

X user Blaine Brown ran a nice experiment in which he asked various video models to generate a video of a chef’s hands slicing a steak. This is very challenging for AI models: they have to handle hands, consecutive slicing physics and movement, the interpretation of ‘steak done perfectly’, steam, juices, and more.

Here’s the prompt and the final results:

Prompt: A pair of hands skillfully slicing a perfectly cooked steak on a wooden cutting board. faint steam rising from it.
