Generative AI Publication

Google Introduces Veo 2.0 To Compete With OpenAI's Sora

Jim Clyde Monge
Dec 17, 2024

Just days after OpenAI rolled out Sora to the public, Google responded by releasing its latest and most advanced AI video model yet, Veo 2.0. The new version adds support for up to 4K resolution, improved camera controls, and noticeably higher overall quality than its predecessor.

The timing of Veo 2.0’s release makes everyone wonder: Is Veo 2.0 better than Sora?

If this is the first time you’ve heard of Veo, it is Google’s AI video model that generates videos from text descriptions. The first version of Veo was introduced in May 2024 but was never made publicly available. Now, Google has unveiled Veo 2.0 with significant enhancements and broader functionality.

What’s New in Veo 2.0?

Google highlights three main improvements in Veo 2.0:

  1. Enhanced realism and fidelity

  2. Advanced motion capabilities

  3. Greater camera control options

[Image: summary of what’s new in Veo 2.0. Image by Jim Clyde Monge]

To demonstrate the capabilities of Veo 2.0, Google conducted human evaluations against other leading video generation models like Meta’s Movie Gen, Kling v1.5, Minimax, and Sora Turbo.

Evaluators viewed videos generated from 1,003 prompts in Meta’s MovieGenBench dataset. All comparisons were made at 720p resolution, but durations varied: Veo’s samples were 8 seconds long, VideoGen’s samples (presumably Meta’s Movie Gen) were 10 seconds, and the other models’ samples were 5 seconds.

[Image: tables of human evaluation results for Veo 2.0 on MovieGenBench. Raters were shown the full duration of each video. Image credit: Google]

Looking at the tables above, you can see that Veo 2.0 performs best on both overall preference and prompt adherence, meaning how accurately it follows the prompt.

Of course, given Google’s not-so-pretty track record with product announcements, you should take these benchmarks with a grain of salt. It’s always best to get your hands on these AI video generators before drawing any conclusions.

X user Blaine Brown ran a nice experiment in which he asked various video models to generate a video of a chef’s hands slicing a steak. This is very challenging for AI models: they have to get the hands, the physics and movement of consecutive slices, the interpretation of “steak done perfectly”, the steam, the juices, and more right all at once.

Here’s the prompt and the final results:

Prompt: A pair of hands skillfully slicing a perfectly cooked steak on a wooden cutting board. faint steam rising from it.
