Google's Veo 3 Now Supports Image-To-Video
The world's most capable AI video generator just got more powerful with support for image inputs. You can now set the first or last frame of a video before generating it.
Google just dropped one of the most highly anticipated features for its latest video model, Veo 3. It now supports video generation from an image input.
This means you can upload a portrait photo and feed a script into the prompt field to generate a video of the subject talking. You can also make them sing, narrate promotional lines, or even crack jokes.
This update also solves one of the trickiest workflows in AI video — creating consistent characters across multiple scenes. You can generate multiple portraits from a trained image model and feed them into Veo 3 to turn them into video clips.
The image input feature is part of the new updates rolled out in June 2025. When you open the Flow app on Google Labs, you’ll see this message:

You can now make your images talk with Veo 3
Veo 3 first-frame to video now supports speech. Upload a picture of your character and give them a voice. Note that audio is still a beta feature, so your videos might not always return sound.
Reminder: Make sure that you have the necessary rights to any content that you upload. Do not generate content that infringes on others’ intellectual property or privacy rights.
Flow is the evolution of VideoFX, a Google Labs experiment that launched last year. It helps you create cinematic clips, seamlessly transition them into scenes, and maintain enough consistency to tell a story.
Here’s a list of all the capabilities of Flow:

At the initial launch, only the Text to Video feature was supported. Support for the Frames to Video tool was only added today.
How Frames to Video Works
To get started, head over to the Google Labs Flow page and log in with your Google account. On the Flow homepage, click the “Create with Flow” button.

You’ll be redirected to the video editor page. In the input type dropdown, select “Frames to Video.”

For the starting frame, you can either generate an image using the Imagen 4 model or upload your own.

In this example, I used an image I created with Flux Labs AI for an AI-generated product video ad. Here’s the exact prompt I used, in case you want to recreate it:
Prompt: Young woman in a sun hat applying SkinProtect sunscreen to her cheek. She gently blends it in, revealing no residue. Shot in natural daylight, realistic skin texture, soft beach background blur, handheld camera feel. The sunscreen is held on her right hand, her left hand is applying the sunscreen to her face. She is smiling and feeling great.

You can crop the frame to focus on the area you want to include in the final output.
There’s also an optional end frame and camera movement. For camera movement, you can control how the camera behaves in the scene. If you want a slow zoom-in, choose “Dolly in.” If you want the camera to revolve around the subject, choose “Orbit left” or “Orbit right.”
In this example, I set it to “Static” to remove all camera movement and keep the focus on the subject.

For the video model, there are multiple options to choose from:
Veo 3 Fast: Fastest render speed with decent quality.
Veo 2 Fast: Older fast model with slightly lower quality.
Veo 3 Quality: Highest quality output, but the most expensive option.
Veo 2 Quality: Good quality with faster render time than Veo 3.

Next, add your video prompt. Describe how you want the subject to move, talk, or behave in the clip. Since Veo 3 supports native audio generation, you can write what the subject should say or even add sound effects.
Prompt: A woman in a sun hat applying SkinProtect sunscreen to her cheek. She says the line “I really love this sunscreen. It’s not sticky at all and totally protects my skin. You can enjoy the sun anytime, anywhere” in a happy and excited tone
Click the submit button and wait for the video to finish generating. Here’s what the final output looks like:
The original resolution is 720p, but there’s an option in the download menu to upscale it to 1080p.

What’s more awesome is that upscaling the video to 1080p costs 0 credits. I’m not sure how long this will stay free, so take advantage of it while you can.
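In case the free upscale ever goes away, a plain local resize of the downloaded 720p file is always an option. Here's a minimal Python sketch that shells out to ffmpeg; the filenames are placeholders, and this is a simple pixel resize, not the AI upscaling Flow applies:

```python
# Hypothetical fallback: resize the downloaded 720p clip to 1080p locally.
# Assumes ffmpeg is installed and "veo3_clip_720p.mp4" is the downloaded file.
import subprocess

subprocess.run(
    [
        "ffmpeg",
        "-i", "veo3_clip_720p.mp4",   # assumed input filename
        "-vf", "scale=1920:1080",     # resize the video stream to 1080p
        "-c:a", "copy",               # keep the original audio untouched
        "veo3_clip_1080p.mp4",        # output file
    ],
    check=True,
)
```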
You can learn more about Google Flow in this article.
Final Thoughts
Google just built the most powerful video model so far.
Veo 3 is by far the best on the market at creating high-quality videos with consistent characters, scenes, and full dialogue support. Flow was released only a couple of weeks ago, and Google has been pushing out new features at an incredibly fast pace.
Video is going to take over the AI space in 2025. YouTube and other social media platforms are flooded with AI-generated videos. People are getting more creative with their prompts, and Veo 3 makes it easy to create viral videos.
That’s exactly why I created ReelPal AI. It’s an AI video generator that works a lot like Google Flow. It supports video models like Veo 3, Kling 2.1, Hunyuan, and Hailuo. You can add voice narration, background music, and sound effects — all in one platform.
I’m still waiting for Veo 3’s frame-to-video API to roll out on Fal AI or Replicate. Once that happens, it’s going to speed things up and give content creators more flexibility in how they build and automate their video workflows.
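For a rough idea of what that automation could look like, here's a minimal Python sketch using Replicate's client. The model slug and the input field names (`first_frame`, `camera_movement`) are purely my assumptions, since the frame-to-video endpoint hasn't been published there yet; only the `replicate.run()` call itself is the library's real API.

```python
# Hypothetical sketch: calling a Veo 3 frames-to-video model on Replicate
# once it becomes available. Model slug and input fields are assumptions.
import replicate

output = replicate.run(
    "google/veo-3-frames-to-video",  # assumed model slug, not yet published
    input={
        "prompt": (
            "A woman in a sun hat applying SkinProtect sunscreen to her cheek. "
            "She says 'I really love this sunscreen' in a happy, excited tone."
        ),
        "first_frame": open("first_frame.png", "rb"),  # assumed parameter name
        "camera_movement": "static",                   # assumed parameter name
    },
)

# For video models, the output is typically a URL or file pointing to the clip.
print(output)
```

Once something like this is live, the manual Flow steps above could be scripted end to end, from generating the starting frame to rendering the final clip.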
Hi there! Thanks for making it to the end of this post! My name is Jim, and I’m an AI enthusiast passionate about exploring the latest news, guides, and insights in the world of generative AI. If you’ve enjoyed this content and would like to support my work, consider becoming a paid subscriber. Your support means a lot!