OpenAI Releases GPT-4o With Native Image Generation
GPT-4o with native image generation allows you to generate images, edit a single image with a text prompt, and even merge multiple photos.
I thought Google would hold the throne for a while as the best AI image editing model with the recently released Gemini 2.0 Flash. I was wrong. Today, OpenAI released GPT-4o with native image generation. This new model allows you to generate images, edit a single image with text prompts, and even combine multiple images into a single photo.
Unlike the previous image generator in ChatGPT, which was powered by DALL-E 3, the new image generator is part of the GPT-4o model itself. Yes, GPT-4o is an “omnimodal” model capable of processing and generating text, audio, and images.
The shift from a separate image model to native integration within GPT-4o is a major architectural change. Coupling language understanding and visual synthesis more tightly in one model is meant to improve both performance and capabilities.
Initial access to this new feature is rolling out to Plus, Pro, Team, and Free ChatGPT users starting in March 2025. Access for Enterprise and Education users, as well as API access for developers, is expected to follow soon.
If you want to learn more about how it works, check out the white paper here.
How to Access
There are a few ways to try the new model: