OpenAI has announced the launch of 4o image generation in ChatGPT.
GPT‑4o image generation excels at accurately rendering text, precisely following prompts, and leveraging 4o's inherent knowledge base and chat context—including transforming uploaded images or using them as visual inspiration. These capabilities make it easier to create exactly the image you envision, helping you communicate more effectively through visuals and advancing image generation into a practical tool with precision and power.
OpenAI says it trained models on the joint distribution of online images and text, learning not just how images relate to language, but how they relate to each other. Combined with aggressive post-training, the resulting model has surprising visual fluency, capable of generating images that are useful, consistent, and context-aware.
Improved Capabilities
● Text rendering: A picture is worth a thousand words, but sometimes generating a few words in the right place can elevate the meaning of an image. 4o's ability to blend precise symbols with imagery turns image generation into a tool for visual communication.
● Multi-turn generation: Because image generation is now native to GPT‑4o, you can refine images through natural conversation. GPT‑4o can build upon images and text in chat context, keeping results consistent from one turn to the next. For example, if you're designing a video game character, the character's appearance remains coherent across multiple iterations as you refine and experiment.
● Instruction following: GPT‑4o's image generation follows detailed prompts closely. While other systems struggle with roughly 5 to 8 objects, GPT‑4o can handle up to 10 to 20 different objects. The tighter binding of objects to their traits and relations allows for better control.
● In-context learning: GPT‑4o can analyze and learn from user-uploaded images, integrating their details into its context to inform image generation.
● World knowledge: Native image generation enables 4o to link its knowledge between text and images, resulting in a model that feels smarter and more efficient.
Access and Availability
4o image generation rolls out starting today to Plus, Pro, Team, and Free users as the default image generator in ChatGPT, with access coming soon to Enterprise and Edu. It's also available to use in Sora. For those who hold a special place in their hearts for DALL·E, it can still be accessed through a dedicated DALL·E GPT.
Developers will soon be able to generate images with GPT‑4o via the API, with access rolling out in the next few weeks.
Creating and customizing images is as simple as chatting with GPT‑4o: just describe what you need, including any specifics like aspect ratio, exact colors using hex codes, or a transparent background. Because this model creates more detailed pictures, images take longer to render, often up to one minute.
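As a rough illustration of how those specifics could be folded into a single description, here is a minimal Python sketch. It only assembles prompt text and does not call any OpenAI service, since API access has not yet rolled out. The `ImageRequest` class, the `build_prompt` function, and the exact wording of the generated sentences are hypothetical choices made for this example, not anything OpenAI has published.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class ImageRequest:
    """Hypothetical container for the specifics mentioned above."""
    subject: str
    aspect_ratio: str = "1:1"                 # written as "width:height"
    hex_colors: List[str] = field(default_factory=list)
    transparent_background: bool = False


def build_prompt(req: ImageRequest) -> str:
    """Assemble a plain-language prompt string from the request fields."""
    # Reject malformed values early so the prompt never carries a typo.
    if not re.fullmatch(r"\d+:\d+", req.aspect_ratio):
        raise ValueError(f"aspect ratio must look like '16:9': {req.aspect_ratio!r}")
    for color in req.hex_colors:
        if not re.fullmatch(r"#[0-9A-Fa-f]{6}", color):
            raise ValueError(f"not a six-digit hex color: {color!r}")

    # Normalize the subject so it ends with exactly one period.
    parts = [req.subject.strip().rstrip(".") + "."]
    parts.append(f"Aspect ratio {req.aspect_ratio}.")
    if req.hex_colors:
        parts.append("Use exactly these colors: " + ", ".join(req.hex_colors) + ".")
    if req.transparent_background:
        parts.append("Transparent background.")
    return " ".join(parts)


if __name__ == "__main__":
    request = ImageRequest(
        subject="A flat icon of a paper plane",
        aspect_ratio="16:9",
        hex_colors=["#1A73E8", "#FFFFFF"],
        transparent_background=True,
    )
    print(build_prompt(request))
```

The resulting string can be pasted into ChatGPT as an ordinary message. Validating the hex codes and the aspect ratio up front is a design choice for this sketch, meant to catch typos before the roughly one-minute render.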