
OpenAI's New GPT Image Model API 📸 Today OpenAI released their new gpt-image-1 model via API! 🌟 Last month, ChatGPT introduced image generation, and it quickly became a hit: over 130 million users created 700 million images in the first week. Now developers can integrate high-quality images into their own tools and platforms. 🌐 Access is available from any developer tier, but you must validate your identity through the OpenAI platform. Adobe, Airtable, Figma, Gamma, and others are already on board! 📊 Pricing details: $5 per million text input tokens, $10 per million image input tokens, and $40 per million output tokens. This translates to 2-19 cents per image depending on quality. 🎨 Features include a playground with examples, inpainting for image editing, adjustable aspect ratios, quality settings, and transparency support. 🔍 Limitations: some text rendering and visual consistency issues. Check out the full walkthrough!

00:00 Introduction to GPT Image One API
00:26 Features and Integration
00:56 Moderation and Pricing
01:30 Playground and Examples
02:23 Setup and Inpainting
03:19 Image Specifications and Limitations
04:39 Conclusion and Call to Action
---
type: transcript
date: 2025-04-23
youtube_id: k-G71JZA75A
---

# Transcript: OpenAI's New GPT Image Model API in 5 Minutes

📸 Today, OpenAI just released their brand new GPT Image 1 model through their API. To go through the blog post a little bit: they introduced image generation in ChatGPT last month, and it quickly became one of the most popular features, with over 130 million users around the globe creating more than 700 million images in just the first week. Today, they're bringing that same experience to their API with the model string gpt-image-1. This is going to enable developers to easily integrate high-quality, professional-grade images directly into their own tools and platforms.

Right now, you can access this model from any OpenAI developer tier. The one thing to note is that you will have to validate your identification through the OpenAI platform. Some companies that already have this available within their products include Adobe, Airtable, Figma, Gamma, and more.

As you might expect for an image generation API, there are some guardrails. There are moderation parameters that you can pass into your request: you can set auto, where there will be standard filtering, or you can specifically set low, where there will be less restrictive filtering.

A huge question that a lot of people had is pricing. It's $5 per million tokens of text input and $10 per million tokens of image input; you can pass in both images and text. Output is $40 per million tokens. They say this roughly translates to 2, 7, or 19 cents per generated image for low-, medium-, and high-quality square images, respectively. Once you've validated your ID, you can access this from the playground at platform.openai.com/playground/images.
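The moderation and quality options described above map to request parameters. Here's a minimal sketch of assembling them, assuming the parameter names mentioned in the transcript (`moderation`, `quality`, `size`); the accepted-value sets are illustrative, so check the official API reference:

```python
# Assemble request parameters for gpt-image-1, based on the options named in
# the transcript. The validation sets below are illustrative, not the
# authoritative list from the API reference.

def build_image_params(prompt: str, quality: str = "low",
                       size: str = "1024x1024",
                       moderation: str = "auto") -> dict:
    """Validate and assemble keyword arguments for an image generation request."""
    if quality not in {"low", "medium", "high"}:
        raise ValueError(f"unknown quality: {quality!r}")
    if moderation not in {"auto", "low"}:
        raise ValueError(f"unknown moderation level: {moderation!r}")
    return {
        "model": "gpt-image-1",
        "prompt": prompt,
        "quality": quality,
        "size": size,
        # "auto" applies standard filtering; "low" is less restrictive
        "moderation": moderation,
    }
```

With the OpenAI SDK installed and an API key set, this would be passed along as `client.images.generate(**build_image_params("a minimalist logo"))`.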
The nice thing with the playground is that there are a number of really great examples of how you can use this. Here's an example of a business card: a logo, as well as instructions for how to add it to the business card. Within the playground, you can also select the aspect ratio and the quality, and you can specify how many images you want to generate. One thing to note is that you will still incur API costs in the playground. Even though it is a playground, it isn't an area where you can try all of this out for free; you will still be billed for every generation you create. You can go through all of these different examples, or alternatively, if you want to create those Ghibli-style images that went viral, you can go ahead and do that. There are a number of great examples in here where you can try all of this out.

Now, in terms of setting it up, it's super straightforward. You can use the OpenAI SDK, and making a request is as simple as specifying the GPT Image model and the prompt. Additionally, you can edit particular parts of an image by uploading an image and a mask indicating which area should be replaced, a process known as inpainting. They have an example of that: here we have an image of a pool, we specify with inpainting what we want to have within the pool, and then there is the example of the flamingo. That shows how you can mask an image. That is a super cool feature that I think we're going to see in a ton of different applications, because instead of re-prompting the same image again and again, once we get an image that we like and just want to refine it from there, we can leverage something like inpainting.
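The inpainting flow described above can be sketched roughly as follows. This assumes the SDK's `images.edit` endpoint; the `inpaint` and `decode_b64_image` helpers are illustrative names, not part of the SDK, and this is a sketch rather than a tested integration:

```python
import base64

def inpaint(image_path: str, mask_path: str, prompt: str) -> bytes:
    """Replace the masked region of an image, per the transcript's workflow."""
    from openai import OpenAI  # imported lazily; requires OPENAI_API_KEY
    client = OpenAI()
    result = client.images.edit(
        model="gpt-image-1",
        image=open(image_path, "rb"),  # source image
        mask=open(mask_path, "rb"),    # same size/format, with an alpha channel
        prompt=prompt,                 # describes what fills the masked region
    )
    return decode_b64_image(result.data[0].b64_json)

def decode_b64_image(b64: str) -> bytes:
    """Responses carry base64-encoded image bytes; decode them for saving."""
    return base64.b64decode(b64)
```

The returned bytes can then be written straight to a file, e.g. `open("out.png", "wb").write(inpaint("pool.png", "pool_mask.png", "a flamingo floating in the pool"))`.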
There are some requirements for masking: the mask has to be the same format and size as the source image, and the mask image must contain an alpha channel. In terms of the available aspect ratios, you can specify square, portrait, or landscape, and the quality options, like I mentioned earlier, are low, medium, and high. The generated images are either JPEG or WEBP, and you can also specify the output compression; if you want a higher compression level, you can do that. Additionally, it supports transparency: if you want a transparent background for your generation, you'll be able to do that as well.

In terms of limitations, complex prompts can take up to 2 minutes to process. The other thing to note is text: although it's significantly improved from the DALL·E series, as they mention, the model can still struggle with precise text placement and clarity. The model can also struggle with consistency; they mention it can struggle to maintain visual consistency for recurring characters or brand elements across multiple generations.

Now, in terms of cost and latency, the lower-quality images obviously aren't going to take as long. A square low-quality image is 272 tokens, whereas a more expensive generation, like portrait mode on the high setting, is 6,240 tokens. Keep that in mind against the $40 per million tokens of output; that's roughly the pricing you're looking at. Otherwise, that's pretty much it for this video. I just wanted to do a really quick call-out that this API is now available. If you found this video useful, please comment, share, and subscribe.
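Using the token counts quoted above, the output-token cost works out as a quick back-of-envelope calculation (a sketch assuming the $40-per-million output price applies directly, ignoring input tokens):

```python
# Back-of-envelope output-token cost, using the $40-per-million-token output
# price and the two token counts quoted in the transcript: 272 tokens for a
# low-quality square image, 6,240 for a high-quality portrait image.

OUTPUT_PRICE_PER_TOKEN = 40 / 1_000_000  # $40 per million output tokens

def image_output_cost(tokens: int) -> float:
    """Dollar cost of one generated image's output tokens."""
    return tokens * OUTPUT_PRICE_PER_TOKEN

print(f"low square:    ${image_output_cost(272):.4f}")   # about 1 cent
print(f"high portrait: ${image_output_cost(6240):.4f}")  # about 25 cents
```

Note the gap: a high-quality portrait image costs roughly 20x more in output tokens than a low-quality square one, which is why the quality setting matters for anything generating images at scale.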