
Pushing the frontier of cost-effective reasoning. OpenAI has launched O3 Mini, a cost-efficient AI model for reasoning tasks, available to ChatGPT users and developers. This model excels in STEM areas such as science, math, and coding, offering features like function calling, structured outputs, and developer messages. O3 Mini provides different reasoning effort options (low, medium, high), but lacks vision capabilities. It's rolling out to select developers and ChatGPT Plus, Team, and Pro users, replacing O1 Mini. The update includes higher rate limits and the ability to fetch up-to-date information from the web. O3 Mini demonstrates higher performance and speed, particularly in competitive coding and complex reasoning tasks. The model aims to provide high-quality AI at lower costs, maintaining robust reasoning capabilities.

- 00:00 Introduction to O3 Mini
- 00:10 Key Features and Capabilities
- 01:13 Availability and Access
- 02:55 Performance Benchmarks
- 04:38 Future Prospects and Conclusion
- 05:46 Demo and Final Thoughts
---
type: transcript
date: 2025-01-31
youtube_id: HWIHfQV7U7M
---

# Transcript: OpenAI's O3 Mini in 6 Minutes

It is now here: OpenAI has just released o3-mini. This is rolling out to ChatGPT users as I'm recording this. In this video, I'll read through the blog post and we'll look at some of the benchmarks.

o3-mini is the newest and most cost-efficient model in the reasoning series, available in both ChatGPT and the API today. This was originally previewed as the last announcement right before Christmas of last year. They mention that this powerful, fast model advances the boundaries of what small models can achieve. It delivers exceptional STEM capabilities, with particular strengths in science, math, and coding, all the while maintaining the low cost as well as the reduced latency of OpenAI's o1-mini.

They mention that this is the first small reasoning model that supports highly requested developer features, including function calling, structured outputs, and developer messages. This makes it production-ready out of the gate. o3-mini will support streaming, and you can choose between the reasoning effort options of low, medium, and high. Effectively, what those toggles give you is control over how hard you want o3-mini to think about a particular task before it gives you that final response. At time of recording, o3-mini does not support vision capabilities, so developers should continue using OpenAI's o1 model for visual reasoning tasks.

o3-mini is rolling out to the Chat Completions API, Assistants API, and Batch API today to select developers in API usage tiers 3 to 5, so you would have had to spend at least $100 through their API before now. In terms of ChatGPT, they mention that ChatGPT Plus, Team, and Pro users can access o3-mini starting today, with Enterprise access coming in a week. o3-mini will replace o1-mini in the model picker, offering higher rate limits and lower latency, making it a compelling choice for coding, STEM, and logical problem solving. As part
of this update, we're tripling the rate limit for Plus and Team members from 50 messages per day with o1-mini to 150 messages per day with o3-mini. In addition to that, o3-mini can now pull in up-to-date information from relevant web sources. This is similar to something like Perplexity, or their search feature, or even DeepSeek, which just came out with that capability: before it reasons, it's actually going to search and then fetch information from the internet, so you can get information about today's events or whatever it might be.

Also starting today, free plan users can try out OpenAI's o3-mini by selecting the Reason option in the message composer or by regenerating a response. They mention that o3-mini uses medium reasoning effort to provide a balanced trade-off between speed and accuracy, and all paid users will have the option of selecting o3-mini-high in the model picker for a higher-intelligence version that takes a little bit longer to generate responses. Pro users will have unlimited access to both o3-mini and o3-mini-high.

They mention that this model has been optimized for STEM. o3-mini with medium reasoning effort matches o1's performance in math, coding, and science while delivering faster responses. Evaluations by expert testers show that o3-mini produces more accurate and clearer answers, with stronger reasoning abilities, than o1-mini. Testers preferred o3-mini's responses to o1-mini's 56% of the time and observed a 39% reduction in major errors on difficult real-world questions. Finally, with medium reasoning effort, o3-mini matches the performance of o1 on some of the most challenging reasoning evaluations, including AIME and GPQA.

First off, if we look at competition math, we see that on high this does outperform all of the previous reasoning models. On medium, it almost outperforms o1. With lesser reasoning effort, depending on medium or low, it doesn't quite outperform o1-mini, depending on the benchmark.

In terms
of PhD-level science questions, we see that this model performs very well, and it is going to be very fast. It outperforms o1-mini, but it doesn't quite outperform o1-preview or o1. In terms of FrontierMath, we see that it scores 9.2%, which is an improvement on both o1-mini and o1, which had 5.8 and 5.5 respectively.

Where this is really great is in competition code. We see that this is the best model out there right now, with an Elo of 2130, outperforming basically all of the models. Even with low reasoning effort, we see that it comes just shy of o1. In terms of the software engineering benchmark, we see that it's a 49.9 on high reasoning.

I'll just quickly go through some of these other metrics as well. In terms of the human preference evaluation, we see that the win rate over o1-mini is considerably higher. We see that the average time to first token is less than o1-mini's; by the looks of it, it's going to give you a response about two and a half seconds faster.

In terms of what's next, they mention that the release of OpenAI o3-mini marks another step in OpenAI's mission to push the boundaries of cost-effective intelligence. "By optimizing reasoning for STEM domains while keeping costs low, we're making high-quality AI even more accessible. The model continues our track record of driving down the cost of intelligence, reducing per-token pricing by 95% since the launch of GPT-4, while maintaining top-tier reasoning capabilities. As AI adoption expands, we remain committed to leading at the frontier, building models that balance intelligence, efficiency, and safety at scale."

At time of recording, I do not quite have access to o3-mini. Now, I do have a ChatGPT Plus tier, but as mentioned within the blog post, it will be rolling out today.

It's $1.10 per million tokens of input and $4.40 per million tokens of output. In terms of the knowledge cutoff for o3-mini, this goes up to October 2023, and for the context window, you can put in up to 200,000 tokens or
receive up to a maximum of 100,000 tokens. To put this into perspective, this is the same context window that you get with the o1 model.

Last, just for some vibe checks, I'll show you what o3-mini looks like in one of the applications that I've been working on. If I just say, "Generate me a Hacker News clone in React," I'll go ahead and ask that question. We see that it's loading, it's thinking. It thought for about 7 seconds there before it generated this response for us. If we take a look here, we have a Hacker News clone. It's actually going and getting the relevant sources. This is live information: I should be able to open these links, and that does work. It brings me right to the blog post that just came out moments ago.

Otherwise, let me know what your thoughts are on this model. What are you going to be using this for? Are you going to be using this model and swapping it out for other models within the applications that you're building? Let me know your thoughts in the comments below. If you found this video useful, please like, comment, share, and subscribe. Otherwise, until the next one.
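
To make the pricing and context-window figures from the video concrete, here is a minimal Python sketch of a per-request cost estimate. It is not from the video: the helper name `estimate_cost_usd` and the constants are hypothetical, the numbers are the ones quoted at time of recording ($1.10 and $4.40 per million tokens, 200,000 tokens in, 100,000 tokens out) and may have changed since, and the limit checks are a simplification of how the API actually enforces them.

```python
# Figures quoted in the video at time of recording; treat them as assumptions
# and check current pricing and limits before relying on them.
INPUT_USD_PER_MILLION = 1.10    # USD per 1,000,000 input tokens
OUTPUT_USD_PER_MILLION = 4.40   # USD per 1,000,000 output tokens
CONTEXT_WINDOW_TOKENS = 200_000  # quoted maximum you can put in
MAX_OUTPUT_TOKENS = 100_000      # quoted maximum you can receive


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a single request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Simplified checks against the quoted limits.
    if input_tokens > CONTEXT_WINDOW_TOKENS:
        raise ValueError("input exceeds the quoted 200,000-token context window")
    if output_tokens > MAX_OUTPUT_TOKENS:
        raise ValueError("output exceeds the quoted 100,000-token maximum")

    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: a 100,000-token prompt with a 10,000-token response.
    print(f"${estimate_cost_usd(100_000, 10_000):.4f}")
```

Under these assumed prices, output tokens cost four times as much as input tokens, so long responses dominate the bill well before long prompts do.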