
OpenAI's Latest Updates: Embedding Models Explained

In this video, I discuss the latest release from OpenAI, focusing on the new text embedding models. I cover what embeddings are, why they matter in modern natural language processing, and detailed insights into the new small and large models, which deliver stronger performance at a significantly lower price than their predecessors. I also discuss the flexibility in choosing the size of the embeddings, allowing trade-offs between cost and performance. Finally, I review other updates: reduced GPT-3.5 Turbo prices, an updated GPT-4 Turbo preview, an updated moderation model, and a new platform for managing API usage. Upcoming videos will provide a practical guide to using these new models.
---
type: transcript
date: 2024-01-26
youtube_id: pTIHIGJFMGk
---

# Transcript: OpenAI's Text-Embedding-3 in 7 Minutes

In this video we're going to be diving into the latest release from OpenAI: their newest embedding models. Before I dive into the new updates, I first wanted to quickly clarify what embeddings are, for those who might not be aware or who are just looking for a refresher. Embedding is essentially taking text or code and converting it into a mathematical vector, which is a series of numbers. Each of those numbers represents the essence of the input, and capturing that essence in a numerical representation is really at the heart of natural language processing: it allows machines to grasp the subtleties of language by understanding the context and meaning behind words and sentences. At its core, embeddings that are numerically similar are also semantically similar. If we look at the example here, you can see that "bovine buddies" and "moo" are grouped closely together, similar to "wolf" and "canine companions". In the far left corner, "a quarterback throws a football" is not really related to the animal kingdom on the right-hand side, for lack of a better term. If you scroll down a little, there's also a really nice interactive visualization that shows embeddings of text samples. You can zoom in and out and drag things around to get a better idea of what vectors and embeddings look like in three-dimensional space. In this example you can see animals in red, athletes in green, villages in pink, transportation in purple, and then film. The nice thing with this example is that all of the related items are grouped together, as they should be, and that's how you can think about it when you're setting up your embeddings.

So, the announcements today are two new embedding models: text-embedding-3-small and text-embedding-3-large. Let's touch on text-embedding-3-small first. There's stronger performance even for this new smaller version. text-embedding-ada-002 is the model that was released in December 2022. When you compare text-embedding-3-small to text-embedding-ada-002, the average score on the MIRACL benchmark increased from 31.4% to 44%, and on the commonly used MTEB benchmark it increased from 61.0% to 62.3%. One of the more exciting things with this model is that it's five times cheaper than text-embedding-ada-002, so it costs just a fraction of a penny to embed over a thousand tokens. The other thing to note is that they don't have plans to deprecate text-embedding-ada-002, so if you're still using those endpoints and they're not easy to change in what you've set up, don't worry about it just yet.

Now to get into the larger model they released: text-embedding-3-large. This is the next generation of their embedding models, and it has a considerable number of dimensions, going as high as 3,072, twice as many as ada-002's 1,536. Obviously the performance of this model is considerably higher: if you look at the MIRACL average, it really is leaps and bounds ahead of both ada-002 and text-embedding-3-small, and you can see the same on the MTEB average. The other thing with text-embedding-3-large is that it's only fractionally more expensive than ada-002, so great news for AI developers all around.

The other really cool thing with these models is native support for shortening embeddings. If we hop back to the example we were looking at earlier, imagine there are more dimensions on each axis. That's going to occupy more space, and with more space you can arguably pinpoint more precisely how related different items are to one another. That said, if you go for the largest size and use the full 3,072 dimensions, you're going to incur higher compute costs by continually having to query more vectors of a greater size. What these new models allow you to do is decrease the size of the dimensions ad hoc. The benefit of that flexibility is that you can choose your own trade-off between performance and cost. If you're using the latest and greatest with the most dimensions, it's going to cost you more, and it's also going to take longer to work through all the queries: if you have hundreds of thousands or potentially millions of vectors to search through, that incurs a higher cost at retrieval time as well as a higher cost of embedding, and queries will be slower. So if you're looking for something fast and you're able to create really high-quality embeddings, you'll likely be able to get away with a smaller size. But like everything in programming, there's a trade-off: do you want accuracy or do you want speed? That's generally what it boils down to. With the native support for shortening, you have the flexibility to shrink the dimensions of text-embedding-3-large all the way down to 256 from that 3,072. Even when you shrink text-embedding-3-large down to 256 dimensions and compare it to text-embedding-ada-002, which has 1,536 dimensions, it still outperforms it. And if you're using a vector database like Pinecone serverless, the smaller size is going to be both faster and cheaper to run your embeddings. All you have to do to change the dimensions when you're using their models is specify the dimension size you want within the API parameter that you pass in.

To quickly touch on some of the other announcements: they're decreasing GPT-3.5 Turbo prices once again, this time by 50%, and they're also introducing a new gpt-3.5-turbo-0125, which includes various improvements for higher accuracy when responding in different requested formats, say if you're asking for JSON or YAML or something like that. Hopefully this improvement offers better-quality results for those types of requests. There's also a new GPT-4 Turbo preview that just came out. This model completes tasks like code generation more thoroughly than the previous model, and the intent of the preview model is to reduce the quote-unquote "laziness" where the model doesn't complete a task. You might have experienced this laziness where, say, you ask it to return a portion of code and it returns all of your code, but there's a particular function right in the middle with a comment that says "fill in your function here", which isn't always the most helpful and can actually be quite annoying. Hopefully this new preview model resolves that laziness for those types of requests. There's also a note that if you want the newest releases of GPT-4, you can set the model name to the gpt-4-turbo-preview alias and you'll pick up all of those releases as they come. There's also an updated moderation model, if that's something you're interested in. And finally, they're launching a platform to give developers more visibility into the usage of their API keys: you'll be able to see the number of tokens you're using across all of the different models and all of your different keys, which will make managing your OpenAI applications that much easier.

If you're interested in seeing how to try out these models, I'm going to be putting out some videos in the coming days where I'll show you how to integrate this into real-life demonstration projects that you can use as a starting point for these new embedding models. That's it for this video. Hopefully you found it useful; if you did, please like, comment, share, and subscribe. Otherwise, until the next one.
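The shortening idea discussed in the transcript can be sketched locally. The new models are trained so that a prefix of the embedding is itself a usable embedding: keep the first k components and re-normalize. The sketch below uses made-up toy vectors (not real model output) purely to illustrate that cosine-similarity rankings tend to survive truncation; the vector values and names are assumptions for illustration.

```python
import math

def normalize(v):
    # Scale a vector to unit length so a plain dot product equals cosine similarity.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_similarity(a, b):
    # Assumes both vectors are unit length.
    return sum(x * y for x, y in zip(a, b))

def shorten(embedding, dims):
    # Keep only the first `dims` components, then re-normalize.
    # This mirrors what requesting a smaller dimension size does.
    return normalize(embedding[:dims])

# Toy 8-dimensional "embeddings" standing in for real model output.
cow = normalize([0.9, 0.8, 0.1, 0.2, 0.05, 0.1, 0.3, 0.2])
bovine = normalize([0.85, 0.75, 0.15, 0.25, 0.1, 0.05, 0.25, 0.3])
football = normalize([0.05, 0.1, 0.9, 0.8, 0.7, 0.2, 0.1, 0.05])

print(cosine_similarity(cow, bovine))    # high: semantically related
print(cosine_similarity(cow, football))  # low: unrelated

# After shortening to 4 dimensions, the ranking is preserved here.
print(cosine_similarity(shorten(cow, 4), shorten(bovine, 4)))
print(cosine_similarity(shorten(cow, 4), shorten(football, 4)))
```

The trade-off in the transcript falls out directly: half the dimensions means half the storage and faster distance computations in a vector database, at the cost of some precision in how finely items can be separated.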
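As for the API parameter the transcript mentions, a minimal sketch of the request body for the embeddings endpoint might look like the following. The input string is a made-up example; the `dimensions` field is the new parameter that asks the API to return a shortened vector (here 256 instead of text-embedding-3-large's native 3,072).

```python
import json

# Hypothetical request payload for OpenAI's POST /v1/embeddings endpoint.
# Omitting "dimensions" returns the model's full-size embedding.
payload = {
    "model": "text-embedding-3-large",
    "input": "wolf and canine companions",
    "dimensions": 256,
}

print(json.dumps(payload, indent=2))
```

With the official `openai` Python client, the equivalent call is `client.embeddings.create(model="text-embedding-3-large", input=..., dimensions=256)`.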