
Discover the newly launched Mistral Small 3, a powerful 24-billion-parameter model known for its speed and efficiency. Licensed under Apache 2.0, it can be modified and deployed for commercial applications on hardware as modest as an RTX 4090 or a MacBook with 32 GB of RAM. Competitive with larger models like Llama 3.3 70B, Mistral Small 3 offers low latency, strong multilingual support, and solid performance across a range of tasks, including agent-centric applications. It's available on Hugging Face, Ollama, Kaggle, and other platforms, making it easy to access and customize.

00:00 Introducing Mistral Small 3: A New 24 Billion Parameter Model
00:18 Performance and Efficiency of Mistral Small 3
00:55 Accuracy and Benchmark Comparisons
01:31 Model Specifications and Features
03:00 Use Cases and Practical Applications
04:09 Accessing Mistral Small 3
04:26 Conclusion and Final Thoughts
---
type: transcript
date: 2025-01-30
youtube_id: VK3FB279kfs
---

# Transcript: Mistral Small 3 in 5 Minutes

We have a new model from Mistral. Mistral has just released Mistral Small 3, a new 24-billion-parameter model designed for both speed and efficiency. It is published under the Apache 2.0 license, meaning you can modify it and deploy it wherever you want, even for commercial applications. What's great about this model is that, once quantized, you can run it on an RTX 4090 or even a MacBook with 32 GB of RAM.

As they point out in the blog post, Mistral Small 3 is competitive with larger models such as Llama 3.3 70B and Qwen 2.5 32B, and it's an excellent open replacement for opaque proprietary models like GPT-4o mini. Mistral Small 3 is on par with Llama 3.3 70B Instruct while being more than three times faster on the same hardware. They mention that it's a great base model for building accrued reasoning capabilities, and they look forward to seeing how the open-source community adopts and customizes it.

Another thing to note: this model reaches 81% accuracy on MMLU at 150 tokens per second, which is quite fast. Obviously this will vary depending on your hardware setup, but to put that MMLU score into perspective, if I take a look at Artificial Analysis, it's right in line with Claude 3.5 Haiku and GPT-4o mini. And at 81%, we can see that Llama 3.1 70B, a model about three times the size, scores 84 on MMLU, so Mistral Small 3 is only about three percentage points behind while being roughly a third of the size.

In terms of the specifics they call out on the model card: low-latency function calling for agentic applications, so this could very well be a really good model for that type of work, and it also looks to have very strong support across a number of languages, French, English, German, and Spanish, to name a few. The key feature is that the model is agent-centric, offering best-in-class agentic capabilities with native
function calling as well as JSON output. In terms of the context window, it is 32,000 tokens. On HumanEval it scores 84.8; if we take a look at Artificial Analysis and look for the HumanEval score, that is right in line with Llama 3.3 70B.

They mention here that they gave evaluators a thousand proprietary coding questions and generalist prompts, and the evaluators were tasked with selecting their preferred model response from anonymized generations produced by Mistral Small 3 and another model. We can see that Mistral performs very well against most of the models here, though there is still some preference toward Llama 3.3 70B on some of the general prompts. We can see how it compares to Gemma 2, Qwen 2.5 32B, Llama 3.3 70B, and GPT-4o mini: it doesn't quite outperform Llama 3.3 70B or GPT-4o mini, but it is best in class for its size by a pretty wide margin, and it performs quite well even against models that are considerably bigger.

Another interesting thing they call out in the blog post is when to use Mistral Small 3. They mention fast, low-latency responses, like I said earlier. One of the big advantages of the model is that, since it wasn't trained with synthetic data or reinforcement learning, it serves as a strong base if you're looking to create your own specialized fine-tunes, especially in niche domains; maybe you're looking to fine-tune a model for legal tasks, or healthcare information, or whatever it might be.

Another benefit is that it is not locked away or accessible only through an API: you can download it, run it locally, and really own the data, without having to worry about sending it out to a different provider if that's a concern. This is one I've personally run into, where a lot of organizations don't even want to send out sensitive information, which could be non-material public information or what have you, where you might want
to have a powerful model but you actually don't want to send that data across a network request. If you're an employee at an organization, make the case to get a new MacBook with a ton of RAM, because you'll increasingly be able to run a lot of these very powerful large language models directly on your machine.

In terms of where you can access the model: you can get it on Hugging Face if you want to pull it down, and you can also access it on Ollama right now, as well as on Kaggle, Together AI, and Fireworks. They also mention that it is coming soon to NVIDIA NIM, Amazon SageMaker, Groq, Databricks, and Snowflake.

But otherwise, that's pretty much it for this video. I just wanted to do a quick one and show you this new release from Mistral. If you found this video useful, please like, comment, share, and subscribe. Until the next one!
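As a back-of-envelope check on the claim above that the quantized 24B model fits on an RTX 4090 or a 32 GB MacBook, here is a rough weight-memory estimate. This is a sketch only: real memory use also includes the KV cache and runtime overhead, so actual footprints are somewhat larger.

```python
# Rough weight-storage estimate for a 24B-parameter model at
# different precisions. Real usage is higher (KV cache, overhead).
PARAMS = 24e9  # 24 billion parameters

def weight_gb(bits_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return PARAMS * bits_per_param / 8 / 1e9

for name, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
    print(f"{name}: ~{weight_gb(bits):.0f} GB")
# fp16: ~48 GB, int8: ~24 GB, int4: ~12 GB
```

The 4-bit quantization (~12 GB of weights) is what brings the model within reach of a 24 GB RTX 4090 or a 32 GB MacBook, which matches the video's claim that it runs locally "once quantized."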
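The transcript highlights native function calling and JSON output as the model's agent-centric features. As an illustration only, here is what a tool definition for such a request might look like; the tool name is hypothetical, the model id is assumed, and the OpenAI-style schema shown is the common format accepted by many Mistral-compatible serving stacks, though the exact request shape depends on the stack you use.

```python
import json

# Hypothetical tool schema in the common OpenAI-style format.
# "get_weather" and its parameters are made up for illustration.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# A chat request body pairs the tool definitions with the messages.
# The model id "mistral-small" is an assumption, not a confirmed name.
request_body = {
    "model": "mistral-small",
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
    "tools": [get_weather_tool],
}
print(json.dumps(request_body, indent=2))
```

With native function calling, the model can respond with a structured call to `get_weather` (arguments as JSON) instead of free-form text, which is what makes it suitable for the agentic applications the video describes.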