
In this video, I review the latest model from Mistral, Mistral Large 2. I'll discuss its impressive 128,000-token context window, its training on dozens of languages and on more than 80 coding languages, and its performance benchmarks. Comparing it with models like Llama 3.1 and GPT-4o, Mistral Large 2 stands out in various coding tasks and in instruction following. I also cover its licensing options and its availability on the major hyperscalers.

Links:
https://chat.mistral.ai/
https://console.mistral.ai/

00:00 Introduction to Mistral Large 2
00:19 Model Specifications and Training
00:47 Performance Benchmarks
01:30 Usage and Licensing
01:48 Comparative Analysis
03:40 Access and Availability
03:56 Conclusion and Call to Action
---
type: transcript
date: 2024-07-24
youtube_id: 6MdxODJ3s3s
---

# Transcript: Mistral Large 2 in 4 Minutes

In this video I'm going to be going over Mistral Large 2, which is the latest model from Mistral, coming right on the heels of Llama 3.1 and the 405 billion parameter model that Meta put out yesterday. Mistral Large 2 is a really impressive model for a number of reasons. Its context window is 128,000 tokens, just like Llama 3.1, and similar to Llama 3.1 it's also trained on a number of specific languages; instead of the eight languages that I believe Llama 3.1 covers, this is trained on dozens of languages. In addition to that, it's specifically trained on more than 80 coding languages, including Python, Java, etc. That's where the performance of this model really shines.

If we go down to the performance of the model itself and look at some of the coding benchmarks, we can take a look at the HumanEval benchmark: it outperforms even Claude 3.5 Sonnet, which a ton of people love as a model for coding answers. And on MMLU it has a score of 84%.

Now, the big thing with this model is that it's only a 123 billion parameter model. That's still a really large model; you're not going to be able to run this on any consumer hardware, you will have to have specialized hardware for it. It's designed for single-node inference, and its size of 123 billion parameters allows you to run it with large throughput on a single node.

The one thing to note with this model is that it's released under their research license, which allows usage and modification for research and non-commercial purposes. If you do want to use the model for commercial purposes, you are able to self-deploy it, and you can get a license by contacting them.
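The video doesn't show any deployment code, but as a rough sketch of what single-node self-deployment could look like with vLLM, assuming the instruct weights are published on Hugging Face as mistralai/Mistral-Large-Instruct-2407 and that you have a multi-GPU node; the tensor-parallel size and prompt template below are illustrative assumptions, not details from the video:

```python
# pip install vllm  -- a sketch only; exact hardware requirements will vary.
from vllm import LLM, SamplingParams

# Shard the 123B model across the GPUs of a single node.
# 8 GPUs is an assumption for illustration, not a figure from the video.
llm = LLM(
    model="mistralai/Mistral-Large-Instruct-2407",
    tensor_parallel_size=8,
)

# Mistral instruct models use the [INST] ... [/INST] prompt format.
prompt = "[INST] Write a Python function that merges two sorted lists. [/INST]"

outputs = llm.generate([prompt], SamplingParams(temperature=0.2, max_tokens=256))
print(outputs[0].outputs[0].text)
```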
Now, if we go back to the benchmarks, this is going to be a model that's really strong at coding; you can see all of the different languages here. When you compare it to even GPT-4o or Llama 3.1, it outperforms even the 405 billion parameter model across a ton of different benchmarks. You can see that it basically outperforms Llama 3.1 405B across the board, with the exception of Bash. When you compare the model to GPT-4o, it comes within striking distance on most of these metrics, and on the Java score it does outperform it and is best in class for Java.

A few other things with the model: it has drastically improved performance on instruction following as well as conversation capabilities. Now, if we look at the comparison to some of the different models here, we see that Mistral Large 2 is right up there with some of these frontier models like Claude 3.5 Sonnet as well as GPT-4o. Like I mentioned, this model is trained on a diverse set of languages; you can see the ranking of the different languages here, and when you compare it to other models, again it is right up there in terms of performance.

Now, the one thing to know with this model is that it is quite a bit smaller than some of the models it's being compared to. While we don't know the exact parameter count of something like GPT-4, we do know that Llama 3.1 is a 405 billion parameter model, so to have this type of performance from a model that's about 30% the size is, it goes without saying, very impressive.

The big thing with this model is that it does outperform all of the frontier models on function calling and tool use, so that's going to be something that I think a lot of people will definitely find interesting, because not only does it have the general capabilities when you compare it to GPT-4o or Claude 3.5 Sonnet, we also see that its function calling performance is best in class.
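The video doesn't demonstrate it, but here is a minimal sketch of what function calling with Mistral Large 2 could look like through Mistral's API, assuming the `mistralai` Python client and the OpenAI-style tool schema that Mistral's chat API accepts; the `get_weather` tool is purely hypothetical:

```python
import json
import os

from mistralai import Mistral  # assumes the v1 `mistralai` Python client

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

# A hypothetical tool definition for illustration; the schema follows the
# OpenAI-style function-calling format that Mistral's API also uses.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # illustrative tool, not a real service
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.complete(
    model="mistral-large-latest",  # pointed at Mistral Large 2 at release
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    tool_choice="auto",
)

# If the model decided to call the tool, inspect the structured call.
call = response.choices[0].message.tool_calls[0]
print(call.function.name, json.loads(call.function.arguments))
```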
In terms of the places where you can access the model, you can access it on the platform that they have, which is available on their website; I'll put the link in the description of the video as well. The model is also going to be available on all of the major hyperscalers. I encourage you to try out the model and see how it performs; I'm going to be trying it in a couple of different applications that I have. Try out the model and let me know what you think. If you found this video useful, please like, comment, share, and subscribe. Otherwise, until the next one.