In the video, I provide a concise overview of Mistral AI's flagship model, 'Mistral Large,' highlighting its key functionalities, such as advanced function calling, multilingual capabilities, and an extended token context window. I delve into its top-tier MMLU ranking, its unique advantages compared to competing models, and its availability on the Azure platform. Additionally, I touch upon pricing, performance benchmarks, and introduce its counterpart, 'Mistral Small,' which shares some features with the larger model. The video concludes with a practical guide to getting started with the Mistral API and mentions complementary tools like LangChain, aiming to equip viewers with the essential information to leverage Mistral's AI capabilities effectively.

- 00:00 Introduction to Mistral Large
- 00:05 Availability and Ranking of Mistral Large
- 00:31 Capabilities and Features of Mistral Large
- 00:42 Function Calling Feature in Mistral Large
- 01:10 Partnership with Microsoft and Self-Deployment Option
- 01:28 Benchmark Metrics and Performance of Mistral Large
- 02:00 Introduction to Mistral Small
- 02:04 Function Calling and JSON Format in Mistral Small
- 02:31 Getting Started with Mistral API
- 04:31 Streaming and Other Implementations of Mistral
- 04:52 Pricing of Mistral Models
- 04:59 Conclusion and Final Thoughts
---
type: transcript
date: 2024-02-27
youtube_id: _NXPqpBLP60
---

# Transcript: Mistral Large: The Latest Flagship Model from Mistral AI

In this video I'm going to be showing you Mistral Large, the flagship model from Mistral AI that just came out. Mistral Large is now available on the Mistral AI API, and it is also available through Azure. This is a top-tier model: in terms of MMLU score, Mistral Large ranks between GPT-4 and Claude 2, as they mention in the blog post. It can be used for complex multilingual reasoning tasks, including text understanding, transformation, and code generation. In terms of pricing, this model is also 20% cheaper than GPT-4.

Some of its strengths and capabilities: it is natively fluent in English, French, Spanish, German, and Italian, with a nuanced understanding of grammar and cultural context, and it has a 32,000-token context window. Another feature that a lot of people are going to be excited about is its native capability for function calling. It allows developers to leverage these LLMs to parse requests and invoke functions. So if you're asking for something like "what is the weather," you could trigger a function that goes and retrieves that real-time weather information, or just about anything: it could be making a call to your SQL database, or a call to your proprietary API, that sort of thing. It's a pretty big unlock for developers building applications.

Mistral announced that they're partnering with Microsoft to provide their models through Azure, similar to what OpenAI allows developers to access as well. For very particular use cases you can contact them for further details on how you can self-deploy, but I'd imagine that option is only going to be available to enterprise customers.

In terms of some of the other benchmark metrics outside of MMLU: you see that GPT-4 does outperform by a margin of 5.4 points on MMLU, and by a similar margin on HellaSwag. With that being said, on the metrics where Mistral Large does outperform,
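The function-calling flow described above can be sketched locally. This is an illustrative stand-in, not Mistral's actual API surface: the tool definition follows the JSON-schema style these chat APIs generally accept, while `dispatchToolCall`, the `get_weather` handler, and the simulated tool call are hypothetical names invented for this sketch.

```typescript
// Hypothetical sketch of the function-calling flow: the model decides to
// invoke a tool, and application code dispatches to a local implementation.

// A tool definition in the JSON-schema style function-calling APIs accept.
const weatherTool = {
  type: "function",
  function: {
    name: "get_weather",
    description: "Get the current weather for a city",
    parameters: {
      type: "object",
      properties: { city: { type: "string" } },
      required: ["city"],
    },
  },
};

// The rough shape of a tool call a model returns when it wants a function run.
interface ToolCall {
  function: { name: string; arguments: string }; // arguments arrive as a JSON string
}

// Map tool names to local implementations (a real app might hit a weather
// API or a SQL database here, as mentioned in the video).
const handlers: Record<string, (args: any) => string> = {
  get_weather: (args) => `Sunny in ${args.city}`,
};

function dispatchToolCall(call: ToolCall): string {
  const handler = handlers[call.function.name];
  if (!handler) throw new Error(`Unknown tool: ${call.function.name}`);
  return handler(JSON.parse(call.function.arguments));
}

// Simulate the model asking for the weather:
const result = dispatchToolCall({
  function: { name: "get_weather", arguments: '{"city":"Paris"}' },
});
console.log(result); // "Sunny in Paris"
```

In a real application, `weatherTool` would be passed to the chat endpoint alongside the messages, and the handler's return value would be sent back to the model as a tool result.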
GPT-4 doesn't have a stated benchmark. Mistral Large is a step above the Mixtral model that they released, as well as Llama 2 70B. Alongside Mistral Large they're also releasing Mistral Small, which outperforms their Mixtral model with lower latency. What's nice with the Mistral Small model is that it also has the function calling feature, so now we have another option when we're looking to leverage function calling within our LLM applications. There's tooling out there with things like LangChain or LlamaIndex where you can use output parsers to take the natural-language response from the LLM and hopefully parse it into a usable format, like JSON, for your application, so it goes without saying that it's really nice to have this feature built right into their inference API.

To get started with Mistral you can go ahead and make an account; there is a very straightforward setup process, though you will have to set up your billing information to be able to interact with their API. Going through their platform: you can have up to 10 API keys, and there is also a usage dashboard. If you've used their API in various months, you'll be able to see it plotted there; I used this briefly when I was experimenting with it last month, and you'll be able to see all the different models and your usage across them. In terms of rate limits, you're going to get 5 requests per second, 2 million tokens per minute, and 10 billion tokens per month. If you want to increase these limits, don't hesitate to reach out to them.

If you click Docs and go over to the client code, they have some examples of how you can get started with Python, JavaScript, or a curl request. I'll just show you how to get started with JavaScript. If I take this code here, you can simply initialize a project (something like `bun init -y`) if you've got your runtime installed. Now if we head back
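The output-parsing step mentioned above can be sketched in a few lines. This is a minimal illustration of what output parsers in tools like LangChain or LlamaIndex do, not an actual API from either library: `parseJsonOutput` is a hypothetical helper that tolerates the markdown code fence a model often wraps around JSON before parsing it.

```typescript
// Minimal sketch of an LLM output parser: take raw model text that may wrap
// JSON in a ```json ... ``` fence and coerce it into a typed object.
// `parseJsonOutput` is illustrative, not a LangChain/LlamaIndex API.
function parseJsonOutput<T>(raw: string): T {
  // Strip a markdown code fence if the model added one.
  const fenced = raw.match(/```(?:json)?\s*([\s\S]*?)```/);
  const body = fenced?.[1] ?? raw;
  return JSON.parse(body.trim()) as T;
}

const reply = '```json\n{"city": "Paris", "unit": "celsius"}\n```';
const parsed = parseJsonOutput<{ city: string; unit: string }>(reply);
console.log(parsed.city); // "Paris"
```

With native function calling, the model returns structured arguments directly, which is exactly why having it built into the inference API is a nice upgrade over fragile text parsing.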
to their documentation: what you will need to install is their package. So if we just copy this and install it, we wait a moment for that to install and see that it's there. Then we go into our `index.ts`, copy the example code, and paste it in. Now we're going to have to create a `.env` file; within our `.env` we copy the `MISTRAL_API_KEY` variable name just like this, put an equals sign, then go over to the platform and generate a new API key. Once we have our API key all set up, I'm just going to `bun install dotenv` (similarly, with npm you can just `npm install dotenv`). Within our code we just import dotenv and call `dotenv.config()`. Save that out, and then all you have to do is run `bun index.ts`, and we should see that we get the message back from the inference API. So there we have it: a simple example of how to get started with their API.

With that being said, if you head over to their API documentation, you also have the ability to stream. If you're going to be building something like a chat application, you can use that stream command; just make sure that you set streaming to true. There is also a Mistral implementation within LangChain that you can experiment with or drop into your LangChain application. Another option you can explore is the Vercel AI SDK, which gives you a simple interface to play around with their model.

In terms of pricing, it is $8 per million tokens for their large model and $2 per million tokens for their small model. That's it for this video. If you found it useful, please like, comment, share, and subscribe, and otherwise, until the next one.
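The streaming pattern mentioned above boils down to consuming an async iterable with `for await`. In this sketch, `fakeStream` is a stand-in for the async iterable a streaming chat endpoint returns, and the chunk shape (`choices[0].delta.content`) is illustrative rather than Mistral's exact wire format:

```typescript
// Sketch of consuming a streamed chat response. `fakeStream` simulates the
// async iterable a streaming endpoint yields; each chunk carries a small
// piece of the reply that a chat UI would render as it arrives.
async function* fakeStream() {
  for (const piece of ["Hello", ", ", "world"]) {
    yield { choices: [{ delta: { content: piece } }] };
  }
}

async function collect(
  stream: AsyncIterable<{ choices: { delta: { content: string } }[] }>
) {
  let text = "";
  for await (const chunk of stream) {
    text += chunk.choices[0].delta.content; // append each piece as it arrives
  }
  return text;
}

collect(fakeStream()).then((t) => console.log(t)); // "Hello, world"
```

In a real chat application you would render each delta immediately instead of accumulating the whole string, which is what makes streaming feel responsive.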