
Learn The Fundamentals Of Becoming An AI Engineer On Scrimba: https://v2.scrimba.com/the-ai-engineer-path-c02v?via=developersdigest

Links:
- https://artificialanalysis.ai/models/llama-3-3-instruct-70b
- https://ollama.com/blog/structured-outputs
- https://console.groq.com/playground?model=llama-3.3-70b-specdec
- https://github.com/meta-llama/llama-models/blob/main/models/llama3_3/MODEL_CARD.md
- https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct
- https://x.com/Ahmad_Al_Dahle/status/1865071436630778109
- https://x.com/rowancheung/status/1865107802508730853
- https://x.com/AIatMeta/status/1865079068833780155
- https://x.com/ArtificialAnlys/status/1865130350923649534

Meta's New Llama 3.3: A Game-Changer in AI with 70 Billion Parameters

In this video, I cover Meta's surprise announcement of Llama 3.3, a new AI model with 70 billion parameters. This model offers performance similar to Llama 3.1 405B but is more cost-effective and easier to run. I compare Llama 3.3 to other frontier models like GPT-4o and discuss its standout performance in math and instruction-following benchmarks. I also cover how to test this model on platforms like Groq and Hugging Face, and delve into its training specifications and independent evaluations. Join me as I explore the capabilities and benefits of this new AI model.

Chapters:
- 00:00 Surprise Announcement from Meta
- 00:02 Llama 3 Model Overview
- 00:34 Benchmark Comparisons
- 00:44 Cost Efficiency and Availability
- 01:06 Installation and Hardware Requirements
- 01:43 Model Performance and Evaluations
- 02:15 Independent Evaluations and Analysis
- 03:04 Personal Testing and Observations
- 04:00 Providers and Hosting Options
- 04:46 Conclusion and Final Thoughts
---
type: transcript
date: 2024-12-07
youtube_id: -dnGa6Oms5I
---

# Transcript: Llama 3.3 70B in 5 Minutes

A surprise announcement from Meta today: Llama 3.3, a new 70 billion parameter model that delivers the performance of their 405B model but is easier and more cost-efficient to run. If we compare this model to some of the other frontier models out there, whether it's GPT-4o, Gemini, or Llama 405B, and look at MMLU, it's comparable to what we had with the Llama 3.1 70 billion parameter model, and very close to the models we see from Google as well as OpenAI. On some of the other benchmarks, like instruction following and long context, this model is at the frontier, and it outperforms GPT-4o on math as well.

The big thing here is that this model is about 25 times cheaper than GPT-4o. GPT-4o is $2.50 per million tokens of input and $10 per million tokens of output, whereas this model is 10 cents per million tokens of input and 40 cents per million tokens of output.

Right off the bat, if you're interested in testing out this model, it is available on Groq right now, and they already have it integrated with the speculative decoding they recently added. You can also install it from Ollama, though mind you, a 70 billion parameter model generally isn't going to run on a typical laptop, so you will need some specialized hardware. It is text-only for now, and it's available for download at llama.com. You can also find it on Hugging Face. I'll put links to all of the tweets, as well as everything I'm showing you, in the description of the video.

Meta mentioned that the improvements in Llama 3.3 were driven by a new alignment process and progress in online RL techniques. The model delivers similar performance to Llama 3.1 405B with cost-effective inference that's feasible to run locally on a common
developer workstation.

In terms of the model card, the context length is still 128,000 tokens, the model was trained on 15 trillion tokens, and the knowledge cutoff is December 2023. Artificial Analysis also performed their first round of independent evaluations on Llama 3.3, and they're seeing a jump in the Artificial Analysis Quality Index from 68 to 74, so based on independent evaluations, what Meta is claiming does look to be true. There's a really great chart from Artificial Analysis that visualizes the jump: before today's announcement, Llama 3.1 70B sat at 68 on their Quality Index, and now it's at 74. That puts it right up there with Mistral Large and, as I mentioned, Llama 3.1 405B, and it also outperforms GPT-4o (the version that just came out recently), as well as a number of other models.

Finally, I did test this myself on a tool I've been working on, an artifacts tool, and what I found is that for a 70 billion parameter model, it performed really well. It didn't perform quite as well as something like Claude 3.5 Sonnet for code generation, but in terms of following directions, not to mention cost and speed, it's really compelling to use, as you can see in this example. For code generation it does quite well: within a number of seconds you can see I get to a working calculator application, after a few iterations to ultimately get there. Generally speaking, I found it follows directions really well, and the code it generates is coherent. I'd really encourage you to try out this model; there are a number of different options for trying it, and a number of providers hosting it. At the time of recording, the providers hosting the model include DeepInfra, Hyperbolic, Groq, Fireworks, and Together AI. You
can see their respective output speeds as well as prices, all listed out there.

Finally, if you're interested in comparing this model to others in other aspects, I'll leave a link in the description of the video. Artificial Analysis is a tool I absolutely love; it's filled with data. You can check out a number of different benchmarks, whether for providers or for the models themselves, and they run evaluations on who's hosting the models as well. There's just a ton of really great information in there that you'd probably find helpful if you're interested in this sort of thing.

But otherwise, that's pretty much it for this video. Kudos to the team over at Meta for this release; I look forward to playing around with it a little bit more. If you found this video useful, please like, comment, share, and subscribe. Until the next one!
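The pricing comparison and the Groq availability mentioned above can be sketched in Python. This is a minimal sketch, not a definitive implementation: the per-million-token prices are the ones quoted in the video, Groq exposes an OpenAI-compatible chat completions endpoint, and the model id `llama-3.3-70b-versatile` is an assumption you should verify against Groq's current model list.

```python
import json
import os
import urllib.request

# Per-million-token prices (USD) as quoted in the video.
PRICES = {
    "gpt-4o":        {"in": 2.50, "out": 10.00},
    "llama-3.3-70b": {"in": 0.10, "out": 0.40},
}

def cost(model: str, tokens_in: int, tokens_out: int) -> float:
    """Estimated cost in USD for a single request."""
    p = PRICES[model]
    return (tokens_in * p["in"] + tokens_out * p["out"]) / 1_000_000

# Input tokens: 2.50 / 0.10 = 25x cheaper; output tokens: 10.00 / 0.40 = 25x.

# Groq's OpenAI-compatible endpoint; the model id below is an assumption,
# check Groq's model list for the current name.
GROQ_URL = "https://api.groq.com/openai/v1/chat/completions"

def build_request(prompt: str) -> dict:
    """Build a chat-completions payload for Llama 3.3 on Groq."""
    return {
        "model": "llama-3.3-70b-versatile",
        "messages": [{"role": "user", "content": prompt}],
    }

# Only send the request if a key is configured (GROQ_API_KEY is hypothetical
# here; use whatever env var you store your Groq key in).
api_key = os.environ.get("GROQ_API_KEY")
if api_key:
    req = urllib.request.Request(
        GROQ_URL,
        data=json.dumps(build_request("Say hello in five words.")).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

At these prices, a 2,000-token prompt with a 500-token completion costs $0.0004 on Llama 3.3 versus $0.01 on GPT-4o, which is where the roughly 25x figure comes from.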