
Repo: ⭐ https://github.com/mendableai/firesearch

Introducing FireSearch: The Open-Source Deep Research Template Built with Next.js, Firecrawl, and LangGraph

In this video, the creator introduces FireSearch, a deep research tool built with Next.js, Firecrawl, and LangGraph. The video demonstrates FireSearch by running queries to compare the latest iPhone 16 against the Samsung Galaxy S25, and Claude 4 against OpenAI's best model. The tool uses an LLM to analyze queries, fetches results from the Firecrawl search API, and summarizes the information. The creator also explains query breakdown, search strategies, and how to use the configuration file to customize search parameters. Detailed instructions for getting started with FireSearch, such as setting up API keys and running the tool, are provided. The video concludes with a call to action for feedback, suggestions, and support.

API Keys:
- https://www.firecrawl.dev/app/api-keys
- https://platform.openai.com/api-keys

Links:
- https://www.firecrawl.dev
- https://www.langchain.com/langgraph

Timestamps
- 00:00 Introduction to FireSearch
- 00:10 Demonstration of FireSearch
- 00:45 Gathering and Analyzing Data
- 01:00 Summarizing and Validating Results
- 01:57 Inspiration and Architecture
- 02:17 Breaking Down the Query
- 04:21 Handling Missing Information
- 06:00 Getting Started with FireSearch
- 07:01 Configuration and Customization
- 07:36 Conclusion and Call to Action
---
type: transcript
date: 2025-06-03
youtube_id: wX_5Y8aAN84
---

# Transcript: FireSearch: An Open-Source Deep Research Template Built with Next.js, Firecrawl and LangGraph

In this video, I'm super excited to open source FireSearch, the first deep research tool that I built in Next.js with Firecrawl as well as LangGraph. Just to demonstrate how this works, I'm going to send in a query: compare the latest iPhone 16 and Samsung Galaxy S25, and then compare Claude 4 to OpenAI's best model. When I send that in, the first thing that happens is an LLM analyzes the query I sent in. We see: it looks like you're interested in comparing the latest iPhone 16 with the Galaxy S25, as well as comparing Claude 4 to OpenAI's best model. From here, it's going to come up with a few different search terms. You can think of these search terms as things you would type into Google. Once we have the results for each search query from the Firecrawl search API, we get not only the metadata from the pages but also the markdown or HTML contents of the pages themselves. Once we have that payload of pages, we use an LLM to summarize them and analyze whether they contain any information related to the query we're asking. With all of that information gathered, we assess, across all of those sources, whether we have enough information to answer everything that was asked in the input box. So, here is what the result looks like. We have the comparison of the iPhone 16 and the Samsung Galaxy S25, and we can also see the inline references within the markdown renderer, showing where all of that information was gathered from.
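The query-analysis step described above can be sketched roughly like this. The prompt wording and the `buildDecompositionPrompt`/`parseSubQueries` helpers are illustrative assumptions, not FireSearch's actual code:

```typescript
// Sketch: ask an LLM to break a research question into Google-style search
// terms, then parse its JSON reply. Names and prompt text are hypothetical.

interface SubQueryPlan {
  subQueries: string[];
}

// Build the decomposition prompt that would be sent to the LLM.
function buildDecompositionPrompt(userQuery: string, maxQueries: number): string {
  return [
    `Break the research question below into at most ${maxQueries} standalone`,
    `web search queries, phrased the way you would type them into Google.`,
    `Return JSON: {"subQueries": ["...", "..."]}.`,
    ``,
    `Question: ${userQuery}`,
  ].join("\n");
}

// Parse the LLM's JSON reply defensively, capping the sub-query count.
function parseSubQueries(llmReply: string, maxQueries: number): string[] {
  const plan = JSON.parse(llmReply) as SubQueryPlan;
  return plan.subQueries.slice(0, maxQueries);
}

// Example reply for the query from the video:
const reply =
  '{"subQueries": ["iPhone 16 Pro specs", "Samsung Galaxy S25 Ultra specs", "iPhone 16 vs Galaxy S25 comparison"]}';
console.log(parseSubQueries(reply, 3));
```

In the real tool the reply comes from OpenAI; here it is hard-coded so the parsing step is visible on its own.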
And then finally, at the bottom, we have Perplexity-style follow-up questions, where we can one-click ask different questions related to the answers that were generated. In the sidebar, we'll have all of the different websites and the context used to generate the answer, along with small details like the number of characters on each page. Overall, the implementation I developed for this was inspired by Grok's Deep Research. While there are some notable differences, that was the starting point: when I saw their deep research, I was inspired to build something not identical, but similar in a lot of ways. Now I want to switch to a different visual and show you the architecture of how all of this works. When we send in a query like "compare the Samsung Galaxy S25 and the iPhone 16," we break it into a number of sub-queries. I also have a configuration file, which I'll show near the end of the video, in case you want to change the number of search queries, the number of web pages per search query, and the other variables you might want to play around with. As a first step, we basically ask: based on this query, how many sub-questions do we need? In this example, I broke it into three search queries: the iPhone 16 Pro specs, the Samsung Galaxy S25 Ultra specs, and a query specifically to compare the two, to see if there are already reviews or web pages out there that do exactly what we're looking for. For each search query, we send a separate request to Firecrawl's search API. In this example, we have three queries, and for each of them we get back the top search results.
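The fan-out step can be sketched as building one search request per sub-query. The payload roughly follows the shape of Firecrawl's search request (a query, a result limit, and scrape options asking for markdown), but treat the exact field names as an assumption and check Firecrawl's API docs:

```typescript
// Sketch: turn each sub-query into one Firecrawl search request, asking for
// page markdown alongside the metadata. resultsPerQuery mirrors the
// "sources per search" knob in the configuration file.

interface SearchRequest {
  query: string;
  limit: number;
  scrapeOptions: { formats: string[] };
}

function buildSearchRequests(
  subQueries: string[],
  resultsPerQuery: number
): SearchRequest[] {
  return subQueries.map((query) => ({
    query,
    limit: resultsPerQuery,
    scrapeOptions: { formats: ["markdown"] }, // fetch content, not just metadata
  }));
}

console.log(
  buildSearchRequests(
    ["iPhone 16 Pro specs", "Samsung Galaxy S25 Ultra specs"],
    6
  )
);
```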
For each search query, we get a number of web pages. Say for the iPhone 16 Pro specs, that search query might return Apple, The Verge, and CNET, whereas the Samsung Galaxy query might return GSMArena, TechRadar, and the Samsung website. Then, based on the contents of all these pages, we perform search and summarization to determine whether they contain the information we were looking for in our initial input query. How this works is with a confidence score, determined by an LLM: based on the contents of each page, we send the query and the page to an LLM to summarize what the page actually contains and assess whether it meets the criteria. That score is compared against a threshold you can dial up or down in the configuration file, for example if validation is a little too eager, or not eager enough, in accepting answers. From there, we determine whether we need more information. Say in this example we didn't actually get the price of the Samsung S25. We have a bit of agentic behavior here: based on what we've searched for and the results we got, do we have all of the information? The answer-validation step is just making sure that every part of the question is actually answered. Let's say that, for whatever reason, the search didn't return the pricing info for the Samsung S25. We then take an alternative strategy for that search query: the original query was "Galaxy S25 price," and no specific pricing was found with it.
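The answer-validation step above can be sketched as a threshold check over LLM-assigned confidence scores. The names and the 0-to-1 score range here are illustrative assumptions:

```typescript
// Sketch: each part of the user's question gets a confidence score from an
// LLM; anything below the configurable threshold is sent back for another
// round of searching with an alternative strategy.

interface ScoredAnswer {
  questionPart: string; // e.g. "Galaxy S25 price"
  confidence: number;   // 0..1, assigned by the LLM from page contents
}

// Return the question parts that still need more information.
function findGaps(answers: ScoredAnswer[], threshold: number): string[] {
  return answers
    .filter((a) => a.confidence < threshold)
    .map((a) => a.questionPart);
}

const scored: ScoredAnswer[] = [
  { questionPart: "iPhone 16 specs", confidence: 0.92 },
  { questionPart: "Galaxy S25 price", confidence: 0.35 },
];
// Only the low-confidence price part triggers a retry.
console.log(findGaps(scored, 0.7));
```

Raising the threshold makes validation stricter (more retries); lowering it makes the tool more eager to accept an answer.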
Instead, with our alternative strategy, we might try looking up the MSRP or retail price, checking for a pricing leak (say, if we're looking up a phone that isn't out yet), or comparing prices between the previous year and the current year, hopefully yielding results we didn't get from the first queries we sent in. Again, we send those API calls to Firecrawl's search endpoint, and based on the returned web pages, we determine whether or not we have that pricing information. Once all the answers are found, we synthesize the responses from the web pages, generate the follow-up questions, and list all of the citations. Depending on the query and how the configuration is set, this could potentially run for multiple minutes. If you put in something very difficult or obscure that isn't public on the internet, it will loop through, try its best to find an answer, and ultimately give you one. To get started, head over to the FireSearch repo, which I'll put in the description of the video and as a pinned comment. While you're on the page, I'd love a star if you like these types of examples. All we need to get started are two API keys: a Firecrawl API key, which comes with 500 free credits when you sign up, and an OpenAI key, since we're leveraging OpenAI in this example. From there, once you've cloned the repository, create a .env.local file (I'll also have an example in the repository), add your Firecrawl API key and your OpenAI API key, and everything should be good to go.
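A .env.local file for the setup above might look like the following. The variable names are the common convention for these providers, but check the example env file in the repo for the exact names it expects:

```
# .env.local -- never commit this file
FIRECRAWL_API_KEY=fc-your-key-here
OPENAI_API_KEY=sk-your-key-here
```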
And then from there, you can npm install everything, npm run dev, and ultimately play around with everything I showed you here. If you have ideas for what you'd like to see in this example, or ideas for how to improve it, let me know in the comments of this video or open an issue on GitHub. There is more information in the readme. One key thing I do want to point out: if you want to change the configuration, look for the lib/config.ts file. This is where we determine how many search queries to break the initial input into, how many sources we want per search, whether to skip content below a certain minimum length, how many retries to attempt, as well as the confidence threshold and the scrape timeout. I'd encourage you to try out a number of different queries, but really play around with the configuration to see what works best. But otherwise, that's pretty much it for this video. If you found it useful, please comment, share, and subscribe. And if you don't mind, hammer that star button on the repo if you like these types of open-source examples. Additionally, if you have any questions or ideas for projects you'd like to see, let me know in the comments of the video. Otherwise, that's it for this one, and I'll catch you in the next video.
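The lib/config.ts knobs described in the video might look roughly like this. These field names and values are hypothetical placeholders; the real names and defaults live in lib/config.ts in the FireSearch repo:

```typescript
// Hypothetical sketch of the configuration knobs mentioned in the video.

const SEARCH_CONFIG = {
  maxSearchQueries: 3,      // sub-queries derived from the initial input
  sourcesPerSearch: 6,      // web pages kept per Firecrawl search
  minContentLength: 200,    // skip pages with less content than this
  maxRetries: 2,            // alternative-strategy rounds per unanswered part
  confidenceThreshold: 0.7, // LLM score needed to mark a part answered
  scrapeTimeoutMs: 15000,   // per-page scrape timeout
};

console.log(SEARCH_CONFIG);
```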