
Building an Efficient LLM Chat Application with Go and HTMX

In this video, I will guide you through building a large language model (LLM) chat application using Go and HTMX. Inspired by fellow YouTuber Web Dev Cody's frustrations with TypeScript and Next.js, I'm exploring how these alternatives can offer more resource-efficient solutions. I'll compare memory usage between a Next.js and a Go-based chat application, provide setup instructions, and dive into the code step by step, including setting up WebSockets, integrating OpenAI, and deploying the app with Railway. Follow along and see the benefits and trade-offs of using Go and HTMX for your projects.

Repo: https://github.com/developersdigest/llm-golang-htmx-chat
Railway Referral Link: https://railway.app?referralCode=P2pUW5

00:00 Introduction to Building an LLM Chat Application
00:18 Why Go and HTMX?
02:43 Setting Up the Project
03:52 Creating the HTML Structure
07:51 Implementing WebSocket and JavaScript Logic
09:56 Building the Go Backend
16:00 Deploying to Railway
17:11 Conclusion and Final Thoughts
---
type: transcript
date: 2024-07-31
youtube_id: bjlVqw7ALls
---

# Transcript: Go + HTMX + OpenAI: Create a Lightweight AI Chat Application

all right, in this video I'm going to show you how to build an LLM chat application using Go and HTMX. A lot of the content on my channel has been primarily built with JavaScript and TypeScript; I'll use something like Next.js, or a React front end, to build out these applications, but I thought it'd be interesting to look at Go and HTMX. What spurred this idea was Web Dev Cody, another YouTuber, who put out a video about his frustrations with TypeScript, Next.js, and the TypeScript ecosystem as a whole. The main theme of that video is how a lot of his relatively simple applications take up a ton of memory for what should be lightweight workloads that shouldn't idly consume as much as they do. The thing that stood out to me, as soon as I deployed this application, is how little memory it uses. This is running a WebSocket server, serving the HTML, doing everything you saw in the initial demonstration of the application. What I wanted to do was pull an off-the-shelf Next.js chat application and see how many resources it consumed on the server. That Next.js project isn't even doing anything right now: there's no network traffic, as you can see, and it's just sitting at about 100 megabytes. The interesting comparison is that the idle memory for my project is 11 megabytes, whereas the Next.js application is at about 100 megabytes. Now mind you, as you start to use the applications these numbers will obviously grow from here, but right off the bat you get a bit more bang for your buck building with something a little simpler. I'd encourage you: once in a
while, try out something new and just see what you like about it and what you don't. Now, there are a number of areas where I definitely won't be able to move as fast in Go and HTMX as I can with something like Next.js, and that's a trade-off you have to make. If you want to iterate and build something really quickly, Next.js and the TypeScript environment are at the top of the game; you can iterate and build things very fast in that ecosystem. But if you want to build efficient projects, you can definitely start from a good place with a Go backend, at least in my opinion. Now, if you have any ideas on combinations of tech you'd like to see, whether it's Rust, or even building something with C, or what have you, let me know in the comments below. I'm up for trying out these different kinds of projects once in a while, seeing the pros and cons, and working through them in a little tutorial. So without further ado: our Go project has a really simple directory. All we really have in here is a README, which will have instructions, in a GitHub repository that I'll link in the description of the video if you're interested; our main.go; our go.mod; and our go.sum, which I'll just delete here and show you how this all works. The go.mod holds our dependencies, and all we're using is Fiber, a Go web framework, plus Fiber's WebSocket package. Then we have a really simple Docker configuration, which is how we'll ultimately deploy to Railway like you saw in the demonstration, and finally a simple HTML file as well. We really have four or five files here, but the main ones are main.go and index.html; you don't need to concern yourself too much with the other ones. To
get started with installing our dependencies, we can just run `go mod tidy`, which pulls in all of our indirect dependencies; you can think of the go.mod file as something akin to your package.json. Once we have that, we'll clear out the terminal, make everything a little bigger, and first I'll show you the index.html. It's relatively straightforward: we have the usual HTML structure, head and body, plus a script tag that we'll use a little further down. The first thing we import is HTMX, which we'll leverage for both our AJAX requests and our WebSocket functionality. The way the application works is that we make a request to our server, and it ultimately streams back all of the tokens we get from OpenAI, in this case, to the front end of our application. HTMX also has a specific WebSocket extension that you can include in your project from a CDN, just like that. The next thing we use is a library to render the chat we get back from the LLM within a markdown container; that's what this marked.js library is for. From there, we load the entire Tailwind CDN. There are definitely a number of ways to optimize this further in your Go application, but for example's sake we'll keep it simple and just load this script with all of the styles we'll ultimately be using in our HTML. Then we include highlight.js, a code syntax highlighting library, which you can integrate further into your application if you'd like. Once we have that, we declare a container in here; this is how we
set up the WebSocket connection. The important piece is that when you're building this locally, you'll have localhost:8000 here; just make sure you swap it out once you deploy to Railway, pointing it to the domain where you'll be hosting. That's just a generated domain from Railway, and I'll show you where to grab it a little later in the video. After that we have our container, which is where we render all of the messages like you saw on screen, and then a simple form. When you build an application with HTMX, everything works through HTML attributes; in this case we're leveraging the WebSocket extension, so you'll see attributes like `ws-connect` and `ws-send`, along with `hx-ext="ws"`. Those attributes bind to the logic loaded in the head of our document from the HTMX CDNs. Once we have that, we add a little bit of JavaScript. The reason is that the backend sends us a stream of messages, so we have to parse and append all of those messages on the front end. I'll go through it line by line so you can see and understand what's happening. First, we set up and declare some variables to track the current message and, essentially, its state within our application. Once we have that, we configure our marked library; this is used for the message coming in from the server, which is coming from the LLM. All of those tokens get placed within a markdown container, which is how it renders with that nice syntax highlighting, and you can
definitely make it prettier from here; this is just a really simple implementation, but there are a ton of benefits to using a library like this in your application. ChatGPT uses something like it, as does Claude; all of the LLM providers use some form of markdown to render the code blocks and the other nice little pieces of interface as messages get streamed back. Once we have that, we add a simple function that parses all of the markdown content, which is how we render it to the screen. Next, we set up an event handler. HTMX has a number of event handlers you can leverage in your application; what we do here is wait for a message to be received from the server. Once a message comes through, we grab the ID of the container where all the messages get appended, then parse the details from the message as they come through. At the start, we begin a new message; this is where we render "AI" as the one responding to you. You could swap that out for an icon or a different name if you like, and you can also change this prefix if you do the same on the backend. We append that to the container, and then we strip that prefix from the response coming back from the backend; its presence is what determines whether this is the start of a new message. From that point, we append all of the messages we get back as the tokens come in from OpenAI, through the WebSocket, into the front end of our application, ultimately rendered within the markdown of the response container we set up. Then, as the tokens come in, we're going to be
rendering that markdown container of the response on screen. The last thing we add within our HTML file, inside this script, is a form submission handler. This isn't to be confused with the HTMX attributes we already set up to bind to the HTMX logic for talking to our backend; this handler is how we append the user's current message above the message we ultimately get back from the LLM and render to the content. At the end, we reset the state we set up here so we know the next message is a new one: we want every token of the current stream rendered within the same markdown container, and this is how we reset things each time a new message is sent. From there, we set up our main.go file. It's pretty straightforward: we use a number of standard libraries, plus the Go Fiber framework and the WebSocket extension for Fiber. The first thing we do in the file is declare a constant: instead of leveraging a package to interact with the OpenAI API, we make our requests directly to their chat completions endpoint. Next, we set up a global variable to store our OpenAI API key, which you'll need for this project to work; you can just go over to OpenAI and grab an API key. Alternatively, if you want to use something like Groq, you can swap in their chat completions endpoint, or the URL of any other OpenAI-compatible LLM provider, along with the API key and ultimately the model once we get to that portion as
well. From there, we declare a number of structs. If you're coming from TypeScript, you can think of a struct as roughly akin to an interface in TypeScript; they're very similar, although the syntax and the way you declare them obviously differ. This is just to give you an idea of the variables we'll use within our application and their types: the role, i.e. whether it's an AI message or a human message; the content of the message; the model name we'll send in our requests; all of the messages; and the fact that we want streaming enabled, and so on. So you get the idea of the structs we're declaring here. Once we have that, we set up a variable that maps all of the current WebSocket connections. One thing to be mindful of as your app grows is to make sure you provision adequate resources. Essentially, we declare this `clients` variable, which maps through those WebSocket connections so our server has the context of who is actually connected to the WebSocket server. After that we have our main function, where we get the environment variable for our OpenAI key; there are different spots where you can plug this in, whether you're running locally or ultimately deploying to Railway. Next, we set up the different pieces of Fiber. If you're coming from TypeScript or Node.js, you can think of Fiber as similar to something like Express.
Fiber lets you serve up your static assets, say images, CSS files, or HTML files, by pointing it at a particular directory, and setting up routes is super straightforward: we define the paths we want, with a handler for each respective route, so the root path gets the home handler and the WebSocket path gets the handler for all of the WebSocket logic. From there, we set up a slightly dynamic configuration for our port: in a hosted environment we read the port from an environment variable, and otherwise we default to 8000 when running locally. To start the server, all we need is `app.Listen`, and then we log to the terminal that the app has started. Next come the handlers. The first one sends our HTML file to the front end: if someone hits the home path, we just serve up that HTML file, and if they connect to our WebSocket path, we set up the WebSocket connection. This is how we manage all of the WebSocket connections: when someone new comes to our application, this function loops through the incoming connections and maps each one to a client, so if it doesn't already exist we add it to that hashmap of users, registering the new WebSocket connection. Then we set up what's known as a goroutine, which is how we're able to serve many WebSocket connections across all of the users interacting with our application.
We handle the concurrency of all of the incoming requests by streaming the responses back inside this goroutine. That dovetails into our next function, which is where we actually stream the response back and where we declare all of the logic for interacting with OpenAI. If you have a system message, you can include it here, and any other facet of interacting with the LLM, function calling or what have you, can be scaffolded out within this function as well. Once you've declared how you'll interact with OpenAI, whether you swap out the model, add function calling, or anything else, we take what we've declared and convert the request into JSON, which is ultimately what we pass to the OpenAI endpoint. Here we send our request to OpenAI; if there's an error in the response, we log it to the terminal, and otherwise we continue on, ensuring the response body is closed when the function returns. From there, we read the stream: we loop through it, parse the responses as they come through, and send them on to the front end. This is how we send the messages: if it's the first token of a message, we prefix it with "AI", and then we send the content of each token we receive from the OpenAI endpoint to the front end. That's how we get the streaming effect, and that's pretty much it for our application.
Getting set up on Railway is really straightforward, and I'll link the documentation in the description of the video. Depending on whether you're on macOS or Windows, there are different commands to install Railway; on Mac, for instance, you can `brew install railway`. Once it's installed, run `railway login` to authenticate. To set up a new application, run `railway init`, name your project, and then `railway up` will deploy it to Railway for you. In the terminal you'll hopefully see all the success messages as it compresses the project and sends it to Railway. Once it's set up, you can just open your application, and it should look something like this. You can put in your OpenAI variable under Variables: add `OPENAI_API_KEY` and paste in your key there. Then, to get a generated domain, go over to Settings; there will be a button you can click to generate one. And there you have it: a Go + HTMX AI application. That's it for this video. I just wanted to try something new and show you a different approach to building a potentially really efficient little AI application. If you found this video useful, please like, comment, share, and subscribe; otherwise, until the next one.