
Setting up and Accessing Your Ollama Inference Server Locally and Globally

This video tutorial presents a detailed, step-by-step guide to setting up and accessing an Ollama inference server, with models such as Mistral, Mixtral, and Llama 2, from anywhere in the world. It builds a Next.js application using LangChain Expression Language (the JavaScript version), the Vercel AI SDK, and ngrok, with deployment on Vercel. The video explains how to install Ollama, work with local models, set up ngrok, and handle POST requests. It also shows how to format messages, set up a simple template, create a stream, and manage streaming responses. The guide further demonstrates how to build client components with Phosphor Icons and the Vercel AI library, set up chat components, handle URL changes, and more. It culminates in setting up a basic Express server, deploying the app to Vercel, and running it, with instructions to deploy and interact with the user interface and access the Ollama instance from anywhere.

Repo: https://github.com/developersdigest/ollama-anywhere

Links:
- https://ollama.com/
- https://ollama.com/library/mistral
- https://ngrok.com/
---
type: transcript
date: 2024-02-14
youtube_id: FsZfyqIIjtc
---

# Transcript: Access Your Local Ollama LLMs Anywhere

In this video I'm going to show you how you can set up and access your Ollama inference server from anywhere in the world. I'll walk through setting up this Next.js application: we're going to use LangChain Expression Language (the JavaScript version), Next.js for the web app, and ngrok, and then Vercel for deployment. By the end of the video you'll have a way to access all of your local models, and you'll be able to select any of the different models you have downloaded locally. One thing to mention: you'll be able to both deploy this on Vercel for free and use the free tier of ngrok to facilitate all of this.

To get this set up, you'll need to install and sign up for a couple of things if you haven't already. You'll have to install Ollama and have at least one local model installed. Next, go to ngrok.com and install it on your computer; we'll also need our auth token in our .env, which is what we put in front of our local server to access our local endpoint. If you're looking to deploy this, go ahead and make a free account on Vercel, which is what we'll use to deploy the application. From there you can either grab the repo from the description of the video, which I'll be posting, or you can get started from scratch from the Vercel AI SDK. Within the Vercel AI SDK you can go to the examples page, which is a great place to start an application; there are a ton of different examples in there, so we can just use the Next.js LangChain example, and from there you can run npx create-next-app within VS Code.

We're going to be working out of two files for our web app: the route.ts, which is in our api/chat route, and the page.tsx, since we just have one page for our application. The first thing I want to do is run through the back end. We're going to import some necessary modules here; if you pull down the repo from my GitHub repository, you can just npm install everything and you'll be able to set all of this up. We declare that we're specifying the runtime as edge for when we deploy this to Vercel. From there we set up our POST request and destructure a few variables: we're sending across the messages array, our Ollama URL, and the selected model. Then we pass the base URL in to Ollama, and instead of just passing in our localhost, this is where we pass in the server that's established by ngrok; we also declare which model we want to use here. Next we set up a basic function to format the messages we'll include in our chat history, and then a simple template. You can put anything you want in here; in this one I just said "You are a chatbot named Ollama," but you can add other things, like if you want it to respond in a certain way or tone. Once that's set up, we build an array of all of the previous messages except the current message, plus a variable for the current message itself. From there we set up the prompt template that we'll pass into our LLM (the template we just set up), then our output parser, and we declare our chain with LangChain
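The message-formatting step just described can be sketched like this. This is a minimal, hypothetical version: the `VercelChatMessage` interface, the variable names, and the template wording are my assumptions about the shape of the code, not necessarily the repo's exact implementation.

```typescript
// The Vercel AI SDK sends chat messages as { role, content } objects;
// each one is flattened into a "role: content" line for the
// chat-history slot of the prompt template.
interface VercelChatMessage {
  role: string;
  content: string;
}

const formatMessage = (message: VercelChatMessage): string =>
  `${message.role}: ${message.content}`;

// A simple template; persona or tone instructions can go here too.
const TEMPLATE = `You are a chatbot named Ollama.

Current conversation:
{chat_history}

user: {input}
assistant:`;

// All previous messages except the current one become the history;
// the last message is the current input.
const messages: VercelChatMessage[] = [
  { role: "user", content: "Hi" },
  { role: "assistant", content: "Hello!" },
  { role: "user", content: "What model are you?" },
];
const formattedPreviousMessages = messages.slice(0, -1).map(formatMessage);
const currentMessageContent = messages[messages.length - 1].content;
```

The `formattedPreviousMessages` array and `currentMessageContent` are then handed to the chain as the `chat_history` and `input` template variables.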
Expression Language, just like you see here: we pipe in the LLM, we pipe in the output parser, and this is where we create our stream. We pass in the formatted previous messages, and we pass in the current message as our input. Once we have that set up, we stream those responses back to the front end of the application, like you saw at the start of the video.

From there we head over to our page.tsx. We declare "use client" for this component, and we import a handful of components: we're using shadcn/ui for a few components, the AI library to communicate with the back end, and the Phosphor Icons library for a couple of icons. We set up our chat component, use the custom hooks that come with Vercel's AI library, and then set up some of our own hooks, primarily for what we send to the back end: the Ollama URL and the selected model. We also create a couple of handlers. The first handler is how we handle a change of the URL; this is how you can set your URL within the web interface, using that link at the top, and paste your ngrok link directly in there. It's also backed by local storage, so you don't have to keep re-entering it. Then we have a handler simply to change our model. Next we set up a simple function to toggle the visibility of the input where you can put in your ngrok URL. While we have the ability to enter the URL manually if we'd like, the app is also set up to parse a query parameter of the URL and construct the server URL from that. Once we actually declare our ngrok server, some links will be output in the console; we can just click one of those links and it carries the key from ngrok, which gets passed in here within the constructed URL, so we don't need to manually enter anything. It reads that query parameter, constructs the URL, and that's how we connect to our locally running Ollama server. Similarly, if that URL parameter is present, we also set it in local storage. Next, I have an "enhanced handleSubmit." The reason for this is that handleSubmit comes baked into the Vercel AI SDK, so if you want to pass additional things from the front end to the back end, in our case the Ollama URL and the model, this is how you can send them along with handleSubmit.

From there we set up our JSX, styled with Tailwind classes, so it's pretty straightforward. We have our logo and the dropdown right at the top, then the button to toggle the input: if you want to manually enter that ngrok link, you can click the button and paste it in. When you click it, a conditional input is shown where you can put in your ngrok link, or if you're just curious which Ollama URL you're connected to, you can click it and take a look. From there we just map through all of the messages, declaring whether each one is the user's or the AI's response, and then we have our footer with the input for our message and the button to submit. Next, we're going to set up a simple Express server.
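The two client-side pieces just described, reading the ngrok key from a query parameter and passing extra fields along with `handleSubmit`, can be sketched as follows. The function names, the `key` parameter name, and the `ngrok-free.app` domain are my assumptions for illustration, not necessarily what the repo uses.

```typescript
// Construct the Ollama URL from a ?key= query parameter, e.g.
// https://my-app.vercel.app/?key=abc123 -> https://abc123.ngrok-free.app
// (ngrok-free.app is the current ngrok free-tier domain; older setups
// used ngrok.io, so adjust to whatever your tunnel actually prints.)
function ollamaUrlFromQuery(href: string): string | null {
  const key = new URL(href).searchParams.get("key");
  return key ? `https://${key}.ngrok-free.app` : null;
}

// Extra fields sent alongside the messages on every POST to /api/chat.
// An "enhanced" handleSubmit would pass this object as the second
// argument so the back end can destructure ollamaUrl and selectedModel.
function chatRequestOptions(ollamaUrl: string, selectedModel: string) {
  return { options: { body: { ollamaUrl, selectedModel } } };
}
```

On the page itself, the parsed URL would also be written to local storage so it survives reloads, as described above.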
we'll import Express, ngrok, and dotenv. The reason we're using Express is to keep our application persistent. We declare a very basic Express application; you can set this to whatever port you want. Once you deploy your Next.js application to Vercel, you can come in here and put your deployment URL, or in my case I just put the subdomain for my Vercel application on my domain. The main piece to pay attention to is the ngrok portion: we set it up to forward port 11434, which is the default port Ollama listens on when it's running on your computer, and we also pass in our ngrok auth token. To get your auth token, head over to your ngrok dashboard and put it in your .env here. Once we have that, we use a little bit of regex to parse the first portion of our URL, and then we construct the URLs. All we're doing here is creating quick links: if you're running this locally, you can click one to see your local version; once you have a deployment URL, you can click one to interact with your web UI; or if you just want to check that ngrok has established the connection with Ollama as it should, you can click that link, and you should see "Ollama is running" if it's established correctly. That's pretty much it. To get this all started, once you've installed all of the dependencies, you can run node index.js. Similarly, if you want to deploy this to Vercel, you can run vercel --prod, or if you want to run it locally, npm run dev. Once you have that running, you can head back to that link.
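The regex parsing and quick-link construction just described can be sketched like this. The function names and the localhost port are my assumptions; the idea is simply to pull the generated subdomain out of the tunnel URL and build the links printed to the console.

```typescript
// Extract the first portion (the generated subdomain) of an ngrok URL,
// e.g. "https://abc123.ngrok-free.app" -> "abc123".
function ngrokKey(ngrokUrl: string): string | null {
  const match = ngrokUrl.match(/^https?:\/\/([^.]+)\./);
  return match ? match[1] : null;
}

// Build the quick links logged when the tunnel comes up: the local dev
// build, the deployed web UI, and the raw tunnel (which should respond
// with "Ollama is running" when everything is wired up correctly).
function quickLinks(ngrokUrl: string, deploymentUrl: string) {
  const key = ngrokKey(ngrokUrl);
  return {
    local: `http://localhost:3000/?key=${key}`, // local Next.js dev server
    deployed: `${deploymentUrl}/?key=${key}`,   // your Vercel deployment
    ollama: ngrokUrl,                           // the tunnel to port 11434
  };
}
```

Clicking either of the first two links hands the key to the web app through the `?key=` query parameter, so nothing has to be pasted in by hand.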
You can see here that we have our localhost, and this is the URL I was talking about; this is how it declares the ngrok server. So if I just say "hello world," we can see the streaming responses on our localhost. Then if you deploy it to Vercel, you can click your link here and open it up. The other thing with this: you can copy the link and send it to your cell phone, so as long as your computer is on with Ollama running and that ngrok server set up, you'll be able to access your Ollama instance from anywhere. Hopefully you found this video useful. If you did, please like, comment, share, and subscribe. Otherwise, until the next one.