
# Getting Started with Open Web UI: A Self-Hosted Interface for Large Language Models

In this video, I'll guide you through setting up Open Web UI, a feature-rich, self-hosted web interface for large language models. You can use it to interact with local Ollama models or OpenAI-compatible models like GPT-4o and Groq-hosted models (Llama 3, Mixtral, etc.). I'll show you deployment options using Docker or Kubernetes, and explain how to use its extensive features such as uploading files, recording voice, and generating responses. Additionally, I'll demonstrate how to integrate different models, configure API endpoints, and tweak advanced settings, all while showcasing the user-friendly interface and helpful documentation. By the end of this video, you'll be able to effectively use Open Web UI for managing and interacting with your language models locally or on your own infrastructure.

- 00:00 Introduction to Open Web UI
- 01:40 Interface Walkthrough
- 03:03 Advanced Settings and Configurations
- 04:55 Image Model Demonstration
- 05:33 Prompt and Document Management
- 06:30 Getting Started with Setup
- 07:56 Conclusion and Final Thoughts

Link: https://github.com/open-webui/open-webui
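Since the description mentions "OpenAI-compatible" endpoints, here is a minimal sketch of what that compatibility means: the same chat-completions request shape works against OpenAI or Groq, so the UI only needs a base URL and an API key to switch providers. The model name and the `GROQ_API_KEY` variable here are assumptions; check Groq's API documentation for current values.

```shell
# Minimal sketch of an OpenAI-compatible request against Groq's endpoint.
# Assumes GROQ_API_KEY is set in the environment and "llama3-8b-8192" is
# an available model name; both may have changed since recording.
curl https://api.groq.com/openai/v1/chat/completions \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "llama3-8b-8192",
        "messages": [{"role": "user", "content": "Hello world"}]
      }'
```

Open Web UI only needs that base URL (`https://api.groq.com/openai/v1`) and key entered in its settings; no other changes are required to chat with a hosted model instead of a local one.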
---
type: transcript
date: 2024-06-05
youtube_id: oMU00csM4EM
---

# Transcript: Open WebUI: Self-Hosted Offline LLM UI for Ollama + Groq and More

In this video I'm going to be showing you how to get started with Open WebUI, which is a versatile, feature-rich, self-hosted web UI for large language models. With Open WebUI you can use your Ollama models, or you can also interact with OpenAI-compatible models. What's great about this project is it's all set up to be deployed: you can either choose the route of going with Docker, or you can go the route of deploying this on Kubernetes if you'd like. It has support for both CPU as well as GPU, depending on the hardware that you have. You can also separate your hardware from the web UI itself; say you have your GPU hosted somewhere else and you want to interact with that endpoint, you'll be able to do that as well.

First I'm going to give you a little bit of an overview of the platform itself, then I'm going to dive in and show you how to get set up from scratch. Just to give you an idea: as soon as you install this, you'll be able to see all of your local models. If you have Ollama running, all your models will automatically pre-populate within the dropdown here. As you see, I have Llama 3 selected, and if I just go "tell me a joke"... and then we also have our chat history on the left-hand side here. In addition to that, you have all these different features: if you want to edit the text, copy it, or have it read aloud, you can do that. You'll be able to get up and running really quickly, and I think this is going to be a really great option for a ton of people that are interested in using local models.

Some other options here: it will give you the generation information, so how long it took to generate and how many tokens were used. You can mark whether it's a good or bad response, and then you can either click continue or regenerate the response. That's a little bit about the interface itself. The other nice
thing is that, out of the box, you have the ability to upload files. Say you want to upload a PDF or an HTML file or whatever it might be, you can go ahead and throw that in there as well. You can also record your voice, so if I just say "hello world" and then submit that, you can get a response that way. There are also a ton of little subtle things built in: you can toggle the sidebar back and forth just like you would in something like ChatGPT, and if you want to make a new chat you can just click that button.

The thing that I think is a really nice feature is having the ability to swap between my local models and also try something at an endpoint. Say I want to try something out with GPT-3.5 Turbo and I just say "hello world": if you put in your API key, whether it's from OpenAI or Groq, you'll be able to interact with those models directly, all within this platform, whether it's running locally or deployed on your own hardware. What's nice is you have the option to configure any API that's OpenAI-compatible.

If you just want to add a new Ollama model — say you saw on Twitter, "hey, we released a new Llama model" — you can just go in here, put in the model string, and it will begin downloading it for you; a really nice feature to have built in. You can add a tag here, which it will put in this left-hand corner, just like you see. You also have the ability to download the chat, whether as JSON, text, or PDF.

Just to go into a handful more things before I show you how to get set up: if you check out the settings, you have the ability to set the system prompt. You also have some advanced parameters here if you want to play around with the seed or the top-k value, or a number of other values that you can go ahead and tweak. You also have the ability to set up an OpenAI-compatible endpoint: if you want to swap this out
to Groq, you can put in the Groq endpoint here along with your Groq API key. Then we also have our Ollama API; in this instance I'm running it in Docker, and there are instructions on how to set this up depending on where you have your Ollama models running as well.

You can also tweak the interface a little bit: if you want to change the default prompt suggestions, you can do that. There are even experimental features, like the memory feature — just like OpenAI has within ChatGPT — where it will choose to remember certain aspects and do a RAG process on the back end to bring them into certain conversations if it sees fit; you can play around with that feature in here as well. You also have the ability to change out the text-to-speech: in here we have the Web API, or you can set it to OpenAI if you'd like. In terms of the speech-to-text, you can also do this through the Web API, or you can do it locally if you'd like.

The other thing that I really like is it gives you the ability to upload images as well, if you're using a model that supports images, which I'll demonstrate in just a moment. Say you want to use one of the LLaVA models; you'll be able to do that too. Just a couple of other things: you can export your chat history, your account, and all of that here as well. Also, Open WebUI doesn't necessarily need to be running locally on your machine: if you deploy it on your own infrastructure, you can add your own users within it. In my case here, I'm the admin, so if you want to add other users, you can do that as well; just walk through the prompts.

I wanted to quickly show you one of the image models. I have LLaVA running here locally, so I upload a file and ask, "what is this image?" This is a relatively small image model, and there you have it: I have this LLaVA model running locally, and it's able to tell what that image is, all
without using any hosted inference or incurring any cost. The same thing applies if you select a model that does have vision capabilities: you can upload a file just like that, and you can also use something like the GPT-4o ("Omni") models.

Now, within the workspace here, you have all the different models that you have loaded up, whether locally within Ollama or, if you've plugged in OpenAI or another vendor, the models from them. There's also an area where you can store different prompts. You can look at the community of contributors out there that are submitting these prompts, search for specific prompts, or just look at specific examples. Anyone interested in prompt engineering and that type of stuff will find some really good templates within this. The nice thing with how the prompts are set up is that you'll be able to access them right within the chat, which I'll show you in just a moment.

Then there's also a document feature: you can upload documents here if you want to ask questions of particular documents, and similarly, you'll be able to access them within the chat with a simple command. There's also a playground, as you might expect.

So if I just go back to chat here: to access your prompt templates, you can type a forward slash, and to access your documents, you can type the hash ("#") symbol. You can ask a question of all your documents, or you can select the particular document that you want to interact with.

To get set up, there are some prerequisites on your end. You're going to have to install Docker; you can just go ahead and search to download Docker on the Docker
website. Then you'll also have to download Ollama. Now, if you don't want to use Ollama, you can use this purely as an interface to interact with the OpenAI API or the Groq API, depending on how you have it set up. You can also even download it as a zip file if you like.

Once we grab this, we can go over to an empty directory here; if I just list it out, we see that it's empty, and then we can clone the repository. While that's pulling down everything we need, we'll scroll down to the Docker instructions. Depending on your setup, just make sure to grab the proper Docker command for what you're trying to accomplish; in this case, we're going to be using Ollama on our computer, so I'm just going to copy the command here.

All right, now that it's all downloaded, we're going to cd into our directory and paste in that command. But you have to make sure that you actually do have Docker running — if you've downloaded Docker, just make sure it's running — and also make sure, if you're going to be using Ollama, that you have Ollama running as well. Then you can just run that command; it might take a couple of minutes to run through everything. Once it's set up, you can head over to localhost:3000 and you'll be able to interact with it. It's really nice that it has a Docker container, so you could just go ahead and deploy this to an instance if you like, or you can run it locally just like I showed you.

That's pretty much it for this video. If you found this video useful, please comment, share, and subscribe; otherwise, until the next one.
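The setup steps above can be sketched as a few shell commands. This is a sketch based on what the video shows; the exact `docker run` flags reflect the open-webui README around the time of recording, so check the repository for the current command before running it.

```shell
# Quick-start sketch, assuming Docker and Ollama are already installed
# and running on the same machine. Image tags and flags may have changed;
# see the open-webui README for the current command.
git clone https://github.com/open-webui/open-webui.git
cd open-webui

docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main

# Then open http://localhost:3000 in your browser.
```

The `-v open-webui:/app/backend/data` volume keeps your chats and settings across container restarts, and `--add-host` lets the container reach an Ollama instance on the host machine.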