
In this video, I guide you through setting up the new OpenAI Realtime API, which opens up new interactive possibilities for developers with its WebSocket-based architecture. You will learn how to clone the repository, configure the environment with an OpenAI API key, and set up a relay server for backend communication. The API offers real-time two-way interaction and a stateful interface, and it supports function calls such as getting weather updates. I also explore the 'set memory' function and demonstrate deploying basic applications. Stay tuned for future episodes where I'll cover deploying this in a production environment. By the end of this tutorial, you'll have a functional setup to experiment with and expand upon!

Links:

- Introducing the Realtime API: https://openai.com/index/introducing-the-realtime-api/
- OpenAI Realtime Console: https://github.com/openai/openai-realtime-console
- Learn The Fundamentals Of Becoming An AI Engineer On Scrimba: https://v2.scrimba.com/the-ai-engineer-path-c02v?via=developersdigest
- My AI-powered Video Editor: https://get.descript.com/n6dxd9jp6ouy

Chapters:

- 00:00 Introduction to OpenAI Real-Time API
- 00:38 Understanding Web Sockets and Real-Time Interaction
- 01:13 Function Calling Demonstration
- 01:39 Stateful API and Memory Functions
- 02:52 Setting Up the Repository
- 03:11 Configuring the Environment
- 03:49 Running the Application
- 04:34 Handling Function Call Outputs
- 05:11 Exploring the Code and Next Steps
- 07:12 Conclusion and Next Steps
---
type: transcript
date: 2024-10-04
youtube_id: bNMOev4p3_8
---

# Transcript: OpenAI Realtime Voice API: A 7-Minute Getting Started Guide

In this video, I'm going to show you how to get started with the new OpenAI Realtime API. By the end of the video, you'll have an idea of how to set up this repo and begin to play around with an application, or with an idea you might have for using the new capabilities of this Realtime API.

OpenAI's Realtime API opens up a lot of possibilities for developers to create interactive applications. You can interact with it without even pressing any buttons, like you would have had to with other tools that were out before this. The interesting thing is that the application really illustrates exactly how it works. The difference with this API is that it's set up with WebSockets. If you look within the interface here, you see the client and the server, and you can see this number increment. That incrementing number is the client sending all of the little packets of everything I'm saying across the network, so that as soon as I stop talking, the server already has that whole payload and can begin to process it. That's really interesting. Using WebSockets for this API allows for real-time two-way interaction, which is great for applications that require instant updates.

Now, the other cool thing with this application is that it also demonstrates how you can use function calling. If I say something like, "What's the weather in New York City?"

A: "It sounds like you can trigger a function call to get the weather information for New York City. Would you like me to demonstrate by fetching the current weather there?"

"Yes."

A: "The current temperature in New York is 17.4 °C with a wind speed of 9.7 kilometers per hour."

Now, the other thing that's really cool is that you're able to have a stateful API. Before this, we would have to continually send in the state of the chat completion, essentially the chat history of all of the different messages, so that the LLM would have the context of what was previously discussed. But now, since it stores the state, you're able to say things like "yes" to the previous question and it understands what you're talking about. So if I say, "What's the weather in Toronto? As well, set some memory for my grocery list tomorrow to pick up apples, bananas, and oranges," we have the weather in Toronto, and then we also have this set memory function.

This is a great starting point for learning how to use the Realtime API. One thing to know is that this application does still require additional steps if you're looking to deploy it. You're going to have to add some layer of authentication to actually deploy something like this. In an upcoming video, I also plan to show you how you can use something like this in a more production-ready environment. If you're interested in that, stay tuned to the channel; I hope to put that out over the coming week or two.

So the first thing that we're going to do is git clone this repository. Once we have it, I'll go into the directory and open it up in a new Cursor workspace. Here we'll open up our terminal, and from here we can go ahead and pnpm or npm install everything. While that's installing, we can set up a .env file. What we'll need is an OpenAI API key, so we can just write OPENAI_API_KEY= and then, to get your API key, you can head on over to their platform, go to the dashboard, and on the left-hand side you'll see API keys. Once you have that, you can paste it in here.

The other thing that we're going to set up is a relay server, and this is how the back end of our application communicates with the WebSocket connection from OpenAI. Here we're going to set it to localhost:8081, but you can change this to wherever your relay server is, for example if you have it on a different port. Once you've saved that, you can start the front end of your application with npm start. There we go, we have our front end working. Then, to run our back end, we can just pnpm run relay, and we see it's listening on localhost:8081.

If I go back to our application, now that it's all wired up, we can test it out. We can click connect here. "Hello."

A: "I'm here, and it sounds like you're testing the connection. Can you hear me okay?"

"Set within memory to buy eggs tomorrow."

A: "All right, I've set a reminder for you to buy eggs."

There you see that it does have the reminder within the DOM element. And now if I ask, "What's the weather in Chicago?"

A: "It seems like I'm unable to retrieve the weather information right now. Would you like to try again later?"

The interesting thing that I have noticed is that sometimes the function call output comes after the assistant has already responded. That's one thing to be mindful of with WebSockets: the function invocations can take a little bit of time, and if the assistant is triggered to respond first, it might not have the context until I ask again. So now if I ask, "What's the weather in Chicago?"

A: "The current weather in Chicago is around 17.9 °C with a wind speed of about 10.6 km per hour."

That's just something to be mindful of when building out your application.

In terms of next steps, there is some good information within the README. You'll be able to see how it's streaming back the audio, but probably more importantly, what a lot of people will be playing around with is how to add their own function calling capabilities. Here is how we append tools to the WebSocket connection. You establish with natural language what the function call is doing; in this case it's using the free Open-Meteo API to hit the weather endpoint, and then it's returning that payload.

ConsolePage.tsx really has the lion's share of how all of this is set up and configured. There's quite a bit in here for managing the state of the application on the front end, managing the WebSocket connections, and all of that. There are a couple of pieces where I'd start to play around with this, maybe swapping in my own components or my own idea for a little application. I'd really encourage you to look at addTool. Within addTool we have set memory, and then we have the get weather function. It's pretty straightforward: you define your function definition with natural language, then all of the different properties, meaning any arguments that your function requires, and then the function itself. You see the example defined right here, where we're setting a memory KV. Within the get weather example, we're actually reaching out to the API, getting the location and the coordinates, and ultimately getting the temperature as well as things like wind speed.

These are a great starting point. You can look for set memory and see what it's doing once it's returned. In this case we're setting the memory KV. If we search for setMemoryKv, we see memoryKv, and you can just search through this. At a certain point we'll see it within the JSX: here we see JSON.stringify, and that's how it appeared on the screen. But this could very well be a component that you pass props into to render whatever it might be. Say you have a financial ticker or something like that: if it's a financial app, you could just ask, "What is Apple's stock price?", pass in the AAPL ticker, and have it render a chart.

This is just a super quick video on how to get set up with the new Realtime API. If you found this video useful, please like, comment, share, and subscribe. Otherwise, until the next one.
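
The "definition plus handler" pattern the transcript describes for tools (a natural-language description, a list of arguments, and the function itself) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the console's actual code: `ToolRegistry`, `addWeatherTool`, `WeatherLookup`, and the stubbed weather values are hypothetical names introduced here, and the weather lookup is injected as a stub so the sketch runs without network access or an API key.

```typescript
// Hypothetical sketch of the "definition + handler" tool pattern.
// ToolRegistry is an illustrative stand-in, not the real client library.

interface ToolDefinition {
  name: string;
  description: string; // natural-language text the model reads to decide when to call
  parameters: {
    type: "object";
    properties: Record<string, { type: string; description: string }>;
    required: string[];
  };
}

type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;

class ToolRegistry {
  private tools = new Map<string, { definition: ToolDefinition; handler: ToolHandler }>();

  addTool(definition: ToolDefinition, handler: ToolHandler): void {
    this.tools.set(definition.name, { definition, handler });
  }

  // Invoked when the model emits a function call with JSON-encoded arguments.
  async call(name: string, jsonArgs: string): Promise<unknown> {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`Unknown tool: ${name}`);
    return tool.handler(JSON.parse(jsonArgs));
  }
}

// In-memory key/value store standing in for the app's memory state.
const memoryKv: Record<string, unknown> = {};
const registry = new ToolRegistry();

registry.addTool(
  {
    name: "set_memory",
    description: "Saves a piece of information about the user under a key.",
    parameters: {
      type: "object",
      properties: {
        key: { type: "string", description: "Name of the memory entry" },
        value: { type: "string", description: "Value to remember" },
      },
      required: ["key", "value"],
    },
  },
  async ({ key, value }) => {
    memoryKv[String(key)] = value;
    return { ok: true };
  }
);

// The lookup is injected so a real HTTP call can be swapped in later.
type WeatherLookup = (lat: number, lng: number) => Promise<{ temperature: number; windSpeed: number }>;

function addWeatherTool(reg: ToolRegistry, lookup: WeatherLookup): void {
  reg.addTool(
    {
      name: "get_weather",
      description: "Retrieves the current weather for a latitude/longitude pair.",
      parameters: {
        type: "object",
        properties: {
          lat: { type: "number", description: "Latitude" },
          lng: { type: "number", description: "Longitude" },
        },
        required: ["lat", "lng"],
      },
    },
    async ({ lat, lng }) => lookup(Number(lat), Number(lng))
  );
}

// Stubbed values for illustration only; a real lookup would query a weather API.
addWeatherTool(registry, async () => ({ temperature: 17.9, windSpeed: 10.6 }));

async function demo(): Promise<void> {
  await registry.call("set_memory", JSON.stringify({ key: "groceries", value: "eggs" }));
  const weather = await registry.call("get_weather", JSON.stringify({ lat: 41.88, lng: -87.63 }));
  console.log(JSON.stringify(memoryKv), weather);
}

demo().catch(console.error);
```

Keeping the handler separate from the definition is what would let the result feed a custom component (such as the stock chart idea mentioned above) instead of a raw JSON dump. Because handlers are asynchronous, their output can arrive after the assistant has started replying, which is the timing issue noted in the Chicago weather example.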