OpenAI Open Sources Codex: The CLI Coding Agent - Developers Digest

Transcript

---
type: transcript
date: 2025-04-17
youtube_id: euUgavBeUwE
---

# Transcript: OpenAI Open Sources Codex: The CLI Coding Agent

In this video, I'm going to be taking a look at Codex, which is a new lightweight coding agent that runs in the terminal. This new CLI tool is arguably in response to Claude Code, which is another CLI from the team over at Anthropic. What you'll be able to do is within the root of your repository, your directory, wherever your code is, you'll be able to run this codex command after it's all installed, and you'll be able to make edits or net new changes to whatever you'd like. This is yet another option out there in terms of the agentic AI coding space.

The other thing to note with Codex is if it sounds familiar, that's because it is. This was originally released as a product in 2021. This is pre-ChatGPT, and OpenAI Codex originally was an AI system that translates natural language to code. This isn't a new model. What this leverages under the hood are models like o3 or o4-mini. You'll be able to select the model that you want to use just from the terminal.

So, in this video, I wanted to do a really quick demonstration on how it works and how you could potentially leverage it. But to get started, it is really straightforward. You can go and grab the npm installation command and you can paste it within your terminal. Now, this does assume that you do at least have Node.js installed. And then once it's installed, you can go ahead and run the codex command to get started.

So, what I'm going to do is I'm just going to quickly create a new Next.js application just to demonstrate this. Now that I have a project within the root of my directory, what you'll have to do is head on over to OpenAI and get an API key. And then once you have your API key, what we're going to do is we're just going to export OPENAI_API_KEY. And then we'll paste in our key and submit it. Now once we have that, we can go ahead and run codex.
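
The prerequisites mentioned in this walkthrough (Node.js installed, an OpenAI API key exported) can be checked with a small script before running the tool. This is a minimal sketch and not part of Codex itself: the helper names `parse_node_major`, `preflight`, and `installed_node_version` are hypothetical, the Node 22 minimum is the figure quoted later in the video, and `OPENAI_API_KEY` is assumed to be the environment variable the CLI reads.

```python
import os
import re
import shutil
import subprocess

MIN_NODE_MAJOR = 22  # assumption: the minimum quoted in the video at time of recording


def parse_node_major(version_text):
    """Return the major version from text like 'v22.3.0', or None if unparseable."""
    match = re.match(r"v?(\d+)\.", version_text.strip())
    return int(match.group(1)) if match else None


def preflight(node_version_text, env):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    major = parse_node_major(node_version_text or "")
    if major is None:
        problems.append("Node.js not found or version unreadable")
    elif major < MIN_NODE_MAJOR:
        problems.append(f"Node.js {major} is older than {MIN_NODE_MAJOR}")
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems


def installed_node_version():
    """Ask the local `node` binary for its version, or return None if absent."""
    if shutil.which("node") is None:
        return None
    result = subprocess.run(["node", "--version"], capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    for problem in preflight(installed_node_version(), os.environ):
        print("warning:", problem)
```

Keeping `preflight` free of side effects (it takes the version text and environment as arguments) makes it easy to exercise without Node.js present.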
And what you'll see within here is we're going to have the default model be o4-mini. You do have the option to also pass in some flags. So you can pass in the flag to specify a different model. If you want to use the o3 model, you can do that. Alternatively, you can pass in some flags for whether you want to approve each change or not. These are just the default settings. And then once you're within here, you can go ahead and ask for your changes.

I just want to quickly show you, if you do want to swap out approval modes, there are a couple different ways. So I can go and pass in the flag for approval mode and specify it to be full auto. And then I can put in the argument for what I want the change to be. So I can say something like, I want to change the homepage to read Developers Digest, and then from there what it will do is it will just kick off that change, and instead of asking me for every change, it's going to go and it's just going to make those updates. Obviously, if you're running it with this full auto mode, just make sure that you have committed changes ideally, or it's within an application that's just a toy app and isn't a high-stakes application. You wouldn't want it to do something you don't want it to do.

As we see here, what it will do is it will go through and it will think through the different steps. And the way that this is set up is there's the core model. But what these models, in particular o4-mini as well as o3, are really good at is tool usage. Because within the context of coding, you're going to have to read files, write files. There's a number of different steps that occur to have an effective agentic coding system.

And the other nice thing with this is you do have a config file that you can edit. If you do want some default configuration, there is a Codex config that you can use. If you put in a config folder, within that folder, you can put in a config.yaml to specify things like the model.
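
The model and approval-mode flags described above can be assembled programmatically when scripting a one-shot run. This is a sketch under stated assumptions: the flag names `--model` and `--approval-mode` and the value `full-auto` are taken from the project's README around the time of the video and may have changed, so check `codex --help` before relying on them. The function name `build_codex_command` is hypothetical.

```python
import shlex


def build_codex_command(prompt, model=None, approval_mode=None):
    """Assemble an argv list for a one-shot `codex` invocation.

    Flag names are assumptions based on the README at the time of the video.
    """
    argv = ["codex"]
    if model:
        argv += ["--model", model]
    if approval_mode:
        argv += ["--approval-mode", approval_mode]
    argv.append(prompt)  # the requested change is passed as the final argument
    return argv


command = build_codex_command(
    "Change the homepage to read Developers Digest",
    model="o3",
    approval_mode="full-auto",
)

# Print a shell-safe version of the command for review before running it.
print(shlex.join(command))
```

Printing the command instead of executing it fits the advice above: with full auto mode, review what will run and commit your work first.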
And additionally, you can add in an instructions.md, which will act similar to something like Cursor rules if you're familiar with that.

Now that I see that we have the header and it ran through full auto mode, I'll go ahead and I'll make a new tab here. Now I'm within a new tab. I'm going to go ahead and start our development server. So I'll just run bun dev and then I'll open it up here. Within here I have some clarifying questions. Even if you do run it within auto mode, obviously there will be some interventions where you do have that human in the loop, where it does need your feedback. What I'll do within here is I'll say I want this globally on every page and add in a few placeholder links.

Now, one thing that I have noticed with this is it definitely is, in terms of the feedback that it gives back to you, not as snappy as something like Windsurf or Cursor. Because what it will do for the loading state is, instead of streaming things out on screen or having the different pieces show within the UI like you'd have within Cursor or Windsurf, what you'll have is largely, and for some periods of time, just this thinking state, and then it will snap in sometimes a large chunk of code.

So now we see it's added the global header to the root of the layout, complete with a Developers Digest title and three placeholder links. It will show up on every page above your existing content. Let me know if you want to tweak it. Now I'll just open a new tab and I'll start our development server and just take a look at the change. Here is our website. Here's the Next.js boilerplate and we have our header here. We have the links one, two, and three. Now we also have this issue within here. What I can do is I'll just go over here and I'll say I got the following error, and let's see if it can go and correct that.

Now, one thing to note with this that a lot of people will probably be wondering is whether this is supported on Windows. It's not directly supported.
It will require the Windows Subsystem for Linux. So, Codex has specifically been tested on macOS as well as Linux, with a version of Node 22 or higher at time of recording.

Now, I see after passing in the error, I now have this resolved here. Now, I'm just going to pass it in and ask it a handful of questions. I'm going to say I want to create a beautiful SaaS landing page for a brand, Developers Digest. We'll go ahead and I'll send that in and we can see how well o4-mini performs in terms of the front-end coding capabilities. Here is the first iteration of what we asked for. It's a relatively simple design. There's obviously some potential accessibility issues. Now, I'm going to say I don't quite like the header. Let's have the text all be white. And then let's have the background be a dark purple as well as black.

One thing that I did notice, in terms of some people's comparisons of Codex with o4-mini against Claude Code, is that for some tasks, some people didn't actually get the best results from Codex. They mentioned that Claude Code did particularly well on this task of writing documentation for a tricky area of a medium-sized codebase, whereas Codex didn't do well at all. But the one notable thing with this where it is different is Codex is open source. That's going to be something hopefully that we'll see it improve quite a bit over the coming weeks and months. Since this is an official OpenAI release, the hope is that we'll quickly see this improve over the coming weeks and months.

Now, I see within our navigation, at least we have that purple and black, but I'm still not a huge fan of that hero area. I'm going to say I want the hero area more to match the overall theme of the navbar. So, I'll go ahead and I'll send that in.
Just an interesting anecdote with this: where I think tools like this are interesting is, as coding over time potentially becomes more of a natural language process, it's going to be interesting to see how much of the real estate of the different screens that we use is occupied by tools like this, where we just more or less have a natural language interface where we're simply directing what we want the code to do. It's going to be interesting to see how well we can rely on just a natural language system to build out these applications. Now, obviously, when it's a low-stakes application like this, it's a little bit easier, but it's going to be interesting to see over time how trustworthy these types of agentic tools are, especially for some higher-stakes applications.

Now, within here, I'm going to say I want to build out a rich, beautiful footer, and I'll send that in. And right off the bat, my first impression with o4-mini is it definitely does perform quite well in terms of front-end coding. While in the first iteration it didn't spit out something that looked quite great, it does seem to be the type of model that, with enough direction, will give you something that is pretty feasible and does look quite reasonable.

Here we see our navigation. And then from here, I'm going to say I want to create the game of Tetris on a new page. Let's have link one, two, and three, and then Tetris. And I want Tetris to be more or less full screen. And I want the background and overall theme to be this purple and black that we have on the screen.

So now in terms of usage, since this is built through the API and you are likely going to be using a model like o4-mini or o3, depending on how you use this, the pricing will vary a little bit. Say, for instance, if you are going to be using o4-mini like the default is, the nice thing with o4-mini is you do have the advantage of cached inputs.
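
The effect of cached inputs on a bill can be illustrated with a little arithmetic. This is a sketch only: the function name `estimate_input_cost` is hypothetical, and the per-million-token rates below are placeholders chosen to make the math easy, not OpenAI's actual prices. Substitute the current figures from the pricing page.

```python
def estimate_input_cost(fresh_tokens, cached_tokens, price_per_million, cached_price_per_million):
    """Return the input-side cost in dollars for one request.

    Fresh tokens are billed at the normal input rate; tokens served from the
    prompt cache are billed at the lower cached rate.
    """
    fresh_cost = fresh_tokens / 1_000_000 * price_per_million
    cached_cost = cached_tokens / 1_000_000 * cached_price_per_million
    return fresh_cost + cached_cost


# Placeholder rates in dollars per million input tokens (NOT real pricing).
PRICE = 1.00
CACHED_PRICE = 0.25

# A 50,000-token prompt sent cold, then again with 40,000 tokens cached.
without_cache = estimate_input_cost(50_000, 0, PRICE, CACHED_PRICE)
with_cache = estimate_input_cost(10_000, 40_000, PRICE, CACHED_PRICE)

print(f"cold request:   ${without_cache:.2f}")
print(f"cached request: ${with_cache:.2f}")
```

An agent that repeatedly resends the same project context is the case where the cached rate matters most.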
So for things that you have sent in before, you will be able to benefit from that lower cached input cost. That's just one thing to consider, especially if you're going to be hammering on this model. Now, one thing that I will note is if you are using Cursor or Windsurf, right now for this week, they do have the ability for you to use o4-mini for free. You can try those out both within Cursor as well as Windsurf and be able to try out those agentic features within their platform without having to pay.

Now, here we go. We have this really simple Tetris game, but all in all, obviously the UI is relatively rudimentary. You can definitely tell it was built by an LLM. In just a number of prompts with some really light guidance, we were able to get something of a working application here.

Overall, that's pretty much it for this video. I encourage you to check out the repo, try out Codex, let me know what you think of it. How does it compare to things like Claude Code? Will you be using this over something like Cursor or Windsurf or one of these other agentic IDEs? Otherwise, that's pretty much it for this video. If you found this video useful, please comment, share, and subscribe.
