wiremine ·12 days ago
I'm going to take a contrarian view and say it's actually a good UI, but it's all about how you approach it.
I just finished a small project where I used o3-mini and o3-mini-high to generate most of the code. I averaged around 200 lines of code an hour, including the business logic and unit tests. The total was around 2,200 lines. So, not a big project, but not a throwaway script either. The code was perfectly fine for what we needed. This is the third time I've done this, and each time I get faster and better at it.
1. I find a "pair programming" mentality is key. I focus on the high-level code and let the model handle the lower-level code. I code-review everything it produces and provide feedback; blindly accepting the code is a terrible approach.
2. Generating unit tests is critical. Once I like the gist of some code, I ask for some smoke tests (see the sketch after this list). Again, peer-review the code and adjust as needed.
3. Be liberal about starting a new chat: the models get confused easily as the context window grows. If you start to see things go sideways, start over.
4. Give it code examples. Don't prompt with English only.
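To make point 2 concrete, here's the kind of smoke test I'd ask for and then review. A minimal sketch using vitest; the parsePrice function is a hypothetical stand-in for model-generated code, not something from the actual project:

    import { describe, expect, it } from "vitest";

    // Hypothetical stand-in for model-generated code under test.
    function parsePrice(input: string): number {
      const value = Number(input.replace(/[$,]/g, ""));
      if (Number.isNaN(value)) {
        throw new Error(`unparseable price: ${input}`);
      }
      return value;
    }

    // Smoke tests: cheap checks on the happy path and the obvious
    // failure mode, before investing in a fuller suite.
    describe("parsePrice", () => {
      it("parses a formatted dollar amount", () => {
        expect(parsePrice("$1,234.56")).toBe(1234.56);
      });

      it("rejects input it cannot parse", () => {
        expect(() => parsePrice("twelve")).toThrow();
      });
    });

Point 4 works the same way in reverse: rather than describing what you want purely in English, paste in an existing function and ask the model to match its style and conventions.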
FWIW, o3-mini was the best model I've seen so far; Sonnet 3.5 New is a close second.
taeric ·12 days ago
I'm coming around to the idea that chat is a bad UI pattern, period. It is a great record of correspondence, I think. But it is a terrible UI for actually doing anything.
In large part, I assert this is because the best way to do something is to do that thing. There can be correspondence around the thing, but the artifacts you are building are separate from that correspondence.
You could probably take this further and say that narrative is a terrible way to build things. It can be a great way to communicate them, but because the narrative is separate from the artifact, it is not necessarily any good at producing one.
themanmaran ·12 days ago
I'm surprised that the article (and comments) haven't mentioned Cursor.
Agreed that copy-pasting context in and out of ChatGPT isn't the fastest workflow. But Cursor has been a major speed-up in the way I write code. It's still primarily a chat interface, but with a few quality-of-life hacks that make it way faster:
1. Output gets applied to your file in a git-diff style, so you can approve/deny changes (see the sketch after this list).
2. It (kinda) has context of your codebase, so you don't have to specify as much. Though it works best when you explicitly tag files ("Use the utils from @src/utils/currency.ts").
3. Directly inserting terminal logs or type errors into the chat is incredibly convenient: just hover over the error and click "add to chat".
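To make point 1 concrete: the proposed edit arrives as a unified diff over your file, and you accept or reject it. A sketch of roughly what that looks like; the file path is borrowed from the example above, and the change itself is invented for illustration:

    --- a/src/utils/currency.ts
    +++ b/src/utils/currency.ts
    @@ -1,3 +1,7 @@
     export function formatPrice(cents: number): string {
    -  return `$${(cents / 100).toFixed(2)}`;
    +  // Proposed change: locale-aware formatting instead of string math.
    +  return new Intl.NumberFormat("en-US", {
    +    style: "currency",
    +    currency: "USD",
    +  }).format(cents / 100);
     }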
croes ·12 days ago
Natural language isn’t made to be precise; that’s why we use a subset of it in programming languages.
So if you use AI, you either need lots of extra text to remove the ambiguity of natural language, or you need a special, precise subset to communicate with the AI. And that’s just programming with extra steps.
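A concrete illustration of that ambiguity (the Invoice type and both functions are hypothetical): the English request "sort the invoices by date" admits several readings, and code has to commit to exactly one:

    interface Invoice {
      issuedAt: Date; // date the invoice was issued
      dueAt: Date;    // date the invoice is due
      amount: number;
    }

    // Reading 1: ascending by issue date, returning a new array.
    const byIssueDate = (invoices: Invoice[]): Invoice[] =>
      [...invoices].sort((a, b) => a.issuedAt.getTime() - b.issuedAt.getTime());

    // Reading 2: descending by due date, sorting in place.
    const byDueDateDesc = (invoices: Invoice[]): Invoice[] =>
      invoices.sort((a, b) => b.dueAt.getTime() - a.dueAt.getTime());

Which field, which direction, in place or not: each of those choices either gets spelled out in "lots of extra text" or encoded in the precise subset we already have.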
matthewsinclair ·12 days ago
Yep. 100% agree. The whole “chat as UX” metaphor is a cul-de-sac that I’m sure we’ll back out of sooner or later.
I think about this like SQL in the late 80s. At the time, SQL was the “next big thing” that was going to mean we didn’t need programmers, and that management could “write code”. It didn’t quite work out that way, of course, as we all know.
I see chat-based interfaces to LLMs going exactly the same way. The LLM will move down the stack (rather than up), and much more appropriate task-based UX/UI will be put on top of it, coordinated through a UX/UI layer that is much more sympathetic to the way users actually want to interact with a machine.
In the same way that no end-users ever touch SQL these days (mostly), we won’t expose the chat-based UX of an LLM to users either.
There will be a place for an ad-hoc natural language interface to a machine, but I suspect it’ll be the exception rather than the rule.
I really don’t think there are too many end users who want to be forced to seduce a mercurial LLM using natural language to do their day-to-day tech tasks.