| HN Mirror

by lxgr 583 days ago

> These tools produce non-working code more often than not (OpenAI's flagship models are not even correct 50% of the time[1]), so you still have to read, understand and debug their output.

Definitely, but what LLMs provide me that a purely textual interface can't is discoverability.

A significant advantage of GUIs is that I get to see a list of things I can do, and the task becomes figuring out which ones are going to solve my problem. For programming languages, that's usually not the case (there's documentation, but it usually isn't as nested and context-sensitive as a GUI is), and LLMs are very good at bridging that gap.

So even if an LLM provides me a broken SQL query for a given task, more often than not it's exposed me to new keywords or concepts that did in fact end up solving my problem.

A hand-crafted GUI is definitely still superior to any chat-based interface (and I predict that's in fact the direction AI models will move in), but if nobody builds one, I'll take an LLM plus a CLI and/or documentation over only the latter any day.

1 comment

Terretta 582 days ago

> OpenAI's flagship models are not even correct 50% of the time[1]

Where does [1] go? In any case, try Anthropic's flagship:

91% > 50.6%

https://aider.chat/docs/leaderboards/#code-refactoring-leade...
