by jasonthorsness 362 days ago
The terminal really is sort of the perfect interface for an LLM; I wonder whether this approach will become favored over the custom IDE integrations.
4 comments

Exactly. It has access to literally everything including any MCP server. It's so awesome having claude code check my database using a read-only user, or have it open a puppeteer browser and check whether its CSS changes look weird or not. It's the perfect interface and anthropic nailed it.

It can even debug my k8s cluster using kubectl commands and check prometheus over the API, how awesome is this?
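A minimal sketch of the read-only idea, for anyone who wants to try it before pointing an agent at a real database. The comment above does not say which database is involved; this uses SQLite from the Python standard library as a stand-in, and the table and helper names (`users`, `make_demo_db`, `open_read_only`, `try_write`) are made up for illustration. With a server database the equivalent would be a role that has only SELECT rights.

```python
import os
import sqlite3
import tempfile
from pathlib import Path


def make_demo_db(path: str) -> None:
    """Create a tiny throwaway database (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")
    conn.commit()
    conn.close()


def open_read_only(path: str) -> sqlite3.Connection:
    """Open the database through a URI with mode=ro so writes are refused."""
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def try_write(conn: sqlite3.Connection) -> bool:
    """Return True if a destructive statement succeeded, False if it was refused."""
    try:
        conn.execute("DELETE FROM users")
        return True
    except sqlite3.OperationalError:
        return False


if __name__ == "__main__":
    db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
    make_demo_db(db_path)
    conn = open_read_only(db_path)
    print("rows visible:", conn.execute("SELECT COUNT(*) FROM users").fetchone()[0])
    print("write succeeded:", try_write(conn))
    conn.close()
```

The point is that the restriction lives in the connection itself, so whatever tool or agent holds it can query freely but cannot modify data.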

> or have it open a puppeteer browser and check whether its CSS changes look weird or not.

It's got 7 fingers? Looks fine to me! - AI

Me laughing as a human non-frontend dev having to do anything related to CSS

The number of times that my manager or coworkers have rejected proposals for technical solutions because I can't make a webpage look halfway decent is too damn high.

The one thing "AI" actually does well enough for me is writing CSS. It's actually the only thing I trust it with, because there is very little consequence to trusting the output when it writes CSS.

I have a designer on my team that adds their polish to the basic HTML and CSS I produce, but first I have to produce it. I really don't care what the front-end ends up looking like, that's for someone else to worry about. So I let the "AI" write the CSS for buttons and other UI elements, which it is good enough at to save me time. Then I hand it off to the designer and they finish the product, make the buttons match the rest of the buttons, fix the padding, whatever. It certainly has accelerated that part of my workflow, and it produces way better looking front-end UI styling than I would care to spend my time on. If I didn't have the designer, the AI-generated CSS would be good enough for most people. But, I wouldn't trust the AI to tell me if a page "looks weird". I have no doubt it would become a nuisance of false-positives, or just not reporting problems that actually exist.

Sort of, except I think the future of LLMs will be to have the LLM make 5 separate attempts at a fix in parallel, since LLM time is cheaper than human time... and once you introduce this aspect into the workflow, you'll want to spin up multiple containers, and the benefits of the terminal aren't as strong anymore.
I feel like the better approach would be to throw away PRs when they're bad, edit your prompt, and then let the agent try again using the new prompt. Throwing lots of wasted compute at a problem seems like a luxury take on coding agents, as these agents can be really expensive.

So the process becomes: Read PR -> Find fundamental issues -> Update prompt to guide agent better -> Re-run agent.

Then your job becomes proof-reading and editing specification documents for changes, reviewing the result of the agent trying to implement that spec, and then iterating on it until it is good enough. This comes from the belief that better, more expensive, agents will usually produce better code than 5 cheaper agents running in parallel with some LLM judge to choose between or combine their outputs.

Who or what will review the 5 PRs (including their updates to automated tests)? If it's just yet another agent, do we need 5 of these reviews for each PR too?

In the end, you either concede control over 'details' and just trust the output or you spend the effort and validate results manually. Not saying either is bad.

If you can define your problem well then you can write tests up front. An ML person would call tests a "verifier". Verifiers let you pump compute into finding solutions.
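A rough sketch of what "tests as a verifier" could look like, under some loud assumptions: the three `candidate_*` functions are hand-written stand-ins for separate LLM attempts, and the `slugify` task, the `verifier` cases, and `search` are all hypothetical. The only point is the shape of the loop: the tests exist first, and compute is spent until a candidate passes or the budget runs out.

```python
import re
from typing import Callable, List, Optional, Tuple

Candidate = Callable[[str], str]


def verifier(slugify: Candidate) -> bool:
    """Tests written up front: True only if every case passes."""
    cases = {
        "Hello World": "hello-world",
        "  Trim me ": "trim-me",
        "a--b": "a-b",
    }
    try:
        return all(slugify(text) == expected for text, expected in cases.items())
    except Exception:
        # A crashing candidate simply fails verification.
        return False


# Stand-ins for independent LLM attempts at the same task.
def candidate_a(s: str) -> str:
    return s.lower().replace(" ", "-")


def candidate_b(s: str) -> str:
    return "-".join(s.lower().split())


def candidate_c(s: str) -> str:
    return re.sub(r"[^a-z0-9]+", "-", s.lower()).strip("-")


CANDIDATES: List[Candidate] = [candidate_a, candidate_b, candidate_c]


def search(
    candidates: List[Candidate],
    verify: Callable[[Candidate], bool],
    budget: int,
) -> Tuple[Optional[Candidate], int]:
    """Try candidates in order until one verifies or the budget is spent."""
    tried = 0
    for cand in candidates[:budget]:
        tried += 1
        if verify(cand):
            return cand, tried
    return None, tried


if __name__ == "__main__":
    winner, attempts = search(CANDIDATES, verifier, budget=5)
    print(winner.__name__ if winner else None, "after", attempts, "attempts")
```

Note that the result is only as good as the test cases: anything the verifier does not check is unconstrained.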
I'm not sure we write good tests for this because we assume some kind of logic involved here. If you set a human to task to write a procedure to send a 'forgot password' email, I can be reasonably sure there's a limited number of things a human would do with the provided email address, because it takes time and effort to do more than you should.

However with an LLM I'm not so sure. So how will you write a test to validate this is done but also guarantee it doesn't add the email to a blacklist? A whitelist? A list of admin emails? Or the tens of other things you can do with an email within your system?
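One partial answer, sketched under assumptions: instead of asserting only that the email was sent, snapshot the whole system state before the call and assert that nothing except the outbox changed. `FakeSystem`, its fields, and both implementations below are hypothetical; a real system would need its real stores modeled, and this only catches side effects on state you thought to include.

```python
import copy
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple


@dataclass
class FakeSystem:
    """Hypothetical in-memory model of the places an email address could end up."""
    outbox: List[Tuple[str, str]] = field(default_factory=list)
    blacklist: Set[str] = field(default_factory=set)
    whitelist: Set[str] = field(default_factory=set)
    admin_emails: Set[str] = field(default_factory=set)


def send_forgot_password(system: FakeSystem, email: str) -> None:
    """Well-behaved implementation: queues one email and nothing else."""
    system.outbox.append(("forgot-password", email))


def sneaky_send_forgot_password(system: FakeSystem, email: str) -> None:
    """Sends the email but also does something nobody asked for."""
    system.outbox.append(("forgot-password", email))
    system.admin_emails.add(email)


def changed_fields(before: FakeSystem, after: FakeSystem) -> Set[str]:
    """Names of all fields whose value differs between two snapshots."""
    return {name for name in vars(before) if getattr(before, name) != getattr(after, name)}


def only_sends_email(impl: Callable[[FakeSystem, str], None]) -> bool:
    """True if impl queued exactly the expected email and touched no other state."""
    system = FakeSystem()
    before = copy.deepcopy(system)
    impl(system, "user@example.com")
    return (
        changed_fields(before, system) == {"outbox"}
        and system.outbox == [("forgot-password", "user@example.com")]
    )


if __name__ == "__main__":
    print("clean implementation passes:", only_sends_email(send_forgot_password))
    print("sneaky implementation passes:", only_sends_email(sneaky_send_forgot_password))
```

This turns "guarantee it doesn't do any of the tens of other things" into "guarantee nothing else in the modeled state moved", which is weaker but at least writable.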

Will people be willing to make their full time job writing tests?
We’ll just have an LLM write the tests.

Now we can work on our passion projects and everything will just be LLMs talking to LLMs.

I hope sarcasm.
They probably won't. But it doesn't matter. Ultimately, we'll all end up doing manual labor, because that is the only thing we can do that the machines aren't already doing better than us, or about to be doing better than us. Such is the natural order of things.

By manual labor I specifically mean the kind where you have to mix precision with power, on the fly, in arbitrary terrain, where each task is effectively one-off. So not even making things - everything made at scale will be done in automated factories/workshops. Think constructing and maintaining those factories, in the "crawling down tight pipes with screwdriver in your teeth" sense.

And that's only mid-term; robotics may be lagging behind AI now, but it will eventually catch up.

As well, just because it passes a test doesn't mean it doesn't do wonky, non-performant stuff. Or worse, cause side effects no one verified. As one example, the LLM output will quite often add new fields I didn't ask for.
Having command line tools to spin up multiple containers and then to collect their results seems like it would be a pretty natural fit.
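A small sketch of that fan-out-and-collect pattern, with assumptions stated up front: the child command here is just a Python one-liner standing in for a containerized agent run (in practice it would be whatever container invocation you use), and `run_attempt` and `run_all` are hypothetical names. Odd-numbered attempts are made to fail on purpose so there is something to filter.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def run_attempt(attempt_id: int) -> Dict[str, object]:
    """Run one isolated attempt as a child process and capture its result."""
    # Stand-in for a container run: prints a line and exits non-zero for odd ids.
    script = f"import sys; print('attempt {attempt_id}'); sys.exit({attempt_id} % 2)"
    proc = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,
        text=True,
        timeout=60,
    )
    return {
        "id": attempt_id,
        "ok": proc.returncode == 0,
        "output": proc.stdout.strip(),
    }


def run_all(n: int) -> List[Dict[str, object]]:
    """Launch n attempts in parallel and return results in attempt order."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        return list(pool.map(run_attempt, range(1, n + 1)))


if __name__ == "__main__":
    results = run_all(5)
    for result in results:
        status = "ok" if result["ok"] else "failed"
        print(result["id"], status, result["output"])
    print("passing attempts:", [r["id"] for r in results if r["ok"]])
```

Threads are enough here because each worker just waits on a child process; the exit code doubles as the pass/fail signal that a terminal workflow would check anyway.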
Why would spinning containers remove the benefits? Presumably there is a terminal too interacting with the containers.
Nah, if parallelism will help, it'll be abstracted away from the user.
Tmux?
as the models get better, IDEs will be seen as low level
Wait you write your code by hand??? ewww...
Aider's supported /voice for a while now.
voice is probably the worst human -> compute interface we have.
Human speech evolved with biological constraints and through neurological adaptions to emit and understand the nonlinear output that has lexically fuzzy areas to the untrained ear. So I think it's a rather "lossy" analog to digital conversion because the computer is simulating understanding of a form of information transfer that it itself is not constrained by (digital systems don't have vocal cords and could transmit anything).
You could say that about any form of human communication at all.
What??? It’s literally the worst interface

Do you not want to edit your code after it’s generated?

I run a terminal in one window for the AI interaction, with VS Code open on the same project directory, so I can see updated or new files via color coding and review them in the IDE.

How do you interact with your projects?

How is that better than running your AI interaction in a dedicated toolpane/subwindow directly inside your IDE?

The Chat panel in VS Code has seen a lot of polish, can display full HTML including formatting Markdown nicely, has some fancy displays for AI context such as file links, supports hyperlinks everywhere, and has fancy auto-complete popups for things like @ and # and / mentioned "tools"/"agents"/whatever. Other VS Code widgets can show up in the Chat panel, too. The Chat Panel you can dock in either sidebar and/or float as its own window.

A terminal can do most of those things too, with effort and with nothing quite like the native experience of your IDE and its widgets. It seems like a lesser experience than what VS Code already offers, other than you only have one real choice for AI assistant that supports VS Code's Chat panel (though you still have model choice).

I run aider in VSCode terminal so that I can fix smaller lint errors myself without another AI back-and-forth.
this is demonstrably worse than cursor
Sure, in VS Code. Or Xcode. Or IntelliJ/GoLand/RubyMine.
...if your IDE doesn't have a terminal then it isn't an IDE.
The "old wisdom" on comp.lang.perl.misc, when new people asked what the best IDE for Perl programming was, was "Unix".

You get both editors to choose from, vi _and_ emacs! All the man pages you could possibly want _and_ perldocs! Of _course_ as a Perl newbie you'll be able to fall back on gdb for complicated debugging where print statements no longer cut it.

I have a whole other screen for my terminal(s). The IDE already has enough going on in it.
Then you are not impeded from editing your code because it was written through a terminal process, which seems to be OP's contention.
why wouldn't you want the diffs in the IDE? It's richer and you can do more with them