Hacker News
by gerdesj 19 days ago
LLMs are an additional tool to add to your arsenal. They are not omnipotent and need care, just like any other tool.

My best effort, so far, at an analogy is a modern drill driver compared to a screw driver/brace and bit/etc:

You can get some remarkable results in a very short time compared to the "old school" gear.

You can get some "amazing" anecdotes, e.g. "I screwed down an entire floor at 16" x 1" c/c within an hour instead of an entire day, and I took loads of fag breaks" (I could have used a nail gun instead and finished in half the time, but then I'd never raise that floor easily in the future, and it would probably have cost twice as much).

I have several on-prem LLMs and access to the rest, and I'm pretty sure I'll be extending my analogy to ... brand, eventually.

What I do not expect to be doing is looking for a new job. A drill driver is not a carpenter/site labourer/useful without a person!


But a modern drill absolutely 100% removes the need for a brace and bit. An LLM doesn't replace any existing tools.
I think we will see very limited human displacement - it'll be in narrow places where it makes sense. Much of it will just be augmentation.
From what I've heard, for many devs it replaced an IDE... I still use one myself, but I've heard a lot of people don't anymore.
Basically IDE free since May 2025. I actually reinstalled vscode when setting up a new machine and I think I've launched it twice?

cc -> local automated testing -> github -> PR -> heavy integration tests -> review (github ui, +/-) -> manual test locally -> merge -> deploy -> manual test remotely -> synthetic user testing -> repeat

But what about navigating the code by the call stack? I didn't know that GitHub has a way to do that. Or maybe I'm coming across as dumb for still trying to keep a mental model of what calls what.
Personally, I use a debug agent for that.

I've never used breakpoint debugging, was always a printf debugger. And now an agent can do that loop for me.

Prompt is usually something along the lines of:

>I would expect the behavior of this to be [X] - instead I'm observing [Y]

And the agent will form hypotheses, place printf statements, compile, and scrape logs in a loop, with each pass ruling out a hypothesis or narrowing down which portion of the code is responsible for the unexpected behavior.

It has been able to pin-point the exact line(s) of code responsible every time I've reached for it so far.
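
To make the loop above concrete, here is a rough Python sketch of the idea, not the agent's actual implementation. The three-stage pipeline, the stage names, and the expected values are all made up for illustration; the point is only that logging every intermediate value and comparing it with what you expected narrows the fault down to one stage.

```python
from typing import Callable, List, Optional, Tuple

Stage = Tuple[str, Callable[[int], int]]


def trace_pipeline(stages: List[Stage], value: int) -> List[Tuple[str, int]]:
    """Run each stage in order, emitting a printf-style line per intermediate value."""
    log = []
    for name, fn in stages:
        value = fn(value)
        log.append((name, value))
        print(f"[debug] after {name}: {value}")
    return log


def first_divergence(log: List[Tuple[str, int]], expected: List[int]) -> Optional[str]:
    """Return the first stage whose logged value differs from the expected one."""
    for (name, actual), want in zip(log, expected):
        if actual != want:
            return name
    return None


# Hypothetical pipeline with a deliberate bug in "scale" (meant to be x * 2).
stages = [
    ("parse", lambda x: x + 1),
    ("scale", lambda x: x * 3),
    ("clamp", lambda x: min(x, 100)),
]

log = trace_pipeline(stages, 4)
culprit = first_divergence(log, expected=[5, 10, 10])
print(f"first unexpected value appears after: {culprit}")
```

Stages before the culprit are ruled out, which is the "narrowing down" step; a real agent would repeat this with finer-grained prints inside the suspect stage.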

For what it's worth, generally speaking I read all of the code and keep it in my brain - I have some uncommon assets in that regard like a high reading speed and great memory. `git grep` is the other tool I use often.

I rarely find that the call stack is the limiting factor, to me, and I suppose I do something similar to what you're talking about but just in my head - I know where a file is referenced via imports, what a function does, and what the flow of control is like.

Do you not need to use the debugger sometimes? Or can cc debug by itself?
> Do you not need to use the debugger sometimes? Or can cc debug by itself?

A key feature of AI coding assistants and coding agents is troubleshooting. LLMs excel at pattern matching, especially when coupled with feedback signals, and troubleshooting is largely that. A few years ago people searched the likes of Stack Overflow to fix problems; LLMs can do the equivalent much faster.

Coding agents can use debuggers if they need to.

From what I've seen they're more likely to run a `python -c "import your_code; your_code.do_stuff()"` experiment to figure out what's going on though.
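
For anyone who hasn't seen that pattern, here is a minimal sketch of the same kind of one-off experiment, driven from Python so the output can be captured. The standard-library `json` module stands in for the placeholder `your_code`, and the snippet itself is just an illustrative assumption about what such a probe might look like.

```python
import subprocess
import sys

# One-off experiment in a fresh interpreter, the same shape as `python -c "..."`.
# `json` is a stand-in for the hypothetical `your_code` module.
snippet = "import json; print(json.dumps({'b': 1, 'a': 2}, sort_keys=True))"

result = subprocess.run(
    [sys.executable, "-c", snippet],  # sys.executable is the running interpreter
    capture_output=True,
    text=True,
    check=True,  # raise CalledProcessError if the experiment itself crashes
)
output = result.stdout.strip()
print(output)
```

The appeal is that each probe runs in a clean process, so nothing left over from a previous experiment can skew what you observe.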

I have not used a debugger in anger in perhaps a decade. I write tests, and if that's not enough, I write more tests.

Tests stick around and prevent future problems, whereas the debugger only shows me something once.
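
As a small illustration of what "tests stick around" can look like, here is a sketch using the standard `unittest` module. `parse_port` and its behaviour are hypothetical, invented for this example; the idea is that each bug you once chased gets pinned by a test that keeps running on every later change.

```python
import unittest


def parse_port(value: str) -> int:
    """Hypothetical helper: parse a TCP port and reject out-of-range values."""
    port = int(value.strip())
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port


class ParsePortRegressionTest(unittest.TestCase):
    def test_strips_surrounding_whitespace(self):
        # Pins the behaviour for input that arrives with a trailing newline.
        self.assertEqual(parse_port(" 8080\n"), 8080)

    def test_rejects_out_of_range(self):
        with self.assertRaises(ValueError):
            parse_port("70000")


# Run the suite programmatically so the result object can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ParsePortRegressionTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

A debugger session would have shown the same facts once; the test records them so a later regression fails loudly.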

But tests show you if a bug is happening, they don’t help you understand the underlying cause of the bug. In a decade, you haven’t hit a compiler codegen issue, a silicon erratum, a race condition, or anything that required actually spending effort understanding the causal path?
I've heard that. I hope the people who are IDE-free are just better AI-wranglers than I am. In my experience, if I can get an agent to one-shot something, it's fine, but if I can't, the agents tend to make an absolute mess of spaghetti that doesn't actually do what it was asked to do.
There are times when you quite literally need to feel the thing you are drilling into, and a brace and bit gives you feedback that a drill driver won't.

Pretty sure I once saw a surgeon using a brace and bit for something unpleasant. I think it was brain/skull related.