Hacker News
by haar 320 days ago
I've had little success with Agentic coding, and what success I have had has been paired with hours of frustration, where I'd have been better off doing it myself for anything but the most basic tasks.

Even then, when you start to build up complexity within a codebase - the results have often been worse than "I'll start generating it all from scratch again, and include this as an addition to the initial longtail specification prompt as well", and even then... it's been a crapshoot.

I _want_ to like it. The times where it initially "just worked" felt magical and inspired me with the possibilities. That's what prompted me to get more engaged and use it more. The reality of doing so is just frustration, and wishing things _actually worked_ anywhere close to expectations.


Bingo, it's magical but the learning curve is very very steep. The METR study on open-source productivity alluded to this a bit.

I am definitely at a point where I am more productive with it, but it took a bunch of effort.

Apologies if I was unclear.

The more I've used it, the more I've disliked how poor its results are, and the more I've realised I would have been better served by doing it myself and following a methodical path for things that I didn't have experience with.

It's easier to step through a problem as I'm learning and making small changes than an LLM going "It's done, and production ready!" where it just straight up doesn't work for 101 different tiny reasons.

My preferred approach to avoid that outcome is to divide & conquer the problem. Ask the LLM to implement each small bit in the order you'd implement it yourself given what you know about the codebase.
The subjects in the study you are referencing also believed that they were more productive with it. What metrics do you have to convince yourself you aren't under the same illusionary bias they were?
Yesterday I used ffmpeg to extract the frame at the 13 second mark of a video out as a JPEG.

If I didn't have an LLM to figure that out for me I wouldn't have done it at all.

LLMs still give subpar results with ffmpeg. For example when I asked Sonnet to trim a long video with ffmpeg, it put the input file parameter before the start time parameter, which triggers an unnecessary decode of the video file. [1]

Sure, use the LLM to get over the initial hump. But ffmpeg's no exception to the rule that LLMs produce subpar code. It's worth spending a couple minutes reading the docs to understand what it did so you can do it better, and unassisted, next time.

[1] https://ffmpeg.org/ffmpeg.html#:~:text=ss%20position
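The option ordering described in this comment can be sketched as two small shell functions. This is only an illustration, not the command Sonnet actually produced: the function names, the `FFMPEG` override variable, and the use of `-t` for the clip length are assumptions made for the example. Per the linked documentation, `-ss` placed before `-i` seeks in the input file, while `-ss` placed after `-i` makes ffmpeg decode and discard input until it reaches the position.

```shell
#!/bin/sh
# Sketch only. FFMPEG is a hypothetical override so the command line can be
# inspected (e.g. FFMPEG=echo) without actually running ffmpeg.

# trim_clip START DURATION INPUT OUTPUT
# -ss before -i is an input option: ffmpeg seeks in the input file to START.
trim_clip() {
    "${FFMPEG:-ffmpeg}" -ss "$1" -i "$3" -t "$2" "$4"
}

# trim_clip_slow START DURATION INPUT OUTPUT
# -ss after -i is an output option: ffmpeg decodes but discards input
# until the timestamps reach START, which is the ordering criticised above.
trim_clip_slow() {
    "${FFMPEG:-ffmpeg}" -i "$3" -ss "$1" -t "$2" "$4"
}
```

Something like `trim_clip 00:10:00 30 long.mp4 clip.mp4` would then cut a 30 second clip starting at the ten minute mark; the file names here are placeholders.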

That says more about suboptimal design on ffmpeg's part than it does about the LLM. Most humans can't deal with ffmpeg command lines, so it's not surprising that the LLM misses a few tricks.
Had an LLM generate 3 lines of working C++ code that was "only" one order of magnitude slower than what I edited the code to in 10 minutes.

If you're happy with results like that, sure, LLMs miss "a few tricks"...

It is nice to use LLMs to generate ffmpeg commands, because those can be pretty tricky, but really, you wouldn't have just used the man page before?

That explains a lot about Django that the author is allergic to man pages lol

I remember when I was a kid, people asking a teacher how to spell a word, and the answer was generally "look it up in a dictionary"… which you can only do if you already have a shortlist of possible spellings.

*nix man pages are the same: if you already know which tool can solve your problem, they're easy to use. But you have to already have a shortlist of tools that can solve your problem, before you even know which man pages to read.

That’s what GNU info is for, of course.
man -k (or apropos)
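For anyone who hasn't used it, the keyword search mentioned above looks roughly like this. The wrapper name `find_tools` and the `MAN` override variable are made up for the example; `man -k` and `apropos` are equivalent, and both search the names and one-line descriptions of the installed man pages.

```shell
#!/bin/sh
# Sketch only. MAN is a hypothetical override so the call can be dry-run
# (e.g. MAN=echo) on a machine without man pages installed.

# find_tools KEYWORD
# Lists man pages whose name or short description matches KEYWORD.
find_tools() {
    "${MAN:-man}" -k "$1"
}
```

Running `find_tools video` only helps if the right tool is installed and its description happens to contain the keyword you guessed, so it narrows the shortlist problem rather than removing it.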
I just took a look, and the man page DOES explain how to do that!

... on line 3,218: https://gist.github.com/simonw/6fc05ea7392c5fb8a5621d65e0ed0...

(I am very confident I am not the only person who has been deterred by ffmpeg's legendarily complex command-line interface. I feel no shame about this at all.)

To be a little more fair... that example is tidily slotted into the EXAMPLES section, under the heading "You can extract images from a video, or create a video from many images".

I don't think most people read the man pages top to bottom. And even if they did, then for as much grief as you're giving ffmpeg, llm has an even larger burden... no man page and the docs weigh in at over 8k lines.

I get the general point that ffmpeg is a powerful, complex tool... but this is a weird fight to pick.

Ffmpeg is genuinely complicated! And the CLI is convoluted (in justifiable, and unfortunate ways).

But if you approach ffmpeg from the perspective of "I know this is possible", you are always correct, and can almost always reach the "how" in a handful of minutes.

Whether that's worth it or not, will vary. :)

The correct solution here would have been to feed the man page to an LLM summarizer.

Alas instead of correct and easy solutions to problems we are focused on sci-fi robot assistant bullshit.

You wouldn't have just typed "extract frame at timestamp as jpeg ffmpeg" into Google and used the StackExchange result that comes up first that gives you a command to do exactly that?
Before LLMs made ffmpeg no-longer-frustrating-to-use I genuinely didn't know that ffmpeg COULD do things like that.
I'm not really sure what you're saying an LLM did in this case. Inspired a lost sense of curiosity?
Was the answer:

ffmpeg -ss 00:00:13 -i myvideo.avi -frames:v 1 myimage.jpeg

Because this is on stack overflow and it took maybe one second to find.

I've found reading the man page for a tool is usually a better way to learn what a tool can do for you now and also in the future.

This is the rub for me… people are so quick to forget the original source for a lot of the data these models were trained on, and how easy and useful these platforms were. Now Google will summarize this question for you in an AI overview before you even land on Stack Overflow. It’s killing the network effect of the open web and destroying our crowd sourced platforms in favor of a lossy compression algorithm that will eventually be regurgitating its own entrails.
Well, maybe. People will just stop using them and will make fun of people who do. You can only bullshit people for so long.