| HN Mirror


by xsh6942 146 days ago

It really depends on what you mean by "it works". A retrospective of the last 6 months.

I've had great success coding infra (terraform). It sped up, by at least 10x, the generation of code that is easy to verify and tedious to write. Results were audited to death as the client was highly regulated.

Professional feature dev is hit and miss for sure, although getting better and better. We're nowhere near full agentic coding. However, by reinvesting the speed gains from not writing boilerplate into devex and tests/security, I bring to life much better quality software, maintainable and a joy to work with.

I suddenly have the homelab of my dreams, all the ideas previously in the "too long to execute" category now get vibe coded while watching TV or doing other stuff.

As an old jaded engineer, everything code was getting a bit boring and repetitive (so many rest APIs). I guess you get the most value out of it when you know exactly what you want.

Most importantly though, and I've heard this from a few other seniors: I've found joy in making cool fun things with tech again. I like that new way of creating stuff at the speed of thought, and I guess for me that counts as "it works"

6 comments

raphaelj 146 days ago

Same experience here.

On some tasks like build scripts, infra and CI stuff, I am getting a significant speedup. Maybe I am 2x faster on these tasks, when measured from start to PR.

I am working on an HPC project[1] that requires more careful architectural thinking. Trying to let the LLM do the whole task most often fails, or produces low quality code (even with top models like Opus 4.5).

What works well though is "assisted" coding. I am usually writing the interface code (e.g. headers in C++) with some help from the agent, and then let the LLM do the actual implementation of these functions/methods. Then I do final adjustments. Writing a good AGENTS.md helps a lot. I might be 30% faster on these tasks.
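A minimal sketch of that interface-first split, written in Python rather than C++ headers to keep it short. `TaskQueue`, `InMemoryTaskQueue` and `drain` are hypothetical names for illustration and are not taken from the linked project: the human writes the protocol and its docstrings, and the agent is asked to produce a class that satisfies it.

```python
from collections import deque
from typing import List, Optional, Protocol


class TaskQueue(Protocol):
    """Human-written interface: the contract the agent has to implement."""

    def push(self, task: str) -> None:
        """Append a task to the back of the queue."""
        ...

    def pop(self) -> Optional[str]:
        """Remove and return the oldest task, or None if the queue is empty."""
        ...

    def __len__(self) -> int:
        """Return the number of pending tasks."""
        ...


class InMemoryTaskQueue:
    """The kind of implementation an agent might generate from the protocol."""

    def __init__(self) -> None:
        self._tasks = deque()

    def push(self, task: str) -> None:
        self._tasks.append(task)

    def pop(self) -> Optional[str]:
        # popleft() gives FIFO order; guard against the empty case explicitly.
        return self._tasks.popleft() if self._tasks else None

    def __len__(self) -> int:
        return len(self._tasks)


def drain(queue: TaskQueue) -> List[str]:
    """Caller code depends only on the interface, not on the implementation."""
    done = []
    while len(queue) > 0:
        done.append(queue.pop())
    return done
```

Because `drain` is typed against the protocol only, the generated class can be reviewed, adjusted or replaced without touching the callers.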

It seems to match what I see from the PRs I am reviewing: we are getting these slightly more often than before.

---

[1] https://github.com/finos/opengris-scaler


BrandoElFollito 146 days ago

> I guess you get the most value out of it when you know exactly what you want.

Oh yes. I have been an amateur developer for 35 years, and when I vibe code I let the basic, generic stuff happen and then tell the AI to refactor the way I want. It usually works.

I had the same "too boring to code" approach and AI was a revelation. It takes off the typing but allows, when used correctly, for the creative part. I love this.


spopejoy 146 days ago

The OP question was about agentic utility specifically. I've also gotten great side-project utility from AI codegen without having to marry my project to CC or give up on looking at code by simply prompting when I need something from whatever LLM.

Nothing wrong with CC, but I keep hearing the same kind of app being built -- home automation, side-project CRUD.

What I'm deeply skeptical of is the ability for agentic to integrate with a team maintaining+shipping a critical offering. If you're using LLMs for one-off PRs, great but then agentic seems like a band aid for memory etc.

Meanwhile if you're full CC/agentic it seems like a team would get out of sync.


theshrike79 146 days ago

> I suddenly have the homelab of my dreams, all the ideas previously in the "too long to execute" category now get vibe coded while watching TV or doing other stuff.

This is the true game changer.

I have a large-ish NAS that's not very well organised (I'm trying, it's a consolidated mess of different sources from two decades - at least they're all in the same place now)

It was faster to ask Claude to write me a search database backend + frontend than to click through the directories and wait for the slow SMB shares to update, just to find that one file I knew was in there.

Now I have a Go backend that crawls my NAS every night, indexes files to a FTS5 sqlite database with minimal metadata (size + mimetype + mtime/ctime) and a simple web frontend I can use to query the database
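A rough sketch of what such an index could look like, in Python with the standard-library `sqlite3` module instead of Go. The table layout, the function names `build_index` and `search`, and the choice to index only the path are assumptions for illustration, not the commenter's actual schema. It also assumes the local SQLite build has FTS5 compiled in.

```python
import mimetypes
import os
import sqlite3


def build_index(db: sqlite3.Connection, root: str) -> int:
    """Walk `root` and rebuild a full-text index of file paths."""
    db.execute("DROP TABLE IF EXISTS files")
    # Only `path` is searchable; the metadata columns are stored but not indexed.
    db.execute(
        "CREATE VIRTUAL TABLE files USING fts5("
        "path, size UNINDEXED, mimetype UNINDEXED, mtime UNINDEXED)"
    )
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.stat(full)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            mime = mimetypes.guess_type(full)[0] or "application/octet-stream"
            db.execute(
                "INSERT INTO files (path, size, mimetype, mtime) "
                "VALUES (?, ?, ?, ?)",
                (full, st.st_size, mime, st.st_mtime),
            )
            count += 1
    db.commit()
    return count


def search(db: sqlite3.Connection, text: str, limit: int = 20) -> list:
    """Return (path, size, mimetype) rows whose path contains the given words."""
    # Quote the input as an FTS5 phrase so characters like '.' or '-'
    # are not parsed as query syntax.
    phrase = '"' + text.replace('"', '""') + '"'
    rows = db.execute(
        "SELECT path, size, mimetype FROM files WHERE files MATCH ? "
        "ORDER BY rank LIMIT ?",
        (phrase, limit),
    )
    return rows.fetchall()
```

A nightly job would call `build_index(sqlite3.connect("index.db"), "/path/to/share")`, and both a web frontend and a CLI could then share `search`, since they only need the database file.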

...actually I kinda want a cli search tool that uses the same schema. Brb.

Done.

AI might be a bubble etc. but I'll still have that search tool (and two dozen other utilities) in 5 years when the Claude monthly subscription is 2000€ and a right to harvest your organs on non-payment.


martinosis 144 days ago

This is exactly where LLMs shine, but when you get to a larger project, for me everything falls apart, since most of the time the application gets way too complex because the LLM tries to guess what you want. This is OK for small projects but quite bad for larger ones.


theshrike79 144 days ago

Depends on so many things. Like the definition of “large” and what you’re asking the LLM to do and how the project is set up for LLM use.

It doesn’t need to guess if it has the tools and documentation available.


donw 146 days ago

Same here. You have to slice things small enough for the agent to execute effectively, but beyond that, it’s magic.


hahahahhaah 142 days ago

Terraform is a great use case:

* Unrefactorable and highly boilerplatey

* Probably too big a job and low impact to rewrite as IaC

* AI can do all that tedious plumbing well

* Since the result is a deployment, not executable code, it suffices to check that the correct resources are created.
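One way to make that last check mechanical, sketched in Python under some assumptions: the plan has been exported with `terraform plan -out=plan.out` followed by `terraform show -json plan.out`, and the expected resource counts are maintained by hand. The helper names `planned_creates` and `check_plan` are hypothetical.

```python
from collections import Counter


def planned_creates(plan: dict) -> Counter:
    """Count resources the plan would create, keyed by resource type.

    Replacements (actions containing both "delete" and "create") are
    counted as creates too.
    """
    counts = Counter()
    for rc in plan.get("resource_changes", []):
        if "create" in rc.get("change", {}).get("actions", []):
            counts[rc["type"]] += 1
    return counts


def check_plan(plan: dict, expected: dict) -> list:
    """Return readable mismatches between the plan and the expected counts."""
    actual = planned_creates(plan)
    problems = []
    for rtype in sorted(set(actual) | set(expected)):
        want, got = expected.get(rtype, 0), actual.get(rtype, 0)
        if want != got:
            problems.append(f"{rtype}: expected {want}, plan creates {got}")
    return problems


# Hand-written sample in the shape of the plan JSON, for illustration only.
sample_plan = {
    "resource_changes": [
        {"type": "aws_s3_bucket", "name": "logs",
         "change": {"actions": ["create"]}},
        {"type": "aws_s3_bucket", "name": "assets",
         "change": {"actions": ["create"]}},
        {"type": "aws_iam_role", "name": "ci",
         "change": {"actions": ["no-op"]}},
    ]
}

problems = check_plan(sample_plan, {"aws_s3_bucket": 2, "aws_iam_role": 1})
```

Here `problems` contains one entry, because the sample plan leaves the IAM role untouched while the expectation asks for one to be created. In real use the dict would come from `json.load` on the exported plan file.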


andy_ppp 146 days ago

I honestly find AI quite poor at writing good, well-thought-through tests, potentially because:

1. writing testable code is part of writing good tests

2. testing is actually poorly done in all the training data because humans are also bad at writing tests

3. tests should be more focused around business logic and describing the application than arbitrarily testing things in an uncanny valley of AI slop


theshrike79 146 days ago

When Vibe coding/engineering I don't think of tests in the same way as when testing human written code.

I use unit tests to "lock down" current behavior so an agent rummaging around feature F doesn't break features A and B and will get immediate feedback if that happens.

I'm not trying to match every edge case, but focus more on end to end tests where input and output are locked golden files. "If this comes in, this exact thing must come out the other end." type of thing.

The AI can figure out what went wrong if the tests fail.
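A small sketch of that golden-file style in Python, assuming a hypothetical function under test called `render_report`. The helper `check_golden` and its `update` flag are illustrative names, not from any particular test framework.

```python
import difflib
from pathlib import Path


def render_report(records: list) -> str:
    """Hypothetical function under test: one 'name: total' line per record."""
    lines = [f"{r['name']}: {r['total']}" for r in sorted(records, key=lambda r: r["name"])]
    return "\n".join(lines) + "\n"


def check_golden(actual: str, golden_path: Path, update: bool = False) -> None:
    """Compare `actual` with the locked golden file.

    With update=True the golden file is (re)written instead of compared;
    that should be a deliberate, reviewed step, not something an agent
    does to make a failing test pass.
    """
    if update:
        golden_path.write_text(actual, encoding="utf-8")
        return
    expected = golden_path.read_text(encoding="utf-8")
    if actual != expected:
        diff = "\n".join(
            difflib.unified_diff(
                expected.splitlines(),
                actual.splitlines(),
                fromfile="golden",
                tofile="actual",
                lineterm="",
            )
        )
        raise AssertionError(f"output differs from {golden_path}:\n{diff}")
```

The failure message carries a unified diff, which gives an agent (or a human) the exact lines that changed rather than just a pass/fail signal.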


andy_ppp 146 days ago

Yeah, I need to start accepting that to some degree the world has changed - in the past when I wanted to understand a system I'd have read the tests, but with AI I can just ask cursor to explain what the code is doing and it's fairly good at explaining the functionality to me.

I'm not sure I feel truly comfortable yet with huge blocks of code that are not cleanly understood by humans but it's happening whether I like it or not.
