by amluto 135 days ago
This article lumps design and implementation together. In my experience, LLMs are really quite bad at designing anything interesting. They are sort of tolerable at implementation: they’re remarkably persistent (compared to humans, anyway), they will tirelessly use whatever framework, good or bad, you throw at them, and they will produce code that is quite painful to look at. And they’ll say that they’re architecting things and hope you’re impressed.
5 comments

In my experience, LLMs are bad at designing even repetitive boring things.
(Author here) I meant "design" as in designing physical objects — and all our "programming" is "design" in this definition, because "manufacturing" is done by compilers and bundlers.

And I wouldn't have written this article 3 months ago. Since then the quality of the output has jumped significantly: it is now possible to put the agent into a proper harness (plan/edit/review/test) and the output is good. If it's not, you discard it and try again, or point out a detail for the next cycle of improvements.

Yes, this requires a lot of forethought to set up, but it works.

I'm not talking only about "web things", I'm working on a project that involves engineering calculations and a lot of optimization of hot paths, both CPU and GPU.
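
To make the plan/edit/review/test cycle above concrete, here is a minimal Python sketch of such a harness. It is an illustration under assumptions, not the author's actual setup: the four callables are hypothetical stand-ins for whatever agent invocations you use, and `review`/`test` are assumed to return `None` on success or a complaint string on failure.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Attempt:
    plan: str
    patch: str


def run_harness(
    task: str,
    plan: Callable[[str, str], str],
    edit: Callable[[str], str],
    review: Callable[[str], Optional[str]],
    test: Callable[[str], Optional[str]],
    max_cycles: int = 3,
) -> Optional[Attempt]:
    """Run plan -> edit -> review -> test, discarding failed attempts.

    All four callables are hypothetical placeholders for agent calls.
    Returns the first attempt that passes review and test, or None if
    every cycle fails.
    """
    feedback = ""
    for _ in range(max_cycles):
        current_plan = plan(task, feedback)
        patch = edit(current_plan)
        complaint = review(patch)
        if complaint is None:
            complaint = test(patch)
        if complaint is None:
            return Attempt(current_plan, patch)
        # Discard the patch; only the complaint carries into the next cycle.
        feedback = complaint
    return None
```

The design choice worth noting is that a failed patch is thrown away whole and only the complaint is fed forward, which matches the "discard it and try again, or point out a detail" approach.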

>LLMs are really quite bad at designing anything interesting

Let’s be honest, how many devs are actually creating something interesting/unique at their work?

Most of the time, our job is just picking the right combination of well-known patterns to make the best possible trade-offs while fulfilling the requirements.

> Most of the time, our job is just picking the right combination of well-known patterns to make the best possible trade-offs while fulfilling the requirements.

Right. I don't trust LLMs to pick the right pattern. An LLM will pick _a_ pattern and it will mostly sorta fulfill the requirements.

Today I asked an LLM (Codex whatever-the-default-is) to implement something straightforward, and it cheerfully implemented it twice, right next to each other, in the same file, and then wrote the actual code that used it and open-coded a stupendously crappy implementation of the same thing right there. The amazing thing is that the whole mess kind of worked.
Right. Kind of works is their MO at the moment. I do try to keep in mind that just because something sucks at the moment, that doesn't mean that it will always suck (especially when you pour _trillions_ of dollars into it)
The problem is that OpenAI and Anthropic don’t care whether their tools produce good code, as long as people pay for them.
Just pick the patterns yourself and let the LLM fill them in with colours :)
(Author here) I found that over time I spend more time stripping someone's badly designed abstractions to get to the real functionality. LLMs are surprisingly good at figuring it out, plowing through the code and documentation and finding out that a 100MB library is in reality an HTTP client for 7 REST endpoints, or something like this.
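
As a rough illustration of what is left once such a library is stripped down, here is a minimal Python sketch of a thin client using only the standard library. The endpoint paths and method names (`/items`, `list_items`, `create_item`) are invented for the example and do not come from any real library; the transport is injectable so the client can be exercised without a network.

```python
import json
from typing import Any, Callable, Optional
from urllib.request import Request, urlopen

# A transport takes (method, url, body) and returns the raw response bytes.
Transport = Callable[[str, str, Optional[bytes]], bytes]


def urllib_transport(method: str, url: str, body: Optional[bytes]) -> bytes:
    """Default transport: one blocking HTTP request via urllib."""
    request = Request(
        url,
        data=body,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urlopen(request, timeout=10) as response:
        return response.read()


class ThinClient:
    """Hypothetical stand-in for a large SDK that wraps a few endpoints."""

    def __init__(self, base_url: str, transport: Transport = urllib_transport):
        self.base_url = base_url.rstrip("/")
        self.transport = transport

    def _call(self, method: str, path: str, payload: Any = None) -> Any:
        body = None if payload is None else json.dumps(payload).encode("utf-8")
        raw = self.transport(method, f"{self.base_url}{path}", body)
        # An empty response body is treated as "no content".
        return json.loads(raw) if raw else None

    # The endpoints below are made up for illustration.
    def list_items(self) -> Any:
        return self._call("GET", "/items")

    def create_item(self, name: str) -> Any:
        return self._call("POST", "/items", {"name": name})
```

Each remaining endpoint becomes a one-line method on top of `_call`, which is roughly the whole point.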
Unless you work for a consulting firm, you should be working on something new/unique.

It’s a winner-takes-all market. There are no buyers for off brand Salesforce or Uber.

That feels a bit rigid.

Many people are in a position where they can’t afford to risk their financial future by going all-in on a startup. They just want to do honest work in exchange for a paycheck and enjoy time with family after 5pm and on weekends.

There aren't? So Lyft and Bolt do not exist?

Same with Salesforce: there are a few hundred alternatives.

I was wondering if our goal is to leverage them to think about interfaces a bit, like a slightly accelerated modeling phase and then let them loose on the implementation (and maybe later let them loose on local optimization tricks)

> …a slightly accelerated modeling phase and then let them loose on the implementation…

If you mean _visual_ modeling à la UML [1], then I have it on "good authority" [2] that that's a sound approach…

_____

MODEL

The Verdict: If you provide a clear instruction like "Before you touch the code, read architecture.puml and ensure your changes do not violate the defined inheritance/dependency structure," the agent will be very effective at following it.

If you just "hope" it bears it in mind, it probably won't.

_The agent is a tool, not a mind-reader; it will take the shortest path to a passing test unless you wall that path off with your architectural models_.

To make it actually work, you need to turn the UML from a "suggestion" into a "blocker." You should add a section to your AGENTS.md (or CLAUDE.md) that looks like this:

    1. Tool Trigger: By using words like "…"

Why this works:

_____

[1] https://news.ycombinator.com/item?id=46974325

[2] https://g2ww.short.gy/TheMightyBooch
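
For what it's worth, the "blocker" idea quoted above could also be enforced by a script instead of relying on prompt wording alone. A minimal Python sketch, assuming architecture.puml lists the allowed dependencies as plain `A --> B` lines (real PlantUML has far more syntax than this handles) and that you already have some way to extract the observed dependencies from the code. The function names are made up for illustration.

```python
import re
from typing import Iterable, List, Set, Tuple

Edge = Tuple[str, str]

# Assumption: one allowed dependency per line, written as "A --> B".
ARROW = re.compile(r"^\s*(\w+)\s*-->\s*(\w+)\s*$")


def allowed_edges(puml_text: str) -> Set[Edge]:
    """Collect (source, target) pairs from simple 'A --> B' lines."""
    edges: Set[Edge] = set()
    for line in puml_text.splitlines():
        match = ARROW.match(line)
        if match:
            edges.add((match.group(1), match.group(2)))
    return edges


def violations(puml_text: str, observed: Iterable[Edge]) -> List[Edge]:
    """Return observed dependencies that the model does not allow, sorted."""
    allowed = allowed_edges(puml_text)
    return sorted(edge for edge in observed if edge not in allowed)
```

Run as a test or pre-commit step, a non-empty result from `violations` fails the build, so the shortest path to a passing test no longer goes around the model.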

UML or otherwise, a set of types/interfaces that represent the system and its properties. Thanks for the suggestion.
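
A small Python sketch of what "types/interfaces first, implementation later" might look like. The domain (sensor readings) and every name here are hypothetical, chosen only to show the shape: the `Protocol` is the part a human would pin down, and anything satisfying it could be left to the agent.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


@dataclass(frozen=True)
class Reading:
    sensor_id: str
    value: float


class ReadingStore(Protocol):
    """The interface fixed up front; implementations may vary freely."""

    def save(self, reading: Reading) -> None: ...

    def latest(self, sensor_id: str) -> Optional[Reading]: ...


class InMemoryStore:
    """One possible implementation; it satisfies ReadingStore structurally."""

    def __init__(self) -> None:
        self._latest: Dict[str, Reading] = {}

    def save(self, reading: Reading) -> None:
        self._latest[reading.sensor_id] = reading

    def latest(self, sensor_id: str) -> Optional[Reading]:
        return self._latest.get(sensor_id)


def is_over_limit(store: ReadingStore, sensor_id: str, limit: float) -> bool:
    """Depends only on the interface, not on any concrete store."""
    reading = store.latest(sensor_id)
    return reading is not None and reading.value > limit
```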
I was going to say - with agents the only part I actually have to do is design. Well, and testing. But they don’t really do design work so much as architecture selection if you provide a design.