by adam_patarino
192 days ago
Yeah, that matches my experience: LLMs are amazing “script interns” and shaky “systems engineers.” One trick that helps on bigger stuff: force it into a tight loop of (1) write a failing test for a single behavior, (2) implement the smallest change that makes it pass, (3) run the tests, (4) refactor. When you make the unit of work “one green test,” the model’s tendency to wander gets way less destructive.
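
To make the loop concrete, here is a rough sketch of what one "green test" unit of work might look like. The helper `parse_amount` and its behavior are made up for illustration; the point is only the size of the step, not this particular function.

```python
from decimal import Decimal


# Step 2: the smallest implementation that makes the test below pass.
# `parse_amount` is a hypothetical helper, not from any real codebase.
def parse_amount(text: str) -> Decimal:
    """Parse a string like '1,234.50' into a Decimal."""
    return Decimal(text.replace(",", ""))


# Step 1: the single behavior under test. In the loop this is written
# first and seen to fail before parse_amount exists.
def test_parse_amount_strips_thousands_separators():
    assert parse_amount("1,234.50") == Decimal("1234.50")


# Step 3: run the test. A runner such as pytest would discover it by name;
# it is called directly here so the snippet runs on its own.
test_parse_amount_strips_thousands_separators()
```

Step 4 (refactor) only happens once this is green, and the next prompt to the model is the next single failing test, not "now build the rest."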
Outside of work I've been running a pure vibe-coding experiment where I don't look at the code at all, ever. I'm using this same approach: I tell it that a specific scenario has to work in a certain way (the software relates to financial and tax planning).
The AI bot is very creative at creating a mess even with such tight guardrails. Many days into it I discovered that it had implemented four completely separate tax computation routines. All of them buggy in different ways. All of them addressed specific scenarios I had specified as part of the spec. But it never occurred to the bot to have a single centralized tax function! It is very good at satisfying specific scenarios I give, but absolutely terrible at any kind of system-wide planning.
(I'm using Cursor for this experiment)
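
For contrast, here is a rough sketch of the "single centralized tax function" shape the bot never arrived at on its own. Everything in it is an assumption for illustration: the bracket thresholds and rates are invented (not real tax rules), and `compute_tax`, `tax_on_salary`, and `tax_on_withdrawal` are hypothetical names. The idea is just that each scenario delegates to one routine instead of carrying its own copy of the bracket math.

```python
from decimal import Decimal

# Hypothetical brackets, for illustration only: (upper bound, marginal rate).
# None means "no upper bound". These are NOT real tax rates.
BRACKETS = [
    (Decimal("10000"), Decimal("0.10")),
    (Decimal("40000"), Decimal("0.20")),
    (None, Decimal("0.30")),
]


def compute_tax(taxable_income: Decimal) -> Decimal:
    """The one place where marginal-bracket tax is computed."""
    if taxable_income <= 0:
        return Decimal("0")
    tax = Decimal("0")
    lower = Decimal("0")
    for upper, rate in BRACKETS:
        if upper is None or taxable_income <= upper:
            # Income tops out inside this bracket: tax the remainder and stop.
            tax += (taxable_income - lower) * rate
            break
        # Income exceeds this bracket: tax the full bracket width and move on.
        tax += (upper - lower) * rate
        lower = upper
    return tax


# Scenario-specific code only prepares inputs, then delegates.
def tax_on_salary(salary: Decimal, deductions: Decimal) -> Decimal:
    return compute_tax(salary - deductions)


def tax_on_withdrawal(other_income: Decimal, withdrawal: Decimal) -> Decimal:
    # Incremental tax caused by stacking the withdrawal on top of other income.
    return compute_tax(other_income + withdrawal) - compute_tax(other_income)
```

With this shape a bracket bug gets fixed once, and a scenario test like "a 20,000 withdrawal on top of 30,000 of income" exercises the same code path as every other scenario.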