HN Mirror



	by raw_anon_1111 99 days ago
	Absolutely no one is arguing that you shouldn’t have a combination of manual and automated tests around either AI- or human-generated code, or that you shouldn’t have a thoughtful design.


sarchertech 98 days ago

In a non-trivial app you can't test your way through all of the e2e workflows, and thoughtful design isn't what I'm talking about.

How many bugs have you seen that passed your automated and manual testing? Probably 99.9% of them.

Now imagine that you take those same test suites and you unleash an agent on the code that has far worse reasoning capabilities than a human, and you tell it that it can change anything in the code as long as the tests pass.


raw_anon_1111 98 days ago

So if bugs pass through testing, which they have forever, wouldn’t that imply that humans are just as fallible as AI - and slower?

I never suggested letting agents code for a day on end. I use AI to code well-defined tasks and treat it like a mid-level ticket taker.


sarchertech 98 days ago

If you have an employee who codes 2x faster than everyone else but produces 10x the bugs, would your suggestion be to let him rip and stop reviewing his code output?

> I never suggested letting agents code for a day on end. I use AI to code well-defined tasks and treat it like a mid-level ticket taker

It doesn’t matter how long you’re letting it run. If you aren’t reviewing the output, you have no way of knowing when it changes untested behavior.

I regularly find Claude doing insane things that I never would have thought to test against, that would have made it into prod if I hadn’t reviewed the code.


raw_anon_1111 98 days ago

> It doesn’t matter how long you’re letting it run. If you aren’t reviewing the output, you have no way of knowing when it changes untested behavior.

You’re focused on the output; I’m focused on the behavior. That’s the difference. It’s just like when I delegate a task to another developer or another company, like the random Salesforce integration, or even a third-party API I need to integrate with.


sarchertech 98 days ago

Unfortunately you are not equipped to observe and test all or even most of the behavior of a non-trivial system.

And if you attempt to treat every module in your system like it’s untrusted third-party code, you’ll run into severe complexity and size limits. No one codes large systems like that because it’s not possible. There are always escape hatches and entanglements.


raw_anon_1111 98 days ago

Actually, a little company you might have heard of called Amazon does…

Jeff Bezos mandated it in 2002.

https://konghq.com/blog/enterprise/api-mandate

AWS S3 by itself is made up of 200+ microservices.
