Hacker News
by rspeele 25 days ago
Suppose the image resize service has some caching, and due to a bug in the caching, under certain circumstances it will respond with an already-cached resized version of a different source image.

Let's say for example it caches on something stupid like the CRC32 of the input image -- good enough that the couple dozen images in your test dataset don't collide, you don't see it in smoke testing your app, but real world data has collisions on a daily basis.
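To make the failure mode concrete, here is a minimal Python sketch of that kind of cache. Everything in it (`ResizeCache`, `fake_resize`, the two key functions) is hypothetical and made up for illustration, not taken from any real resize service. It brute-forces two different inputs with the same CRC32, then shows the CRC32-keyed cache handing back one input's thumbnail for the other. A SHA-256 key, assumed here as one possible fix, keeps them apart.

```python
import hashlib
import zlib


def fake_resize(source: bytes) -> bytes:
    # Stand-in for real resizing: tag the output so we can tell whose image it is.
    return b"thumb:" + source


def crc32_key(source: bytes) -> int:
    # The buggy choice: only 32 bits of key for arbitrarily many images.
    return zlib.crc32(source)


def sha256_key(source: bytes) -> str:
    # One possible fix (an assumption here): a cryptographic digest as the key.
    return hashlib.sha256(source).hexdigest()


class ResizeCache:
    """Toy in-memory cache; key_fn maps source bytes to a cache key."""

    def __init__(self, key_fn):
        self.key_fn = key_fn
        self.store = {}

    def get_or_resize(self, source: bytes) -> bytes:
        key = self.key_fn(source)
        if key not in self.store:
            self.store[key] = fake_resize(source)
        # On a key collision this returns a thumbnail of a *different* source.
        return self.store[key]


def find_crc32_collision(limit: int = 2_000_000):
    """Brute-force two distinct byte strings with the same CRC32."""
    seen = {}
    for i in range(limit):
        data = hashlib.sha256(str(i).encode()).digest()  # pseudo-random input
        crc = zlib.crc32(data)
        if crc in seen:
            return seen[crc], data
        seen[crc] = data
    raise RuntimeError("no collision found within limit")


if __name__ == "__main__":
    doc_b, doc_a = find_crc32_collision()
    buggy = ResizeCache(crc32_key)
    buggy.get_or_resize(doc_b)  # customer B uploads first
    print("leaked:", buggy.get_or_resize(doc_a) == fake_resize(doc_b))
```

With a couple dozen test images, both key functions behave identically, which is the point: the problem is visible in `crc32_key` when you read it, long before it shows up when you run it.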

This gets into production and customer A sees a resized version of customer B's document for a thumbnail. Now customer A is wondering how many other customers are seeing resized versions of their private documents in thumbnail images. They are very very mad.

If the image resize service was built by "another team" then that other team is responsible for the bug and will take most of the heat for it. If it was built by an "agent swarm" or "gas town" or whatever under my direction then I'm 100% responsible for it and rightly deserve the heat.

That is why I cannot understand any approach that doesn't involve reading the code at all. Testing alone is not sufficient. MTTR is not sufficient because you can't make a customer less mad about a data privacy bug by fixing it.


Practically, this is just about confidence values, anticipated blast radius and balancing testing vs review overhead.

I found this sort of odd. What is your point? Is it good or bad that another team was responsible in one scenario?

Two points.

1. You can treat software like a black box when other people developed it for you because they can stand behind it. They have their own reputations to uphold. You can't when AI developed it for you because YOU are responsible for 100% of the bugs in it. If you take this trendy stance of "I never read or write code, just specs", you are just rolling the dice on what you stamp your name on.

2. Just because you have unit tests and you've tested the software by clicking through the app doesn't mean you've found every bug. There have always been bug types, like the example checksum collision, that are easier to detect by reading the code than by running the code because it will work most of the time even though the approach is wrong.
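The "works most of the time" part is just birthday-bound arithmetic. As a rough sketch, assuming the cache key behaves like a uniformly random 32-bit value (an idealization, and `collision_probability` is a name made up here):

```python
import math


def collision_probability(n_items: int, key_bits: int = 32) -> float:
    """Approximate chance that at least two of n_items share a key.

    Uses the birthday approximation 1 - exp(-pairs / keyspace) and assumes
    keys are uniform and independent, which real checksums only approximate.
    """
    pairs = n_items * (n_items - 1) / 2
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-pairs / 2 ** key_bits)


if __name__ == "__main__":
    for n in (24, 1_000, 100_000):
        print(n, collision_probability(n))
```

Under that assumption, a couple dozen test images collide with a probability of well under one in a million, while a hundred thousand images are more likely than not to contain at least one colliding pair. No amount of clicking through the app on a small dataset will surface that.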

But I'm already responsible for the bugs in my software. Also, who cares if someone else is responsible? And how does that align with OSS's "no warranty provided"?

> There have always been bug types, like the example checksum collision, that are easier to detect by reading the code than by running the code

AI seems radically, insanely more qualified to not write bugs like that. I doubt that if you polled developers 99% would be able to tell you what a CRC32 even is, let alone why it's insufficient as a cache key.

> But I'm already responsible for the bugs in my software. Also, who cares if someone else is responsible? And how does that align with OSS's "no warranty provided"?

The original example from Simon Willison referred not to pulling in a 3rd party library, but working "at larger organizations" where "another team hands over something". In other words, we are all working on the same product for the same company, they have been assigned another part of it and I'm expected to use their code.

In that scenario of course I care that someone else is responsible! It may affect whether I get fired or not!

It's different if you're a solo founder of a startup and for everything you ship, the buck stops with you. But proportionally many many more devs are in a situation where they are a cog in a machine.

> AI seems radically, insanely more qualified to not write bugs like that. I doubt that if you polled developers 99% would be able to tell you what a CRC32 even is, let alone why it's insufficient as a cache key.

I actually do agree that AI generally writes pretty good code. Doesn't mean I'm not gonna check. Sometimes it is too clever for its own good, such as re-implementing from scratch something that already exists and is well-proven.

The whole example is kind of contrived in the first place (how many environments don't have an excellent "image resizing" solution to reach for off the shelf?), so I hope you don't mind my bug example is also contrived.