

	by lrem 885 days ago
	Does anyone serious do this? That’s an honest question, from a pretty experienced SRE.


darkwater 885 days ago

In a world of unicorns and rainbows, absolutely. In the real world, it's as you probably already know: it's not that easy in a complex enough system.

Quick counter-example for GP: what if the 500 spike is due to a spike in malformed requests from a single (maybe malicious) user?


laeri 885 days ago

A malformed request should not lead to a 500; it should be validated and handled.


darkwater 885 days ago

Well, in the real world it might. It should trigger a bug creation and a fix to the code, but not an incident. Now all of a sudden to decide this you need more complex and/or specific queries in your monitoring system (or a good ML-based alert system), so complexity is already going up.
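
To make the extra query complexity concrete, here is a minimal sketch of such a check. It assumes the monitoring system can hand over one time window of (client, status) pairs; the function name, both thresholds, and the three labels are hypothetical and not taken from any particular tool.

```python
from collections import Counter


def classify_5xx_spike(requests, error_rate_threshold=0.05, single_client_share=0.9):
    """Classify one time window of (client_id, status_code) pairs.

    Both thresholds are illustrative assumptions, not recommended values.
    """
    requests = list(requests)
    if not requests:
        return "ok"

    # Collect the client behind every 5xx response in the window.
    error_clients = [client for client, status in requests if 500 <= status <= 599]
    if len(error_clients) / len(requests) < error_rate_threshold:
        return "ok"

    # If nearly all errors come from one client, treat it as a bug to fix
    # rather than a service-wide incident.
    _, top_count = Counter(error_clients).most_common(1)[0]
    if top_count / len(error_clients) >= single_client_share:
        return "file_bug"
    return "incident"
```

Even this toy version needs per-client attribution on every request, which a plain "5xx rate above X" alert does not.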


laeri 883 days ago

Query input validation is nearly a solved problem. If you don't validate, I would argue it is an incident when 500s are returned in this case.


jabradoodle 885 days ago

You need to validate your inputs and return 4xx.


darkwater 885 days ago

Yeah, and you also shall not write bugs in your code. The real world has bugs, even trivial ones.


jabradoodle 884 days ago

If your service is returning 5xx, that is the definition of a server error; of course that is degraded service. Instead we have pointless dashboards that are still green an hour after everything is broken.

Returning 4xx on a client error isn't hard and is usually handled largely by your framework of choice.

Your argument is a strawman.
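
As a rough illustration of returning 4xx on a client error, here is a framework-free sketch. The endpoint, the `quantity` field, and the response payloads are hypothetical; a real framework would typically do most of this through its own validation layer.

```python
import json


def handle_request(body: str):
    """Return (status_code, payload) for a hypothetical 'create order' endpoint."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        # Malformed input is the client's fault: answer 400, not 500.
        return 400, {"error": "body is not valid JSON"}

    if not isinstance(data, dict):
        return 400, {"error": "body must be a JSON object"}

    quantity = data.get("quantity")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(quantity, bool) or not isinstance(quantity, int) or quantity <= 0:
        return 400, {"error": "quantity must be a positive integer"}

    return 200, {"ordered": quantity}
```

With the checks up front, a burst of malformed requests shows up as 4xx and never touches the 5xx rate.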


jon_adler 885 days ago

True, however it also doesn’t impact other users and doesn’t justify reporting an incident on the status page.


tazjin 885 days ago

https://www.buildkitestatus.com/
