Hacker News
by kiririn 597 days ago
The more we abstract things, treat servers like cattle, and lose low-level knowledge, the more things like this will happen.

You shouldn’t have to try to reproduce this in a test environment; your infrastructure should allow profiling live in production for cases like this. And it should be solved with profiling, not guesswork and bisecting.

1 comment

Crazy talk! Next thing you are going to say is that realtime-focused infrastructure should have native counters to detect missed deadlines, so those pops could be detected preemptively via dashboards instead of via customer complaints?

That's not how you live in the AI age. Move fast and break things; YOLO; K8S all the things; etc.

(/s in case it's not clear)