HN Mirror


by _jal 1520 days ago

I don't see this. I have thousands of long-lived instances - full VMs, not containers, running on our hardware.

If they start "going bad", something is wrong. That's a signal I wouldn't want to ignore.

It has happened. Once, an HBA in a storage node was causing occasional corruption. Another time, due to a communication failure, people were building things with the wrong version of something that had a memory leak and would eventually summon the OOM killer. There have been other issues.

"Have you tried turning it off and back on again" is still a terrible system management strategy.


bognition 1519 days ago

Failure rates in AWS are probably higher than what you're seeing on your own hardware.


_jal 1519 days ago

Maybe. If you don't look, you don't know.

But given the number of people I've heard using "we're on AWS, out of my control" as an excuse, this appears to be an unofficial service they offer.
