by zyamada 2145 days ago
Possibly, but how can we be 100% sure without the ability to compare behavior? If I were following this line of research, I'd still want to know whether there's any difference in the nature of a failure when it comes from within the OS (possibly simulated by halt -f) versus the situation the parent pointed out, where the instance just goes poof without sending any kind of signal to the OS itself.

Though of course then the trouble is, if AWS is simulating specific behaviour, it won't be exactly the same as real problems when they occur. It is a bit better, but hard to say how much better.

I'd think the key to this is being able to simulate very specific partial-failure conditions, e.g. a set rate of packet loss, loss of connections to EBS, etc. Just turning machines off, I expect, wouldn't be that valuable.
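
To make "specific packet loss" concrete, here is a rough sketch of what injecting it at the application level could look like. This is not anything AWS provides; `LossyLink` and `send_with_retry` are hypothetical names made up for illustration, and the "link" is just an in-memory list rather than a real socket.

```python
import random


class LossyLink:
    """Hypothetical stand-in for a network link that drops a fraction of packets."""

    def __init__(self, loss_rate, seed=None):
        if not 0.0 <= loss_rate <= 1.0:
            raise ValueError("loss_rate must be between 0 and 1")
        self.loss_rate = loss_rate
        self.rng = random.Random(seed)  # seeded so a failing run can be replayed
        self.delivered = []

    def send(self, packet):
        # Drop the packet with probability loss_rate; otherwise deliver it.
        if self.rng.random() < self.loss_rate:
            return False
        self.delivered.append(packet)
        return True


def send_with_retry(link, packet, max_attempts=3):
    """Return the number of attempts used, or None if every attempt was dropped."""
    for attempt in range(1, max_attempts + 1):
        if link.send(packet):
            return attempt
    return None
```

With `loss_rate=0.0` every send goes through on the first attempt, and with `loss_rate=1.0` the link behaves like a fully disconnected one. The interesting cases are the values in between, which is the half-broken territory that simply powering a machine off never exercises.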

> Though of course then the trouble is, if AWS is simulating specific behaviour, it won't be exactly the same as real problems when they occur.

If Amazon simulated power failure, or network cable disconnect, or potentially even corrupt writes to disk, I personally feel it would be indistinguishable from the event really happening.

The problem is that they can't realistically simulate that for a single instance unless you've got the whole physical machine to yourself, and of course the worst issues tend to be when it's half-broken, not entirely off/disconnected.
Power loss or network disconnect could easily be simulated without needing the whole physical machine.