Hacker News
by maccard 1054 days ago
> No, it is a consequence of poor engineering on the part of the user of the service,

The entire service going down for 24 hours due to a reboot is not a consequence of poor engineering on the part of the user. A production service which people rely on for critical data failing on the _textbook_ example of running a live service is poor engineering on the service's part.

1 comments

I've seen entire datacenters and many services go offline due to 'minor mishaps', and those were run by the largest companies on the planet. If you don't account for failure of underlying infra + services, you are not doing it right.

Tarsnap makes very particular guarantees. If you look into them, you'll realize that for some applications it is very useful and for others it is not, or that you may have to use not one but multiple backup services to serve all your needs. This can be costly.