by samfisher83 3245 days ago
Do you think you will do a better job of backing up stuff compared to google, msft, etc.? They have dedicated engineers and spend lots of money on this stuff.

Think of it from a statistical perspective: what is the probability of you setting up this backup system correctly vs. them?

7 comments

Reasons why data can go missing:

- account compromised, wiped out

- operator error

- malicious employee

All of these have happened to companies that I have worked with. So no, I won't do a better job of backing stuff up compared to google, msft, etc., BUT I would rather have a get-out-of-jail-free card if any of the above should happen and suddenly there is nothing where there used to be data.

You should approach this from a cost-benefits perspective, not from a skills perspective.

They may have better engineering, but they also have extra risks. My home server will never ban me because it thinks I've violated its TOS, for example.
Nor will your own storage lock you out because you've annoyed a state actor, while a cloud provider will roll over.
I actually really doubt that Google, Amazon et al have proper backups of every client's storage - I've never come across details or even an idea of such a system. They just have enough redundancy and, more importantly, a "never-delete" architecture - data is merely tagged for deletion for a significant amount of time before it's ever deleted, and various systems check consistency on an ongoing basis.

Of course, even that doesn't prevent you from fucking up - your datastore will do exactly what you tell it to. Nobody can prevent you from doing the equivalent of rm -rf on your S3 store, or accidentally deleting the only copy of that movie your client's been working on for the last four years, and nothing can protect you from it except a decent backup.
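
To make the "tagged for deletion" idea above concrete, here is a minimal sketch of a store that only marks objects as deleted and purges them after a retention window. It is an illustration under stated assumptions, not a description of how Google or Amazon actually implement this; the class name `SoftDeleteStore`, its methods, and the retention value are all hypothetical.

```python
import time


class SoftDeleteStore:
    """Toy key-value store: deletes are tombstones until a retention window passes.

    Hypothetical illustration only; not any provider's real design.
    """

    def __init__(self, retention_seconds):
        self.retention = retention_seconds
        self._data = {}        # key -> value
        self._tombstones = {}  # key -> time the key was marked deleted

    def put(self, key, value):
        self._data[key] = value
        self._tombstones.pop(key, None)  # rewriting a key clears its tombstone

    def delete(self, key, now=None):
        # Only mark the key; the bytes stay until purge() runs.
        if key in self._data:
            self._tombstones[key] = time.time() if now is None else now

    def get(self, key):
        if key in self._tombstones:
            raise KeyError(key)  # looks deleted to the caller
        return self._data[key]

    def undelete(self, key):
        # Recovery is possible while the tombstone has not been purged.
        self._tombstones.pop(key, None)

    def purge(self, now=None):
        # Physically remove keys whose tombstone is older than the retention window.
        now = time.time() if now is None else now
        expired = [k for k, t in self._tombstones.items() if now - t >= self.retention]
        for k in expired:
            del self._data[k]
            del self._tombstones[k]
        return expired
```

Note that this only buys time: once the retention window has passed and `purge()` has run, the data is gone, which is why it does not replace a separate backup.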

Not sure about GCP, but Google certainly has backups for Gmail. I was affected by an outage where only a few accounts (maybe millions, but at least not a lot by Google standards) had emails deleted due to a software issue. They explained that recovery would take a few hours because data had to be restored from tape. At least that's the message they showed when I tried to log in. Note that this was the free Gmail product, no business support.
Even though there is some reward for expertise, backups are not difficult. What exactly does Big-4 bring to the backup table that none of us with Amanda, rsync, or BackupExec could do?

Cost of resources aside, a person could run hourly full backups all day every day and have just as good a backup regime as a billion-dollar company. Time-to-restore is something that the aforementioned expertise factors into, but a good backup is the linchpin, and can still be restored by whatever means.

Nobody said that you shouldn't have any data in the cloud. The argument is that you shouldn't have your data only in some cloud.

If you have your data in some cloud (either directly or as backup) as well as in your really crappy backup solution that has a 10% failure rate, you are still ten times less likely to lose your data than by just keeping your data in the cloud.
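
The arithmetic behind the "ten times" figure above can be checked with a short sketch. It assumes the two copies fail independently of each other, which the comment implies but does not state; the cloud loss probability used here is a made-up number for illustration, and the function name is hypothetical.

```python
def joint_loss_probability(*failure_probs):
    """Probability that every copy is lost, assuming independent failures."""
    result = 1.0
    for p in failure_probs:
        result *= p
    return result


# Hypothetical numbers for illustration only.
cloud_loss = 0.01  # assumed chance of losing the cloud copy
local_loss = 0.10  # the "10% failure rate" backup from the comment

both = joint_loss_probability(cloud_loss, local_loss)
improvement = cloud_loss / both  # how many times less likely total loss becomes

print(f"cloud only: {cloud_loss}, cloud + local: {both}, improvement: {improvement}x")
```

Whatever value is chosen for `cloud_loss`, the improvement factor is 1 / `local_loss`, so a backup that fails 10% of the time still cuts the risk tenfold. If the failures are correlated (same credentials, same operator mistake), the benefit is smaller.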

It seems like storing your backups using two cloud providers is much more reliable than using just one cloud provider + local NAS.
As long as - if this is a company - different people have access to those accounts, yes.
It's a good point, but I can't help but think of all the mistakes, disclosures, privacy violations, poor design, gratuitous change, etc. that have happened at the hands of "dedicated engineers." In this case specifically, are the engineers who caused this incident not dedicated and well paid?
Additional backups only add to reliability; they don't subtract from it.