Hacker News
by dangrossman 3544 days ago
Write a little program in your favorite shell or scripting language that

* rsyncs the directories containing the files you want to back up

* mysqldumps/pg_dumps your databases

* zips/gzips everything up into a dated archive file

* deletes the oldest backup (the one dated X days ago)

Put this program on a VPS at a different provider, on a spare computer in your house, or both. Create a cron job that runs it every night. Run it manually once or twice, then actually restore your backups somewhere to ensure you've made them correctly.
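The four steps above could be sketched roughly like this. It assumes the script runs on the backup machine and pulls over SSH, that MySQL credentials live in `~/.my.cnf` on the server, and that `find` supports `-maxdepth` (GNU and BSD do). `SRC_HOST`, `SRC_DIR`, `DB_NAME`, `BACKUP_ROOT` and `KEEP_DAYS` are placeholder names, not anything from the post.

```shell
#!/bin/sh
# Nightly backup sketch. Every host, path and name below is a placeholder.
SRC_HOST="${SRC_HOST:-user@example.com}"    # hypothetical server to back up
SRC_DIR="${SRC_DIR:-/var/www/}"             # hypothetical directory on that server
DB_NAME="${DB_NAME:-bar}"                   # hypothetical database name
BACKUP_ROOT="${BACKUP_ROOT:-$HOME/backups}" # where backups live on this machine
KEEP_DAYS="${KEEP_DAYS:-14}"                # the "X days" from the list above

# Build a sortable, dated archive name such as backup-2016-03-01.tar.gz.
archive_name() {
    printf 'backup-%s.tar.gz\n' "$(date +%Y-%m-%d)"
}

# Pack staging directory $1 into a dated archive under $2; print its path.
make_archive() {
    staging=$1
    dest=$2
    out="$dest/$(archive_name)"
    tar -czf "$out" -C "$staging" . || return 1
    printf '%s\n' "$out"
}

# Delete archives in $1 that were last modified more than $2 days ago.
prune_old() {
    find "$1" -maxdepth 1 -name 'backup-*.tar.gz' -mtime +"$2" -exec rm -f {} +
}

# The whole job; the ( ) body runs in a subshell so strict mode stays local.
run_backup() (
    set -eu
    staging="$BACKUP_ROOT/staging"
    mkdir -p "$staging/files" "$BACKUP_ROOT/archives"
    # 1. Copy the files.
    rsync -az --delete "$SRC_HOST:$SRC_DIR" "$staging/files/"
    # 2. Dump the database on the server, store the dump here.
    ssh "$SRC_HOST" "mysqldump --single-transaction $DB_NAME" > "$staging/db.sql"
    # 3. Pack everything into a dated archive.
    make_archive "$staging" "$BACKUP_ROOT/archives"
    # 4. Expire old archives.
    prune_old "$BACKUP_ROOT/archives" "$KEEP_DAYS"
)

# Do nothing unless called as "backup.sh run", so the functions can be sourced.
if [ "${1:-}" = "run" ]; then
    run_backup
fi
```

A crontab entry along the lines of `0 3 * * * /path/to/backup.sh run` would run it every night at 03:00. Restoring is `tar -xzf` on an archive plus loading `db.sql` into a scratch database, which is worth doing once before trusting it.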

5 comments

Yep. Here is an example that I use to upload the dump to Dropbox.

I don't gzip the dumps or delete my oldest uploads, though.

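    #!/bin/sh
    # Dump one MySQL database and upload the dump to Dropbox.

    # Timestamp for the file name; %3N (milliseconds) requires GNU date.
    DATE=$(date +%d-%m-%Y@%H:%M:%S.%3N)
    DB_USER="qux"
    DB_PASS="foo"
    DB_NAME="bar"
    DROPBOX_TOKEN="baz"

    # Quote every expansion so unusual characters in a value cannot split the command.
    /usr/bin/mysqldump -u"${DB_USER}" -p"${DB_PASS}" "${DB_NAME}" > "/tmp/${DATE}.sql"
    # Note: Dropbox has since retired the v1 files_put endpoint used here.
    /usr/bin/curl -H "Authorization: Bearer ${DROPBOX_TOKEN}" https://api-content.dropbox.com/1/files_put/backup/ -T "/tmp/${DATE}.sql"

Careful! If someone hacks your server, they now get your Dropbox account.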

One alternative is to put these backups into S3 using pre-signed requests rather than Dropbox. An S3 pre-signed request gives permission only to upload files, perhaps only to a certain location in a certain bucket.

It's a bit harder to set up, but the shell script will look almost the same.
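As a rough sketch of the "almost the same" script: with a pre-signed PUT URL, the upload is still a single `curl -T`. This assumes the URL is generated elsewhere, by a machine that holds the real AWS credentials, and handed to the server; pre-signed URLs expire, so it would need refreshing. `upload_presigned` and `PRESIGNED_URL` are made-up names for illustration.

```shell
#!/bin/sh
# Upload one dump to S3 through a pre-signed PUT URL (sketch).
# The server only ever sees the URL, never the AWS credentials.
CURL="${CURL:-curl}"   # path to curl; overridable

upload_presigned() {
    file=$1
    url=$2
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
    # -T sends the file with HTTP PUT; --fail turns an HTTP error into a
    # non-zero exit status so cron can report it.
    "$CURL" --fail --silent --show-error -T "$file" "$url"
}

# Example (not executed here; PRESIGNED_URL is a hypothetical variable):
#   upload_presigned "/tmp/backup.sql" "$PRESIGNED_URL"
```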

You can actually set up app folders in Dropbox so that a particular API key effectively chroots you to that folder. The attacker would only get the backups.

Which is literally the worst scenario. An attacker owns your box and now your backups.

Well, maybe second-worst. The worst would be them getting all your Dropbox files.

No, the worst case is actually losing your data ...

But if he already owns your box, what prevents him from accessing your data anyway?

Usually you don't want to give any attackers the ability to destroy all your backups of the server they hacked.

S3's "upload-only" API keys are a solution here: you send the backups into a black hole but the attacker can't delete them.

Looks like Dropbox might not have something like that, giving the attacker read-write access to backups if they can get that API key.

How do you avoid that with any backup service?

Push to an S3 bucket with upload-only credentials and versioning turned on.

Your master account (or superuser IAM account if you're paranoid) gives you read/write after 2FA login, but you could share your backup creds with the world and never have your backups pulled out or overwritten.

Use S3 lifecycle rules to expire backup objects after x days. Data transfer in is free and requests cost pennies per thousand; only the bandwidth to retrieve the backups for a restore is expensive (10 cents/GB), and even then it's still very cheap.

Also, by storing in S3, you can back up and restore from anywhere.
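For what it's worth, that setup might look something like the sketch below, using the AWS CLI. The bucket name, user name, policy name and the 30-day expiry are all placeholders, and the policy is a minimal guess at "upload only" (just `s3:PutObject`), not a vetted production policy.

```shell
#!/bin/sh
# S3 sketch: upload-only credentials, versioning, and lifecycle expiry.
AWS="${AWS:-aws}"   # path to the AWS CLI; overridable

# Write an IAM policy to file $2 that allows uploads to bucket $1 only:
# no read, no list, no delete.
write_upload_only_policy() {
    bucket=$1
    out=$2
    cat > "$out" <<EOF
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": "s3:PutObject",
    "Resource": "arn:aws:s3:::$bucket/*"
  }]
}
EOF
}

# Write a lifecycle configuration to file $2 that expires objects, and
# versions that have been overwritten, after $1 days.
write_lifecycle_rules() {
    days=$1
    out=$2
    cat > "$out" <<EOF
{
  "Rules": [{
    "ID": "expire-backups",
    "Status": "Enabled",
    "Filter": {"Prefix": ""},
    "Expiration": {"Days": $days},
    "NoncurrentVersionExpiration": {"NoncurrentDays": $days}
  }]
}
EOF
}

# Run once with admin credentials, never from the server being backed up.
# $1 = bucket name, $2 = IAM user whose keys go on the server.
setup_bucket() {
    bucket=$1
    user=$2
    write_upload_only_policy "$bucket" policy.json
    write_lifecycle_rules 30 lifecycle.json
    "$AWS" s3api put-bucket-versioning --bucket "$bucket" \
        --versioning-configuration Status=Enabled
    "$AWS" s3api put-bucket-lifecycle-configuration --bucket "$bucket" \
        --lifecycle-configuration file://lifecycle.json
    "$AWS" iam put-user-policy --user-name "$user" \
        --policy-name upload-only --policy-document file://policy.json
}
```

The server then uploads with that user's keys, e.g. `aws s3 cp backup.tar.gz s3://my-backups/`. With versioning on, an attacker who re-uploads over an existing key only adds a new version; the old one stays until the lifecycle rule expires it.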

Use a pull-based one instead of a push-based one.

My backup system involves my data storage system reaching out to each machine I want to back up and fetching the data locally.

All I need for that is to put my storage system's public key on the machines, and I'm fine with an attacker getting that.
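A pull setup along those lines could be sketched as follows, run from the storage machine. The `backup` login, host names and paths are placeholders. The `authorized_keys` restriction is an optional extra that the comment above doesn't mention: it uses `rrsync`, a helper script shipped with rsync, to pin the key to read-only access under one directory (rrsync then resolves requested paths inside that directory).

```shell
#!/bin/sh
# Pull-based backup sketch, run on the storage machine.

# Print an authorized_keys line for a machine being backed up.
# $1 = the storage machine's public key, $2 = the directory to expose.
# "restrict" disables forwarding and pty allocation; command= forces every
# login with this key through rrsync in read-only mode.
authorized_keys_line() {
    pubkey=$1
    dir=$2
    printf 'command="rrsync -ro %s",restrict %s\n' "$dir" "$pubkey"
}

# Fetch remote path $2 from host $1 into a per-host directory under $3.
pull_host() {
    host=$1
    remote_path=$2
    dest_root=$3
    mkdir -p "$dest_root/$host" || return 1
    rsync -az --delete -e ssh "backup@$host:$remote_path" "$dest_root/$host/"
}

# Example (not executed here; host names are hypothetical):
#   for h in web1.example.com db1.example.com; do
#       pull_host "$h" /var/www/ /srv/backups
#   done
```

The backed-up machines hold no credentials for the storage box at all, so compromising one of them gives an attacker nothing to delete backups with.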

Or put this program on the same VPS, but instead of doing the gzipping and versioning yourself, incorporate Tarsnap[1] into the script. With Tarsnap you can also create a key that can read and write archives but not delete them, so that if someone hacks your server (a real threat mentioned downthread), they won't be able to delete your backups.

And whatever you do, check that you can actually recover from these backups every once in a while.

[1] https://www.tarsnap.com/
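Roughly, the limited key is made with `tarsnap-keymgmt`, whose `-r`, `-w` and `-d` flags choose which of the read, write and delete permissions the new key file keeps. The sketch below assumes the master key is kept off the server; the key paths and the archive naming are placeholders.

```shell
#!/bin/sh
# Tarsnap sketch: back up with a key that cannot delete archives.
TARSNAP="${TARSNAP:-tarsnap}"                          # overridable paths
TARSNAP_KEYMGMT="${TARSNAP_KEYMGMT:-tarsnap-keymgmt}"

# One-time, on a trusted machine: derive a read/write key ($2) from the
# master key ($1). Dropping -r as well would make it write-only.
make_limited_key() {
    master_key=$1
    out_key=$2
    "$TARSNAP_KEYMGMT" --outkeyfile "$out_key" -r -w "$master_key"
}

# Nightly, on the server: create a dated archive of the given paths using
# the limited key ($1). Remaining arguments are the paths to back up.
tarsnap_backup() {
    key=$1
    shift
    "$TARSNAP" --keyfile "$key" -c -f "backup-$(date +%Y-%m-%d)" "$@"
}

# Example (not executed here; paths are hypothetical):
#   make_limited_key /secure/tarsnap.key /secure/tarsnap-rw.key
#   tarsnap_backup /root/tarsnap-rw.key /var/www /var/backups/db.sql
```

Only the limited key is copied to the server. Expiring old archives then has to happen from wherever the master key lives.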

My only con against tarsnap is that it can take a long time to do a restore, even for a smallish (30G) backup. Last time I tested at least, I was looking at over three hours. The dev is aware of the issue and may have improved upon it in the meantime.

That is the _only_ reason I have for looking at something else.

Recommended change for additional peace of mind: The backup script does not have deletion privileges. A separate process expires old backups.

Mine used to rsync to my home Linux box, copying over all files and database dumps this way.

Then on the local Linux box I had a separate script that would take snapshots of that directory to a completely different place on the filesystem.
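One way such a snapshot script could work, as a sketch only: hard-link the mirror into a dated directory with `cp -al`. This assumes GNU `cp` and that the snapshot directory sits on the same filesystem as the mirror (hard links can't cross filesystems); the function name and paths are made up.

```shell
#!/bin/sh
# Snapshot sketch for the local box: keep dated, hard-linked copies of the
# directory that rsync mirrors into.

# $1 = mirror directory, $2 = snapshot root, $3 = optional label
# (defaults to today's date).
snapshot() {
    mirror=$1
    snap_root=$2
    stamp=${3:-$(date +%Y-%m-%d)}
    mkdir -p "$snap_root" || return 1
    if [ -e "$snap_root/$stamp" ]; then
        echo "snapshot $stamp already exists" >&2
        return 1
    fi
    # -a preserves attributes; -l hard-links files instead of copying data,
    # so an unchanged file costs no extra disk space.
    cp -al "$mirror" "$snap_root/$stamp"
}

# Example (not executed here; paths are hypothetical):
#   snapshot /srv/mirror /srv/snapshots
```

This relies on rsync's default behaviour of writing a changed file to a temporary name and renaming it into place, which leaves the old content intact in earlier snapshots. With `--inplace`, rsync would modify the shared file and change the snapshots too.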

A slightly nicer solution that still has the same spirit as your approach is to use BackupPC: http://backuppc.sourceforge.net/info.html

I've used it for many, many years. Setup is a bit of a pain, especially if it's your first time, but it's a totally reliable backup system and gives you something much better than just a pile of zip archives.

All of our servers get BackupPC'd (rsync-over-ssh, pulled) twice a day to an in-house server that's totally unreachable from the internet. I get emails from BackupPC when something goes wrong, which is pretty much never. Backups aren't a thing I have to worry about much anymore.

This. It's like you've read my code. Also worth considering encrypting backups so your users don't get screwed when your secondary VPS gets hacked.
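A minimal sketch of the encryption step, assuming OpenSSL 1.1.1 or later (for `-pbkdf2`) and a passphrase file that is also stored somewhere other than the two servers. The function names are made up; `gpg` with a public key would work as well and keeps the decryption key off the server entirely.

```shell
#!/bin/sh
# Encrypt an archive before it leaves the server (sketch).

# $1 = input file, $2 = output file, $3 = file whose first line is the
# passphrase.
encrypt_backup() {
    openssl enc -aes-256-cbc -pbkdf2 -salt -in "$1" -out "$2" -pass "file:$3"
}

# Reverse of encrypt_backup, for restore tests.
decrypt_backup() {
    openssl enc -d -aes-256-cbc -pbkdf2 -in "$1" -out "$2" -pass "file:$3"
}

# Example (not executed here; paths are hypothetical):
#   encrypt_backup backup.tar.gz backup.tar.gz.enc /root/backup.pass
```

Only the `.enc` file gets shipped to the secondary VPS, so someone who breaks into that box gets ciphertext.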