by falcolas 3495 days ago
Copying data is always going to be expensive, but it can't be avoided.

The lightest weight solution I've seen is restoring from a daily backup in something like S3, then setting up as a slave from a live master to catch up on the day's binlogs. Still a lot of data to move and load, but at least it's not the entire contents of the DB.
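A rough sketch of the "catch up from the master" step, assuming the daily backup recorded the master's binlog file and position when it was taken (for example via `mysqldump --master-data`). The helper names, host, user, and coordinates are all hypothetical; the code only builds the statements you would run on the freshly restored replica.

```python
def sql_quote(value):
    """Quote a string as a MySQL string literal.

    Escapes backslashes and single quotes; assumes the default sql_mode
    (NO_BACKSLASH_ESCAPES not enabled).
    """
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def replication_statements(master_host, repl_user, repl_password,
                           binlog_file, binlog_pos):
    """Build the statements that point a restored replica at a live master.

    binlog_file / binlog_pos are the coordinates saved with the backup
    (an assumption here), so the replica only replays changes made since then.
    """
    change_master = (
        "CHANGE MASTER TO "
        "MASTER_HOST={host}, MASTER_USER={user}, MASTER_PASSWORD={password}, "
        "MASTER_LOG_FILE={log_file}, MASTER_LOG_POS={log_pos};"
    ).format(
        host=sql_quote(master_host),
        user=sql_quote(repl_user),
        password=sql_quote(repl_password),
        log_file=sql_quote(binlog_file),
        log_pos=int(binlog_pos),  # int() rejects anything that is not a number
    )
    return [change_master, "START SLAVE;"]
```

Running the two returned statements after the restore finishes starts replication from the backup's coordinates, so only the day's binlogs cross the wire.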

The best you can do is control when data transfers happen, so they run when it makes sense and not in the middle of your highest-traffic period (which is what frequently happens when you automatically scale DBs in response to load).

2 comments

That sounds a bit like what Joyent is doing in their Autopilot Pattern implementation for MySQL: https://www.joyent.com/blog/dbaas-simplicity-no-lock-in
I'd add that one can use throttling to bring new databases online while still sustaining peak traffic.

Some databases have a configurable limit in MB/s for replication, or it's possible to assign disk/CPU quota on the slave to slow it down.
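To make the MB/s idea concrete, here is a minimal sketch of a rate-limited copy. It is not how any particular database implements its replication limit; `throttled_copy` and its parameters are made up for illustration. It sleeps after each chunk so the average rate stays at or below the cap.

```python
import time


def throttled_copy(src, dst, max_bytes_per_sec, chunk_size=64 * 1024,
                   clock=time.monotonic, sleep=time.sleep):
    """Copy src to dst without exceeding max_bytes_per_sec on average.

    src and dst are binary file-like objects. clock and sleep are
    injectable so the pacing can be checked without real waiting.
    """
    start = clock()
    copied = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        copied += len(chunk)
        # Earliest moment at which `copied` bytes fit under the cap.
        earliest = start + copied / max_bytes_per_sec
        delay = earliest - clock()
        if delay > 0:
            sleep(delay)
    return copied
```

For example, `throttled_copy(backup_file, replica_socket_file, 20 * 1024 * 1024)` would cap a transfer at roughly 20 MB/s, leaving the rest of the disk and network bandwidth for live traffic.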

Combine that with good planning and monitoring, and you'll be fine =)