Hacker News
by ssl-3 162 days ago
The first time I got paid to use rsync was nearly 25 years ago. It provided for reasonably space-efficient, remote, versioned backups of a mail server, using hard links.

That mail server used maildir. For those who are not familiar: with maildir, each email message is a separate file on the disk. Thus, there were a lot of folders that had many thousands of files in them, plus hard links for the daily/weekly/whatever versions of each of those files.

At the time there were those who were very vocal about their opinion of using maildir in this kind of capacity, likening it to abuse of the filesystem. And if that was stupid, then my use of hard links certainly multiplied that stupidity.

Perhaps I was simply not very smart at that time.

But it was actually fun to fit that together, and it was kind of amazing to watch rsync perform this job both automatically and without complaint between a pair of particularly not-fast (256kbps?) DOCSIS connections from Roadrunner.

It worked fine. Whenever I needed to go back in time for some reason, the information was reliably present at the other end with adequate granularity -- with just a couple of cron jobs, rsync, and maybe a little bit of bash script to automate it all.
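
For anyone who hasn't seen the hard-link trick, here is a minimal Python sketch of how a nightly job like that might assemble its rsync call. It assumes rsync's `--link-dest` option, which the post doesn't confirm was used (the same effect was often had with `cp -al` followed by a plain rsync). The host name, the paths and the `build_snapshot_cmd` helper are all made up for illustration.

```python
import datetime
import shlex


def build_snapshot_cmd(source, backup_root, today, previous=None):
    """Return an rsync argv for one dated, hard-linked snapshot.

    Hypothetical helper: the names and directory layout are illustrative only.
    """
    target = f"{backup_root}/{today.isoformat()}"
    cmd = ["rsync", "-a", "--delete"]
    if previous is not None:
        # Files unchanged since `previous` are hard-linked into the new
        # snapshot instead of being transferred and stored a second time.
        cmd.append(f"--link-dest={backup_root}/{previous.isoformat()}")
    cmd += [source, target]
    return cmd


if __name__ == "__main__":
    today = datetime.date(2024, 1, 2)
    yesterday = today - datetime.timedelta(days=1)
    cmd = build_snapshot_cmd(
        "mail.example.com:/var/mail/",  # assumed remote source
        "/backups/mail",                # assumed local snapshot root
        today,
        yesterday,
    )
    # Print the command instead of running it, so it can go into cron by hand.
    print(shlex.join(cmd))
```

Each dated directory then looks like a full copy, but a message that never changed occupies disk space only once.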

1 comments

> there were a lot of folders that had many thousands of files in them

If you ever need to do something like this again, it's often faster to parallelize rsync. One tool that provides this is fpsync:

https://www.fpart.org/fpsync/
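
The general idea, roughly, is to split the file list into chunks and run one rsync per chunk. A rough Python sketch of that idea is below. It is not how fpsync itself is implemented, and `partition`, `sync_chunk` and `sync_in_parallel` are hypothetical names. The only rsync feature it relies on is `--files-from`.

```python
import subprocess
import tempfile
from concurrent.futures import ThreadPoolExecutor


def partition(paths, jobs):
    """Split paths into at most `jobs` round-robin chunks, dropping empty ones."""
    chunks = [paths[i::jobs] for i in range(jobs)]
    return [chunk for chunk in chunks if chunk]


def sync_chunk(chunk, source, dest):
    """Run one rsync restricted to the relative paths listed in `chunk`."""
    with tempfile.NamedTemporaryFile("w", suffix=".list") as listing:
        listing.write("\n".join(chunk) + "\n")
        listing.flush()
        cmd = ["rsync", "-a", f"--files-from={listing.name}", source, dest]
        return subprocess.run(cmd, check=False).returncode


def sync_in_parallel(paths, source, dest, jobs=4):
    """Run up to `jobs` rsync processes at once and return their exit codes."""
    chunks = partition(paths, jobs)
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(lambda chunk: sync_chunk(chunk, source, dest), chunks))
```

Round-robin chunking is the simplest choice here. A real tool would more likely balance chunks by file count and total size.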

And you'd probably use the snapshot feature of a filesystem like btrfs or zfs instead of hardlinks for deduplication :-)
Yes, and something like btrfs send or zfs send is probably faster than fpsync.
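
A minimal sketch of what that could look like with ZFS, written in Python and assuming an incremental `zfs send -i` piped over ssh into `zfs receive`. The dataset, snapshot and host names are placeholders, and `build_incremental_send` and `replicate` are hypothetical helpers. On btrfs the rough equivalent would be `btrfs send -p` piped into `btrfs receive`.

```python
import subprocess


def build_incremental_send(dataset, old_snap, new_snap, remote, target):
    """Return (send_argv, receive_argv) for one incremental ZFS replication."""
    send = ["zfs", "send", "-i", f"{dataset}@{old_snap}", f"{dataset}@{new_snap}"]
    receive = ["ssh", remote, "zfs", "receive", target]
    return send, receive


def replicate(dataset, old_snap, new_snap, remote, target):
    """Pipe `zfs send` into `zfs receive` on the remote host."""
    send, receive = build_incremental_send(dataset, old_snap, new_snap, remote, target)
    sender = subprocess.Popen(send, stdout=subprocess.PIPE)
    receiver = subprocess.Popen(receive, stdin=sender.stdout)
    # Close our copy of the pipe so the sender gets SIGPIPE if the receiver exits.
    sender.stdout.close()
    receiver_rc = receiver.wait()
    sender_rc = sender.wait()
    return sender_rc == 0 and receiver_rc == 0
```

Only the blocks that changed between the two snapshots cross the wire, and nothing has to walk thousands of maildir files to find out what changed.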