by cyphar 3195 days ago
If your backup tool uses rsync under the hood, which is just a normal user-space process that uses the standard filesystem APIs, why does it matter what the underlying storage is? Obviously filesystem bugs can cause issues with this assumption, but filesystem bugs can break a whole host of other things in your backup tool.
3 comments

If I understood the blog post correctly, the backup software turned out to basically work just fine on APFS. It’s the data recovery software that will take some time to port over.
This was what I was left wondering as well. rsync for sure does not have any FS-specific code that I can find. If it uses the FS APIs provided by the OS, just like every other program that touches files, why would we expect anything other than boring old rsync doing its thing?
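To make the "standard filesystem APIs" point concrete, here is a minimal sketch in Python of a copy that uses only generic OS calls. This is not rsync's code; the name `naive_copy` is hypothetical and exists only for illustration. Nothing in it knows which filesystem sits underneath, and it also carries over nothing beyond file contents, permission bits, and timestamps.

```python
import os
import stat
import tempfile


def naive_copy(src, dst, chunk_size=64 * 1024):
    """Copy data, permission bits, and timestamps using generic OS calls only.

    Hypothetical helper for illustration. Extended attributes, resource
    forks, ACLs, and ownership are NOT copied by this sketch.
    """
    st = os.stat(src)  # capture metadata before reading, so atime is the original
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
    os.chmod(dst, stat.S_IMODE(st.st_mode))
    os.utime(dst, ns=(st.st_atime_ns, st.st_mtime_ns))
    return os.stat(dst)
```

Whatever this sketch leaves behind (extended attributes, resource forks and the like) is the metadata the rest of the thread is arguing about.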
rsync has a ton of HFS+ specific code. Making a faithful duplicate of a file (specifically its metadata) is a feat on its own: http://blog.plasticsfuture.org/2006/03/05/the-state-of-backu...
Upstream rsync does not seem to have any special-casing for mac/hfs. I'd wager Apple's extended-attribute handling is added in this patch[0] for the version distributed with macOS.

Even here, there isn't special casing for HFS. Instead, a special library function, copyfile()[1], is used to handle copying files and their associated metadata.

It seems this function was introduced in Mac OS X 10.5, which was after the article you linked. I'd wager copyfile() was introduced in response to the unwieldy file copy mechanics.

After discovering how copyfile() is used in rsync, I am fairly confident that rsync works so well on the new FS as a result of Apple implementing a fairly solid copyfile() for APFS.

[0] https://opensource.apple.com/source/rsync/rsync-20/patches/E...

[1] http://www.manpagez.com/man/3/copyfile/

You're looking in the wrong place. Install the homebrew version, it has 3 patches for macOS that aren't in the upstream.

https://github.com/Homebrew/homebrew-core/blob/master/Formul...

The Mac version has some file system specific code https://developer.apple.com/legacy/library/documentation/Dar...:

"-E, --extended-attributes copy extended attributes, resource forks"

Filenames are treated a bit differently on APFS. Not sure what else, but it seems there's a lot of other bizarre file metadata on OS X that you might want to sync.

edit: not really sure though, because it doesn't seem like this program cares about any of those metadata.

It cares about those metadata a lot. Backups won't be bootable without it. And there is also disastrous data loss without it, because a lot of files have their actual file data embedded in the metadata (resource forks et al). It's nuts because it's based on a 30 year old filesystem. That's one of the reasons why the smooth transition is impressive.
And if you're just using rsync under the hood, you're making money off a fancy GUI with pretty minimal actual effort as well.
That's the way it looks until you actually try to make a commercial piece of software by doing that.

As the blog post mentions, I come from a background of low-level system programming (data recovery, even some kernel extensions, etc.). Rolling my own file copy tool would not have added significant time. Regardless, it took about 1000 hours of work to make that app, as it says in the blog post. Things are nowhere near as easy as they look. Doing the actual file copy is the easy part.

Let's suppose you had a fancy GUI wrapper around rsync which covered all the command line options, and a tiny output parser to display progress information. What more effort is needed to get a commercial, market-ready piece of software?
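For a sense of scale, the "tiny output parser" part might look something like the sketch below. It assumes progress lines in the shape printed by recent rsync releases with `--progress` (byte count, percentage, rate, time); the sample line in the usage note is illustrative, older builds format these lines slightly differently, and `parse_progress` is a hypothetical name.

```python
import re

# Assumed line shape (rsync --progress):
#   "      1,238,099 100%  146.38kB/s    0:00:08 (xfr#1, to-chk=169/396)"
PROGRESS_RE = re.compile(
    r"^\s*(?P<bytes>[\d,]+)\s+"
    r"(?P<percent>\d{1,3})%\s+"
    r"(?P<rate>\S+)\s+"
    r"(?P<time>\d+:\d{2}:\d{2})"
)


def parse_progress(line):
    """Return a dict for a progress line, or None for any other output line."""
    m = PROGRESS_RE.match(line)
    if m is None:
        return None
    return {
        "bytes": int(m.group("bytes").replace(",", "")),
        "percent": int(m.group("percent")),
        "rate": m.group("rate"),
        "time": m.group("time"),
    }
```

Called on the sample line above, this returns the byte count as an integer along with the percentage, rate, and time fields, and it returns `None` for filename lines. A real wrapper would also have to split rsync's output on carriage returns, since progress lines are rewritten in place.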
I just answered this in another response (and in addition to that answer, you could answer a lot of the questions yourself by actually trying the app. It has animations built into it, for example; it's clear at a glance that it's far beyond what you described):

It's also not just a GUI. There is loads of code in there that does stuff. For example:

• Scheduling for scheduled backups

• A background daemon that looks for events to launch the UI for scheduled backups

• Manipulation of disks ownership stuff in order to allow backups with permissions to happen correctly

• Volume blessing, etc, in order to make bootable volumes

• Trash handling & maintenance for deleted files

• Deleting files (for removing old backups)

And that was just off the top of my head. The app is many thousands of lines of code and it performs literally hundreds of functions. They're nearly all transparent to the user, and that is the idea.

I used to think like that, but it turns out things get out of hand quickly once the software has to be used by non-technical users. I also built a backup app, but in my case for virtual machines [0].

As feelix says, it is easy to burn 1000 hours on it. In addition to what feelix says, you need to write help, build a site, do a lot of testing, write an installer, etc. It does not compare at all to writing a script for technical users.

[0] http://www.vimalin.com

I highly recommend the essay "The Programming Systems Product" from the collection "The Mythical Man-Month" which provides an answer to this exact question.
Come on, be fair: half the web properties are a GUI to something you could have for free on the CLI. Making a usable GUI is not a trivial task.
It's also not just a GUI. There is loads of code in there that does stuff. For example:

• Scheduling for scheduled backups

• A background daemon that looks for events to launch the UI for scheduled backups

• Manipulation of disks ownership stuff in order to allow backups with permissions to happen correctly

• Volume blessing, etc, in order to make bootable volumes

• Trash handling & maintenance for deleted files

• Deleting files (for removing old backups)

And that was just off the top of my head. The app is many thousands of lines of code and it performs literally hundreds of functions. They're nearly all transparent to the user, and that is the idea.
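Even the simplest-sounding item in that list, deleting files to remove old backups, needs a policy and some care. As a rough sketch only (not how the app in question does it), assume each backup is a directory under one root, named with a sortable date such as `2017-09-01`; `prune_old_backups` and that layout are both hypothetical.

```python
import shutil
from pathlib import Path


def prune_old_backups(backup_root, keep=3):
    """Delete all but the `keep` newest snapshot directories.

    Assumes (hypothetically) that snapshot directory names sort
    chronologically, e.g. "2017-09-01". Returns the removed names.
    """
    snapshots = sorted(
        (p for p in Path(backup_root).iterdir() if p.is_dir()),
        key=lambda p: p.name,
        reverse=True,  # newest first
    )
    removed = []
    for old in snapshots[keep:]:
        shutil.rmtree(old)
        removed.append(old.name)
    return removed
```

A shipping product would additionally need to handle permissions, partially written snapshots, and user confirmation, which is where the hours go.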

Presumably his customers feel it is worth the money; they wouldn't buy it otherwise.