Hacker News
by sepbot 4997 days ago
People often misunderstand what Dropbox's LAN Sync does. The file needs to already be in the cloud before it can be synced over the LAN. This means that when you add a new file, it first gets uploaded to Dropbox's servers and is only then transferred to other computers over the LAN. Considering that most home connections are asynchronous (slow up, fast down), the entire process gets bottlenecked by that slow upload rate.
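To put rough numbers on that bottleneck, here is a back-of-the-envelope sketch. The file size and link speeds are made-up illustrative figures, not measurements of Dropbox or of any particular ISP.

```python
def transfer_seconds(size_megabytes: float, link_megabits_per_s: float) -> float:
    """Time to move a file over a link, ignoring protocol overhead and latency."""
    return size_megabytes * 8 / link_megabits_per_s

# Hypothetical numbers: a 100 MB file, 1 Mbit/s upload, 100 Mbit/s LAN.
FILE_MB = 100
upload_time = transfer_seconds(FILE_MB, 1)    # upload to the cloud first
lan_time = transfer_seconds(FILE_MB, 100)     # then peer-to-peer over the LAN

print(f"upload: {upload_time:.0f} s, LAN copy: {lan_time:.0f} s")
```

With these assumed speeds, the upload step takes 100 times longer than the LAN copy, so the LAN leg barely matters for a brand-new file.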
4 comments

This is also how SpiderOak's LAN sync works. Both Dropbox and SpiderOak guarantee that the single point of truth is the cloud, for good reason.

Making exceptions to that approach (like allowing individual machines to sync with each other directly before transferring to the cloud) creates opportunities for conflicts and inconsistencies that are very difficult to resolve, because not all of the information may be available.

Your second paragraph reads like a defense of CVS. git-annex handles this fine for me on my network at home with multiple machines and multiple copies of the repository/archive/drop-box-folder on each machine.
With any VCS, commits are explicit. Dropbox and the like need implicit sync.
Are you familiar with git-annex and its helpful assistant?
While it looks cool, it seems to me that it isn't entirely automatic. Sadly the vast majority of people will not go for anything less automatic than Dropbox.
"any changes you make to its folder will automatically be committed to git, and synced to repositories on other computers." [1]

I guess it's not automatic in the sense that it does not install itself, but then again neither does Dropbox...

[1] http://git-annex.branchable.com/assistant/

Nitpicky, but you meant the typical home internet service is asymmetrical, not asynchronous.
You've given an excellent answer to the question "Why is Dropbox's LAN Sync so slow?" but the more interesting question is "Why don't the Dropbox engineers change how LAN Sync works so it won't be so slow anymore?"
+1

Also, uploading the data to Dropbox's servers and then downloading it again wastes valuable quota for those under a data cap.

No, it doesn't download it from the master server. It transfers directly between the computers on the LAN. The only thing that has to go through the Dropbox servers is the master file list.
I can understand why, particularly given Dropbox's aim of being as simple as it can possibly be. At the moment the rule is simple: the cloud can be relied on as the authoritative copy which everyone syncs with. If I have 3 or more local clients, all talking and sharing over the LAN, many issues get very complicated.
It seems like it could be mitigated by having the client hash the file and notify the cloud of the hash of the file it intends to upload first. But I think your explanation for why they don't currently do LAN-only sharing is correct.
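A minimal sketch of that "announce the hash first" idea, for anyone trying to picture it. `MetadataIndex`, `announce` and `complete_upload` are hypothetical names for an in-memory stand-in, and SHA-256 is an assumed hash choice. None of this is Dropbox's actual API or protocol.

```python
import hashlib


class MetadataIndex:
    """Hypothetical in-memory stand-in for the cloud's authoritative file list."""

    def __init__(self):
        self.entries = {}    # path -> content hash the client announced
        self.stored = set()  # hashes whose contents are fully uploaded

    def announce(self, path: str, digest: str) -> bool:
        """Record the intended upload; return True if the bytes still need uploading."""
        self.entries[path] = digest
        return digest not in self.stored

    def complete_upload(self, digest: str) -> None:
        """Mark the contents for this hash as present in the cloud."""
        self.stored.add(digest)


def file_digest(data: bytes) -> str:
    """Content hash the client sends before any file bytes (SHA-256 assumed)."""
    return hashlib.sha256(data).hexdigest()
```

The index learns about a file as soon as `announce` is called, but knowing the hash does not settle which of several LAN clients holds the authoritative version, which is the complication raised above.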
It does do that; I don't understand what you think it'll fix when it comes to multiple clients.

Do you want Dropbox to make a ghost entry with the contents and the hash, then start transferring it over LAN right away? That's going to be confusing when a ghosted file doesn't behave the same as a normal unsynced file. Especially if you start doing collaborative work and the version history is broken because that's cloud-based.

Note that technically the file contents don't need to be in the cloud, but rather the metadata and indexing do. I'm guessing that the indexing is how Dropbox breaks the file into blocks, checksums them and does some form of duplicate detection to save on storage costs.
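To illustrate the kind of indexing being guessed at here, a rough sketch of fixed-size block splitting with per-block checksums and duplicate detection. The 4 MiB block size, the use of SHA-256, and the function names are all assumptions for illustration. Dropbox's real scheme may differ.

```python
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # assumed block size, purely illustrative


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Split data into fixed-size blocks and return each block's SHA-256 hex digest."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def blocks_to_upload(data: bytes, known: set[str],
                     block_size: int = BLOCK_SIZE) -> list[int]:
    """Return indices of blocks whose hash is not already in the (hypothetical) index.

    A block repeated within the same file is only listed once.
    """
    seen = set(known)
    missing = []
    for index, digest in enumerate(block_hashes(data, block_size)):
        if digest not in seen:
            missing.append(index)
            seen.add(digest)
    return missing
```

Only the block hashes would need to reach the server for the dedup check. The block contents themselves are sent only for the indices that come back as missing.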