(First off, as a disclaimer, I no longer work for Dropbox, I don't speak on their behalf. I've only used the feature as a user.)
I don't know of a common search/find system that open()s or read()s files during the search by default. AFAIK Spotlight and Windows search are indexed searches. As for the indexing operations, I don't know how that is handled. They could disable indexing for remote files, or they could somehow integrate with indexing.
Based on my testing of a pre-released version of the feature (it isn't released yet), if you were to do something like `find ~/Dropbox -type f -exec md5 {} +`, it would download files.
As a user it did exactly what I expected. I was truly amazed. It was totally seamless and amazing.
Compared to the complexity of what has already been implemented, solving the problem of "I want to recursively open/read every file in my Dropbox, but I don't want it to download terabytes of data and fill my hard drive" seems fairly simple. For example, there could be a setting for the maximum amount of space Dropbox will use up, e.g., 40 GB, plus Dropbox could be smart enough to detect disk usage. If you `grep -R`, it may download/open/read the files; once you reach 40 GB or get near your disk capacity, Dropbox could start removing local copies of files that are not pinned to be local, i.e., remove the files that were downloaded because of the open()/read(), not the files you explicitly told it to keep local. I don't know how the team will choose to implement these features, but I'm confident that it will be well-thought-out and tested.
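To make the quota idea concrete, here is a toy sketch in Python. It is purely my own illustration of the policy described above, not how Dropbox implements anything: the `LocalCache` class, its method names, and the least-recently-used eviction order are all assumptions. Files opened on demand are tracked in access order, pinned files are never evicted, and once usage exceeds the quota the oldest on-demand copies are dropped.

```python
from collections import OrderedDict


class LocalCache:
    """Toy model of on-demand local copies under a space quota (hypothetical)."""

    def __init__(self, quota_bytes):
        self.quota_bytes = quota_bytes
        self.pinned = {}                 # path -> size; explicitly kept local, never evicted
        self.on_demand = OrderedDict()   # path -> size; least recently opened first

    def used_bytes(self):
        return sum(self.pinned.values()) + sum(self.on_demand.values())

    def pin(self, path, size):
        # The user explicitly asked to keep this file local.
        self.on_demand.pop(path, None)
        self.pinned[path] = size

    def open(self, path, size):
        # Simulates an open()/read() that forces a download.
        if path in self.pinned:
            return
        self.on_demand.pop(path, None)   # re-opening refreshes recency
        self.on_demand[path] = size
        self._evict()

    def _evict(self):
        # Drop the oldest on-demand copies until we fit the quota,
        # but always keep the file that was just opened.
        while self.used_bytes() > self.quota_bytes and len(self.on_demand) > 1:
            self.on_demand.popitem(last=False)


cache = LocalCache(quota_bytes=100)
cache.pin("taxes.pdf", 40)
for name in ["b.txt", "c.txt", "d.txt"]:   # e.g. a `grep -R` walking the folder
    cache.open(name, 30)
print(sorted(cache.on_demand), cache.used_bytes())
```

With a 100-byte quota and a 40-byte pinned file, the third 30-byte open pushes usage over the limit, so the oldest on-demand copy is removed while the pinned file stays put.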
Remember, Dropbox is the company that especially monkey patched the Finder to get the sync icons (http://mjtsai.com/blog/2011/03/22/disabling-dropboxs-haxie/). They will go to great lengths for a seamless user experience, and do a ton of testing. I have no doubt that when Project Infinite is widely available it will be amazing, seamless, and have functionality many people thought wasn't possible or only dreamed existed.
Valid question. I wouldn't be happy if I ctrl-f on "My Documents", do a search and a 1 TB download starts up invisibly in the background filling my hard drive.
I suppose any company that is giving all their encrypted data to Dropbox to begin with may be OK with it. But most companies are already sketched out by the mere fact that their data is accessible to anyone outside the company.
In any event, if they were to index and provide search as a service as well, I wouldn't think it's something they would do quietly. It would most likely include its own huge marketing campaign.
Could Dropbox detect repeated access patterns from the same process, and/or whitelist processes as known "searchers," and start returning blank files? This seems like the kind of problem only a unicorn would dare to tackle, but as luck would have it...?
You want to save space by not having data on your local system but use a local search to look in the contents of files not on that system? You can't have your cake and eat it too.
I believe this is not the case here. In order for the files to start taking space on your drive you would actually need to right click that folder and choose "Save a local copy".
This is what I was wondering - we'd have to be careful writing a script that happened to traverse into the dropbox folder, because it might try to inflate all the files. It still seems like a cool idea, but I wonder if they have a workaround.
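One defensive option on the script side, independent of whatever Dropbox ends up doing, is to prune the Dropbox folder from the traversal so nothing inside it ever gets opened. A minimal Python sketch, assuming the folder lives at `~/Dropbox`; the `walk_skipping` helper is a made-up name for illustration:

```python
import os


def walk_skipping(root, skip_dirs):
    """Yield file paths under root without descending into any of skip_dirs."""
    skip = {os.path.realpath(os.path.expanduser(d)) for d in skip_dirs}
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place tells os.walk not to descend into them.
        dirnames[:] = [
            d for d in dirnames
            if os.path.realpath(os.path.join(dirpath, d)) not in skip
        ]
        for name in filenames:
            yield os.path.join(dirpath, name)


if __name__ == "__main__":
    home = os.path.expanduser("~")
    for path in walk_skipping(home, ["~/Dropbox"]):
        pass  # open/read/hash the file here; nothing under ~/Dropbox is touched
```

This only lists directory entries outside the skipped folder, so it should not trigger any on-demand downloads by itself.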
Spotlight is enabled by default _and_ left enabled on basically all Macs, and Mac users are actually a big userbase for Dropbox. It is very unlikely the Dropbox team will forget that Spotlight indexing is running in the background.
That does not mean the files will get indexed, but there is no chance that Spotlight will trigger an unexpected terabyte download in the background.