It looks like this doesn't work with folders... or have some kind of hash to detect alteration. While I appreciate you sharing the project, this is probably premature at this point.
Here are some things that I would consider adding:
* a README (update: done! :) )
* consider making this work with virtualenv by default. This way you can install
Flask as a library without making me install it for the system. Adding a short
bash wrapper script really helps here.
* support for folders
* sha1 hashes to ensure your files are consistent
* de-duplication of chunks would be a good thing to add too
* The way you are sending data now looks like you are using GETs for everything.
If you are really going to make this work, you should use GETs for downloading and PUTs or POSTs for uploading.
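To make the hashing and de-duplication suggestions concrete, here is a minimal sketch, not the project's actual code. It assumes fixed-size chunks and an in-memory dict as the chunk store; `store_file`, `load_file`, and `CHUNK_SIZE` are hypothetical names made up for illustration.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # hypothetical chunk size (4 MiB)


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of data as a hex string."""
    return hashlib.sha1(data).hexdigest()


def store_file(data: bytes, chunk_store: dict, chunk_size: int = CHUNK_SIZE) -> list:
    """Split data into fixed-size chunks and store each distinct chunk once.

    Returns a manifest: the ordered list of chunk hashes that make up the file.
    """
    manifest = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = sha1_hex(chunk)
        # De-duplication: a chunk that is already stored is not stored again.
        chunk_store.setdefault(digest, chunk)
        manifest.append(digest)
    return manifest


def load_file(manifest: list, chunk_store: dict) -> bytes:
    """Reassemble a file from its manifest, verifying every chunk's hash."""
    parts = []
    for digest in manifest:
        chunk = chunk_store[digest]
        if sha1_hex(chunk) != digest:
            raise ValueError("chunk %s failed its integrity check" % digest)
        parts.append(chunk)
    return b"".join(parts)
```

The same manifest can double as a cheap change detector: if the list of hashes on the client matches the one on the server, nothing needs to be transferred.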
As soon as I clicked through to the source and saw that no binary diffing was being done on files, I realized it was a toy/learning project, but my reaction was "Hell yeah!" anyway. Keep it up.
This has some fairly serious security issues, which is fine for something not designed to be seriously used (or used at all). However, the readme implies you could use this and your files would be safer than with some third party, which is dangerous, to say the least.
I'll outline a few obvious issues I see:
- No explicit protection against directory traversal attacks (../../etc/passwd type stuff) on upload and download.
- Shell command injection on the file name on upload.
- Naive authentication.
- Unsalted, fast hash sent in the URL.
- Password stored in clear text server side.
- No transport security (HTTPS).
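As a rough illustration of how two of these could be addressed (directory traversal and the unsalted fast hash), here is a sketch using only the standard library. It is not taken from the project: `safe_path`, `hash_password`, `verify_password`, and the iteration count are assumptions chosen for the example, and PBKDF2 is just one reasonable slow hash among several.

```python
import hashlib
import hmac
import os
from pathlib import Path


def safe_path(root: str, user_path: str) -> Path:
    """Resolve user_path under root, rejecting anything that escapes root."""
    root_path = Path(root).resolve()
    candidate = (root_path / user_path).resolve()
    # After resolving "..", symlinks, and absolute paths, the result must
    # still be the root itself or live somewhere below it.
    if candidate != root_path and root_path not in candidate.parents:
        raise ValueError("path escapes the storage root: %r" % user_path)
    return candidate


def hash_password(password: str, salt: bytes = None, iterations: int = 200_000):
    """Return (salt, digest) using PBKDF2-HMAC-SHA256 with a random salt."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes,
                    iterations: int = 200_000) -> bool:
    """Recompute the digest and compare it in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)
```

The server would store only the salt and digest, never the password itself. For the shell injection issue, the usual fix is to avoid building shell strings from file names at all and use Python's own file APIs instead.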
This is cool as an interesting project to work on, but it should be made clear not to use this for anything just yet.
You have to realize this is a very early release! The first working release! I didn't take anything else into consideration, but I will. Thanks for the comment :D
For efficient diffs, I suggest the author (and anyone else who cares to explore an efficient diff-based backup of lots of data) look into rolling hashes. They are a great way to split data into chunks that are robust against insertions in the middle of a file! (Uniformly splitting a file every X bytes means you resync everything after the insertion point, because every later chunk boundary shifts.)
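A small sketch of the idea, assuming a simple polynomial (Rabin-Karp style) rolling hash; the window size, base, modulus, and mask below are arbitrary illustrative values, and `cdc_chunks` is a hypothetical name. A chunk boundary is declared wherever the hash of the last `WINDOW` bytes has its low bits all zero, so boundaries depend on nearby content rather than on absolute offsets.

```python
from typing import List

WINDOW = 16            # bytes covered by the rolling hash (illustrative)
BASE = 257             # polynomial base (illustrative)
MOD = (1 << 31) - 1    # modulus, a Mersenne prime
MASK = (1 << 6) - 1    # cut when the low 6 bits are zero: ~64-byte chunks on average


def cdc_chunks(data: bytes, window: int = WINDOW, mask: int = MASK) -> List[bytes]:
    """Split data at content-defined boundaries using a rolling hash."""
    chunks = []
    start = 0
    h = 0
    # Weight of the oldest byte in the window, needed to remove it again.
    top = pow(BASE, window - 1, MOD)
    for i, byte in enumerate(data):
        if i >= window:
            # Slide the window: drop the byte that just fell out of it.
            h = (h - data[i - window] * top) % MOD
        h = (h * BASE + byte) % MOD
        # Only cut once the window is full, and only where the hash matches.
        if i + 1 >= window and (h & mask) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks
```

Because each boundary depends only on the preceding `WINDOW` bytes, inserting a few bytes in the middle of a file changes the chunks around the insertion and leaves the rest identical, so only those few chunks need to be re-uploaded. Real tools typically also enforce minimum and maximum chunk sizes, which this sketch leaves out.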
We're talking about 60 lines of python for the "server" and 90 lines of python for the "client".
That's a weekend hack, not a Dropbox clone.
It has happened so many times that I'm patenting a method of getting on the first page of HN:
1. Write something
2. Call it a dropbox clone
3. Congratulations, apparently no-one bothers to check if that "something" does anything even remotely similar to dropbox:
- does it have a GUI client with a highly polished interface?
- does it have an installer?
- does it have conflict resolution to reconcile changes made on different computers?
- does it try to not corrupt files if download/upload fails in the middle?
- is it self-contained or do you first have to install, say, the Python interpreter?
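To give a sense of what just one of these items involves, here is a minimal sketch of not corrupting a file when a transfer dies halfway: write to a temporary file in the same directory, then rename it over the target. `atomic_write` is a hypothetical helper, not something from the project.

```python
import contextlib
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Write data to path so readers see either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must be on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".partial-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # swap the finished file into place
    except BaseException:
        # Clean up the partial file; the original at path is left untouched.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_path)
        raise
```

If the download is interrupted before `os.replace`, the previous version of the file is still intact.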