by mjtokelly 4949 days ago
Direct uploading like this has been possible for at least two years. However, in our tests it turned out that Amazon was silently losing some files >100MB (and almost all files ~1GB).

On the AWS forums it was clear that AWS was aware of this consistent problem, but was not able to fix it, and not willing to document it.

They eventually released "multipart upload", which remains the only reliable way to get large files to S3. Unfortunately, multipart upload is nearly impossible to implement as a web app (short of resorting to e.g. Java).

5 comments

>Unfortunately, multipart upload is nearly impossible to implement as a web app

I am writing a library right now to do just that. I already have a prototype working, and we'll probably open-source the library once it's reasonably tested. But perhaps you know something I don't... is there anything in particular that makes it "nearly impossible"?

Sounds like my understanding must be out of date as of this year! (Which makes me very excited about your upcoming library.) My conclusion was based on research ~1 year ago finding no OSS or commercial software supporting it, and forum posts for those projects by developers sadly explaining why they just couldn't do multipart yet.

My bad experience with files disappearing was entirely with web-based POST requests, ~2 years ago. Large file transfer from EC2 to S3 was reliable, but our POST requests on slower connections (even those that were very reliable) would return with a false report of success.

Direct uploading like this has not been possible through XHR, as CORS support was only added this past year. However, you could work around that by handling the CORS headers through a proxy, using Flash, or a couple of other methods.

You could, however, perform a traditional POST.

Multipart is not at all impossible from a webapp (depending on your browser requirements). FileReader API is available across all browsers (except IE < version 10, which I admit is a big problem). For IE, you can resort to some other solution (perhaps Silverlight, or Flash?)
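
To make the browser side of this concrete, here is a minimal sketch of the part-planning step, assuming a client that cuts the file into byte ranges and uploads each range as one part. The names `planParts` and `PartRange` are hypothetical and come from no particular library. The 5 MiB minimum part size (for every part but the last) and the 10,000-part limit are S3's documented multipart constraints. Initiating the upload, signing requests, sending each part, and completing the upload with the collected ETags are all left out.

```typescript
// S3 multipart limits: every part except the last must be at least 5 MiB,
// and one upload may contain at most 10,000 parts.
const MIN_PART_SIZE = 5 * 1024 * 1024;
const MAX_PARTS = 10_000;

// Hypothetical type for illustration: one byte range of the source file.
interface PartRange {
  partNumber: number; // 1-based, as S3 part numbers are
  start: number;      // inclusive byte offset
  end: number;        // exclusive byte offset
}

// Hypothetical helper: split a file of `fileSize` bytes into upload parts.
function planParts(fileSize: number, partSize: number = MIN_PART_SIZE): PartRange[] {
  if (!Number.isInteger(fileSize) || fileSize <= 0) {
    throw new Error("fileSize must be a positive integer");
  }
  if (!Number.isInteger(partSize) || partSize < MIN_PART_SIZE) {
    throw new Error("partSize must be an integer of at least 5 MiB");
  }
  // Grow the part size when the file would otherwise need more than MAX_PARTS parts.
  const effectiveSize = Math.max(partSize, Math.ceil(fileSize / MAX_PARTS));
  const parts: PartRange[] = [];
  let partNumber = 1;
  for (let start = 0; start < fileSize; start += effectiveSize) {
    parts.push({ partNumber, start, end: Math.min(start + effectiveSize, fileSize) });
    partNumber += 1;
  }
  return parts;
}
```

In a browser, each range would typically become a request body via `file.slice(part.start, part.end)`. Because parts are independent, a failed part can be retried on its own, which is also what makes resuming a partial upload feasible.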

I will second this. At the bottom of this article there is a reference to the article I wrote a while ago when Amazon released this feature. I was planning on doing multipart uploading at that time using the FileReader, but there was a bug in the way S3 did CORS, so I didn't want to continue until that was fixed. They fixed it and I never came back to it. Might be a good time to try it again. Resuming a partial upload seemed like a good win to me.

Our video bucket is a few TB in size right now, with file sizes ranging from 10MB to 1GB. Not one file or upload has broken so far. So I guess they fixed it.

What? Why is it hard? I just wrote a web app a couple weeks ago to enable my co-workers to upload large files (.5-1.5GB) to S3. Like, I don't doubt you, I'm just curious why my experience was so different than what you describe.