by cobbzilla 3584 days ago
I'm curious if you can share how to synchronize N files without doing at least N comparisons.

The main innovations in s3s3mirror are (1) understanding this and going for massive parallelism to speed things up, and (2) where possible, comparing ETag/metadata instead of all bytes.

So far, it has scaled pretty well; I know of no faster tool to synchronize buckets with millions of objects.
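
To make the two ideas above concrete (parallel per-object checks, and comparing ETag/metadata rather than bytes), here is a rough Python sketch. It is not s3s3mirror's actual code: `ObjectMeta`, `needs_copy`, and `keys_to_sync` are hypothetical names, and plain in-memory dicts stand in for the source and destination bucket listings. In a real tool each check would be a network call, which is where the thread pool would pay off.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, NamedTuple, Optional


class ObjectMeta(NamedTuple):
    """Hypothetical per-object metadata: the ETag and size only, never the bytes."""
    etag: str
    size: int


def needs_copy(key: str,
               source: Dict[str, ObjectMeta],
               dest: Dict[str, ObjectMeta]) -> bool:
    """Return True if the object is missing from dest or its metadata differs."""
    src = source[key]
    dst: Optional[ObjectMeta] = dest.get(key)
    return dst is None or dst != src


def keys_to_sync(source: Dict[str, ObjectMeta],
                 dest: Dict[str, ObjectMeta],
                 workers: int = 32) -> List[str]:
    """Still N comparisons for N objects, but spread across a worker pool."""
    keys = sorted(source)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so flags line up with keys.
        flags = list(pool.map(lambda k: needs_copy(k, source, dest), keys))
    return [k for k, flag in zip(keys, flags) if flag]


if __name__ == "__main__":
    source = {
        "a": ObjectMeta("e1", 10),
        "b": ObjectMeta("e2", 20),
        "c": ObjectMeta("e3", 30),
    }
    dest = {
        "a": ObjectMeta("e1", 10),   # identical, skipped
        "b": ObjectMeta("old", 20),  # ETag differs, needs copy
    }
    print(keys_to_sync(source, dest))
```

Note that the sketch still does one comparison per source object; the only savings are that the comparisons run concurrently and touch metadata rather than object contents.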

1 comment

Sorry, I should perhaps have put a disclaimer in my original comment. I work for a company called StorReduce and built our replication feature* (effectively an intelligent, continuous "sync"). We currently have a patent pending for our method, so unfortunately I'm not sure I can offer any real insight.

I haven't looked at your project, but based on what you've said I agree the way you're doing it is conceptually as fast as it can be (massively parallel and leveraging metadata) whilst being a general purpose tool that "just works" and has no external dependencies or constraints.

* http://storreduce.com/blog/replication/