by sam_goody 703 days ago
How can I be confident that everything was synced correctly? Is there a way to compare the SHA or whatever key S3 provides?

Also, would this work well when there is not a lot of room on the disk it is syncing from? I have had serious issues with the S3 CLI in such a scenario.

Also, how would this compare to something like rclone?

1 comments

File size is easy to compare, so you at least know that the full file got synced. Hashes are a bigger issue with AWS in general: all you get from S3 is an ETag, which is the MD5 of the file for plain PUTs but not for multipart uploads, where it also depends on the part size. So you can't easily check a file for hash equality.
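
For what it's worth, if you know the part size an upload used, you can usually reproduce the ETag locally. A rough sketch in Python, assuming the commonly observed scheme (MD5 of the concatenated per-part MD5 digests, then a dash and the part count) and an object that is not encrypted with SSE-KMS or SSE-C. `expected_s3_etag` is a made-up helper name, not part of any AWS SDK:

```python
import hashlib
import io
from typing import BinaryIO, Optional


def expected_s3_etag(stream: BinaryIO, part_size: Optional[int] = None) -> str:
    """Guess the ETag S3 would report for the bytes in `stream`.

    part_size=None  -> treat it as a single PUT (plain hex MD5).
    part_size=N     -> treat it as a multipart upload with N-byte parts.
    Assumes a non-empty object and the same part size as the real upload.
    """
    if part_size is None:
        md5 = hashlib.md5()
        # Read in 1 MiB chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: stream.read(1024 * 1024), b""):
            md5.update(chunk)
        return md5.hexdigest()

    part_digests = []
    while True:
        part = stream.read(part_size)
        if not part:
            break
        # Keep the raw (binary) digest of each part, not the hex string.
        part_digests.append(hashlib.md5(part).digest())

    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return f"{combined}-{len(part_digests)}"


if __name__ == "__main__":
    data = io.BytesIO(b"hello")
    print(expected_s3_etag(data))                 # single PUT style
    data.seek(0)
    print(expected_s3_etag(data, part_size=4))    # multipart style, 2 parts
```

Compare the result with the ETag from the object listing (minus the surrounding quotes). If the part size is unknown or varied between parts, this will not match even when the data is identical.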

The good news is that with S3 over HTTP you should not really run into byte flip issues.

The sync server does not need any file system storage: it processes all uploads in memory and only ever buffers 5 MB per worker for multipart uploads.

rclone looks like a good alternative, but it does not have the same focus on fast iterations, e.g. for daily backups of huge buckets.