by llimllib 2480 days ago
An extremely naive program can sha1 hash 1 million 100 byte strings on my computer in less than half a second: https://gist.github.com/llimllib/72f60aa33b32e422962d876ddf0...

This is literally the first program I came up with, no attempt to optimize it at all.
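For anyone who wants to reproduce something like this without opening the gist, here is a minimal sketch of the same kind of naive benchmark. It is not the gist's code: the function name `hash_many` and its parameters are made up for illustration, and timings will vary by machine.

```python
import hashlib
import os
import time


def hash_many(count=1_000_000, size=100):
    """SHA-1 hash `count` random strings of `size` bytes each.

    Hypothetical helper for illustration; returns (number hashed, seconds).
    """
    # Build the inputs first so only the hashing itself is timed.
    data = [os.urandom(size) for _ in range(count)]
    start = time.perf_counter()
    hashed = 0
    for item in data:
        hashlib.sha1(item).digest()
        hashed += 1
    elapsed = time.perf_counter() - start
    return hashed, elapsed


if __name__ == "__main__":
    n, secs = hash_many()
    print(f"hashed {n} strings in {secs:.3f}s")
```

No batching, no threads, no tuning; it only shows how little work is involved per string.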

There is zero chance that the AWS sync command is filling my CPU just by hashing bytes.

edit: I'm going to try not to let you nerd-snipe me into doing the profiling that the AWS CLI team should be doing themselves, because that's not what I want to do.

1 comment

so 200 megabytes/second? I'm not sure what your definition of large files is, but with any modern SSD, hashing anything sizable with SHA-1 is easily CPU-bound unless the processor has the SHA instruction extensions.
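To make the arithmetic explicit: 1,000,000 strings × 100 bytes is 100 MB, and 100 MB in 0.5 s is 200 MB/s. A rough sketch for measuring single-core SHA-1 throughput on your own machine, to compare against your disk's read speed; `sha1_throughput_mib_s` is a made-up name and the buffer sizes are arbitrary assumptions:

```python
import hashlib
import time


def sha1_throughput_mib_s(total_mib=64, chunk_kib=64):
    """Feed `total_mib` MiB of zero bytes through SHA-1 in chunks; return MiB/s.

    Hypothetical helper for illustration; measures hashing only, no disk I/O.
    """
    chunk = bytes(chunk_kib * 1024)
    chunks = (total_mib * 1024) // chunk_kib
    h = hashlib.sha1()
    start = time.perf_counter()
    for _ in range(chunks):
        h.update(chunk)
    elapsed = time.perf_counter() - start
    return total_mib / elapsed


if __name__ == "__main__":
    # Throughput implied by the benchmark above, in MB/s.
    print(1_000_000 * 100 / 0.5 / 1e6, "MB/s implied by the benchmark")
    print(f"{sha1_throughput_mib_s():.0f} MiB/s measured on one core")
```

If the measured number is below what your SSD can read sequentially, hashing is the bottleneck rather than the disk.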

that being said, a quick glance at the source suggests that awscli's s3 sync only compares files by size & timestamp, not ETag, so it's not hashing anything client-side.
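A size-and-timestamp check of that kind might look roughly like the sketch below. This is not awscli's code: `needs_upload` and its parameters are hypothetical, and the exact rule (upload if the size differs or the local file is newer) is an assumption for illustration.

```python
import os


def needs_upload(local_path, remote_size, remote_mtime):
    """Return True if the local file looks different from the remote copy.

    Hypothetical helper: compares only size and modification time,
    never file contents, so no hashing is involved.
    """
    st = os.stat(local_path)
    if st.st_size != remote_size:
        return True
    # Assumed rule: a newer local timestamp means the file changed.
    return st.st_mtime > remote_mtime
```

The point is that a check like this costs one `stat` call per file, which is nowhere near enough work to saturate a CPU.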