by johnisgood 2112 days ago
I know, but according to the graphs, it is much faster than SHA, despite hardware acceleration, so I am not sure.
1 comment

That depends on the platform, the size of the input, and whether multithreading is used.

On Ice Lake, where BLAKE3 benefits from AVX-512 and SHA-256 benefits from the SHA extensions, BLAKE3 seems to do better on both long and short messages. But maybe surprisingly, SHA-256 does better in a medium-length regime, where SHA-256's poorer startup time* has been mostly amortized out, but BLAKE3's chunk parallelism hasn't yet kicked in. See for example the 1536-byte results here: https://bench.cr.yp.to/results-hash.html#amd64-icelake . Using multithreading would exaggerate BLAKE3's advantage for very long messages (usually about 1 MiB and above), but it wouldn't improve the results for any of the message lengths measured there.

* I don't actually know where SHA-256's startup overhead comes from. Maybe someone who knows more could jump in here?

On ARM chips, the performance benefits of NEON are less dramatic than those of AVX-512, and the performance advantage of SHA-256 hardware acceleration is comparatively larger. I think it's rare for BLAKE3 to beat accelerated SHA-256 on ARM without at least some multithreading, but I've only personally benchmarked a few Raspberry Pis, and I want to be careful not to overgeneralize.
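Since the answer depends so much on platform and input size, the safest thing is to measure on your own hardware. Here's a rough single-threaded sketch in Python, not a rigorous benchmark. Assumptions: the helper names (throughput_mib_s, candidates, run) are made up for illustration; SHA-256 comes from the standard hashlib module, and whether it actually uses hardware SHA instructions depends on how your Python was built; BLAKE3 is only measured if the third-party blake3 package from PyPI happens to be installed. The sizes are arbitrary, with 1536 bytes included because it's the medium-length case mentioned above.

```python
import hashlib
import time

try:
    # Assumption: the third-party `blake3` package from PyPI is installed.
    from blake3 import blake3
except ImportError:
    blake3 = None  # BLAKE3 is skipped if the package is missing.


def throughput_mib_s(hash_fn, size, min_seconds=0.2):
    """Return approximate throughput in MiB/s for hashing `size`-byte inputs."""
    data = bytes(size)  # all-zero input; content does not affect hashing speed
    hash_fn(data)       # one warm-up call before timing
    count = 0
    start = time.perf_counter()
    while True:
        hash_fn(data)
        count += 1
        elapsed = time.perf_counter() - start
        if elapsed >= min_seconds:
            break
    return (count * size) / elapsed / (1024 * 1024)


def candidates():
    """Map a label to a one-shot hashing function for each available hash."""
    fns = {"sha256": lambda d: hashlib.sha256(d).digest()}
    if blake3 is not None:
        fns["blake3"] = lambda d: blake3(d).digest()
    return fns


def run(sizes=(64, 1536, 65536, 1 << 20), min_seconds=0.2):
    """Measure every available hash at every input size."""
    return {
        name: {n: throughput_mib_s(fn, n, min_seconds) for n in sizes}
        for name, fn in candidates().items()
    }


if __name__ == "__main__":
    for name, by_size in run().items():
        for n, rate in by_size.items():
            print(f"{name:8s} {n:>8d} bytes  {rate:10.1f} MiB/s")
```

Take the small-input numbers with a grain of salt: at 64 bytes the Python call overhead probably dominates the hashing itself, so a native benchmark harness is the better tool for short messages. This also says nothing about multithreading, which only matters for very long inputs anyway.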