Hacker News
by pbsd 4242 days ago
The theoretical maximum for current chips is less than 16 bytes per cycle. On Haswell you can process 7 blocks in parallel in roughly the time it would take to process 1. Each round has a latency of 7 cycles, so a full 10-round AES-128 takes ~70 cycles. With 7 blocks (112 bytes) in flight, you can effectively process at most 1.6 bytes per cycle, or 1.14 with 256-bit keys, which need 14 rounds (ignoring the cost of key scheduling and other overhead here).
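
For anyone who wants to check the arithmetic, here is a rough sketch in Python. The 7-cycle round latency and the 7 blocks in flight are the figures quoted above; the round counts (10 for AES-128, 14 for AES-256) come from the AES standard. The function and parameter names are made up for illustration, and key scheduling and other overhead are ignored, as in the estimate itself.

```python
AES_BLOCK_BYTES = 16  # AES always works on 128-bit (16-byte) blocks


def peak_bytes_per_cycle(rounds, round_latency_cycles=7, blocks_in_flight=7):
    """Upper bound on AES throughput for one core, in bytes per cycle.

    Hypothetical helper: assumes `blocks_in_flight` independent blocks are
    pipelined so the whole batch finishes in about the time one block needs,
    i.e. rounds * round_latency_cycles cycles.
    """
    cycles_per_batch = rounds * round_latency_cycles
    return blocks_in_flight * AES_BLOCK_BYTES / cycles_per_batch


aes128 = peak_bytes_per_cycle(rounds=10)  # 112 bytes / 70 cycles
aes256 = peak_bytes_per_cycle(rounds=14)  # 112 bytes / 98 cycles

print(f"AES-128: {aes128:.2f} bytes/cycle")  # prints "AES-128: 1.60 bytes/cycle"
print(f"AES-256: {aes256:.2f} bytes/cycle")  # prints "AES-256: 1.14 bytes/cycle"
```

Multiplying the result by clock frequency and core count gives a per-chip ceiling that can be compared against a given memory bandwidth figure.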

Even if you dedicate all CPU cores to the task of encrypting memory, you still fall well short of theoretical memory bandwidth.

2 comments

Do you believe it's reasonable to assume that AES performance will remain constant over the same 5-7 year timeframe? That's at least a couple of hardware generations for an improvement they could make in the current generation if there was a market for it.
There is certainly room for improvement, but I don't see a 16x speedup happening on a 5-year horizon using the current AES-NI instruction set.
Ah, I was forgetting about rounds. You're correct that you won't be able to match the memory bandwidth then.