by mhh__ 1926 days ago
At that throughput, the CPU is speculating and exploiting the access pattern.

It's also worth saying that if Apple were dead set on throughput in this area, they could've implemented some non-trivial fusion to improve performance. I don't have an M1, so I can't find out for you (and Apple are steadfast about not documenting anything about the microarchitecture...)