by ben-schaaf 51 days ago
> SIMD is tricky even with SoA because there is significant latency going between the general registers and the vector units

My experience is mostly limited to AMD64, but libraries like glibc use SIMD in many places for faster linear search. Presumably they've done testing and found it worthwhile.
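To make the linear search case concrete, here is a minimal sketch of the general technique. It is not glibc's actual code. It assumes SSE2 (baseline on AMD64) and falls back to a plain scalar loop elsewhere. The name `find_byte` is hypothetical, and `__builtin_ctz` assumes GCC or Clang.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 intrinsics */
#endif

/* Hypothetical helper, not a glibc function: returns the index of the
 * first byte equal to `needle`, or `len` if it does not occur. */
static size_t find_byte(const uint8_t *buf, size_t len, uint8_t needle)
{
    size_t i = 0;
#if defined(__SSE2__)
    /* Broadcast the needle into all 16 byte lanes. */
    const __m128i pattern = _mm_set1_epi8((char)needle);
    for (; i + 16 <= len; i += 16) {
        /* Unaligned load of 16 bytes, then a lane-wise equality compare. */
        __m128i chunk = _mm_loadu_si128((const __m128i *)(buf + i));
        __m128i eq = _mm_cmpeq_epi8(chunk, pattern);
        /* One vector -> general-register transfer per 16 bytes:
         * bit k of the mask is set if byte k matched. */
        int mask = _mm_movemask_epi8(eq);
        if (mask != 0)
            return i + (size_t)__builtin_ctz((unsigned)mask);
    }
#endif
    /* Scalar tail (or the whole search when SSE2 is unavailable). */
    for (; i < len; i++)
        if (buf[i] == needle)
            return i;
    return len;
}
```

The only crossing from the vector unit to the general registers is the `_mm_movemask_epi8` per 16-byte chunk, so whatever that transfer costs is paid once per 16 bytes compared rather than once per byte. Whether that is a net win on a given core is something you would have to measure.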

1 comments

Yeah, Arm little cores are a very different story: they aren't superscalar out-of-order architectures, and they can dispatch at most two operations per cycle.

Big cores are more like that, dispatching 8 or more operations per cycle, but they're also more expensive, larger, etc.