by eridius 2645 days ago
I don't know much about SIMD, but are there no alignment concerns with reinterpreting a slice as an AVX vector?

Unaligned and aligned loads of AVX vectors have basically the same performance since Ivy Bridge IIRC.
I was under the impression that unaligned ops ran at the same speed, but they used up more register ports, so it does reduce memory bandwidth between the register file and cache. Or does this no longer apply either?
My understanding is that the first unaligned load uses more register ports[0], but a second (third, etc.) contiguous load doesn't. IANA[Intel microarchitecture expert] though.

0: Or more memory bandwidth anyway.
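
For what it's worth, on the original alignment question: here is a minimal sketch of one way to sidestep it in Rust, assuming the slice is a `&[f32]`. `V8` is a hypothetical stand-in for a 256-bit vector (it is not a real `std::arch` type), and `sum_with_aligned_middle` is just an illustrative name. The standard `slice::align_to` splits the slice into an unaligned head, a 32-byte-aligned middle, and a tail, so only the middle gets reinterpreted.

```rust
// Hypothetical stand-in for a 256-bit AVX vector: 8 f32 lanes, 32-byte aligned.
#[repr(C, align(32))]
struct V8([f32; 8]);

/// Sums a slice, viewing only the 32-byte-aligned middle as `V8` chunks.
/// The unaligned head and the leftover tail are handled as plain scalars.
fn sum_with_aligned_middle(data: &[f32]) -> f32 {
    // SAFETY: V8 is just an array of f32, so every bit pattern is a valid V8.
    // align_to may return a shorter middle than the maximum possible, but the
    // three parts always cover the whole slice, so the result stays correct.
    let (head, body, tail) = unsafe { data.align_to::<V8>() };

    // Accumulate lane-wise over the aligned chunks.
    let mut lanes = [0.0f32; 8];
    for v in body {
        for (acc, x) in lanes.iter_mut().zip(v.0.iter()) {
            *acc += *x;
        }
    }

    // Fold in the elements that did not fit an aligned chunk.
    let scalar: f32 = head.iter().chain(tail.iter()).sum();
    lanes.iter().sum::<f32>() + scalar
}
```

Calling it on a sub-slice such as `&data[1..]` still works, because whatever doesn't line up with a 32-byte boundary simply ends up in the head or tail. Whether the aligned path is actually faster than plain unaligned loads is the hardware question discussed above, and this sketch doesn't settle it.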