by janwas
1143 days ago
Mostly agree, but there is actually a mismatch between madd_epi16 and Arm.
Implementing Arm semantics on x86, or x86 semantics on Arm, requires ~5 instructions, but if we generalize the definition to allow reordering (e.g. Highway's ReorderWidenMulAccumulate [1]), it's only 2 instructions.
[1] https://github.com/google/highway/blob/master/g3doc/quick_re...
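To make the mismatch concrete, here is a scalar C++ sketch (plain loops, not real intrinsics, and not Highway's actual implementation). MaddPairwise models the adjacent-pair sums of madd_epi16. MulAccHalves is a hypothetical name for one lane order that I assume falls out naturally from widening multiply-accumulates on the lower and upper halves, Arm style. The individual lanes differ, but the sum over all lanes is the same, and that is the only property a reorder-tolerant definition has to promise.

```cpp
#include <array>
#include <cstdint>
#include <numeric>

constexpr int kLanes = 8;  // eight int16 lanes, i.e. one 128-bit vector
using In = std::array<int16_t, kLanes>;
using Out = std::array<int32_t, kLanes / 2>;

// Models madd_epi16: each output lane is the sum of two ADJACENT products.
Out MaddPairwise(const In& a, const In& b) {
  Out r{};
  for (int i = 0; i < kLanes / 2; ++i) {
    r[i] = int32_t{a[2 * i]} * b[2 * i] + int32_t{a[2 * i + 1]} * b[2 * i + 1];
  }
  return r;
}

// Hypothetical model of a half-based lane order: widen and multiply the lower
// half, then accumulate the products of the upper half into the same lanes.
Out MulAccHalves(const In& a, const In& b) {
  Out r{};
  for (int i = 0; i < kLanes / 2; ++i) {
    r[i] = int32_t{a[i]} * b[i];                             // lower half
    r[i] += int32_t{a[i + kLanes / 2]} * b[i + kLanes / 2];  // upper half
  }
  return r;
}

// Horizontal sum, the only value both lane orders agree on.
int32_t SumOfLanes(const Out& v) {
  return std::accumulate(v.begin(), v.end(), int32_t{0});
}
```

A dot-product loop only needs the horizontal sum at the very end, so it does not care which of the two lane orders it gets. Code that reads individual lanes does care, and that is where emulating one platform's order on the other gets expensive.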
|
I agree it would perhaps be possible to find better semantics for SIMD that kinda gloss over all the differences. That would be cleaner but require a lot of names. Well I suppose that's what Highway does, isn't it?