by Const-me
122 days ago
> AVX2 level includes FMA (fast multiply-add)

FMA does not stand for fast multiply-add; it stands for fused multiply-add. Fused means the instruction computes the entire a * b + c expression using twice as many mantissa bits, and only then rounds the result to the precision of the arguments. It might be that the Prism emulator failed to translate FMA instructions into a pair of FMLA instructions (the equally fused ARM64 equivalent) and instead emulated the fused behaviour some other way, which in turn degraded the performance of the AVX2 emulation.
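To make the fused versus unfused distinction concrete, here is a minimal Python sketch. The function names are hypothetical, and the exact-rational arithmetic via `fractions.Fraction` is only a model of "compute exactly, round once"; it is not how hardware implements FMA, and it says nothing about what Prism actually does.

```python
from fractions import Fraction


def unfused_multiply_add(a: float, b: float, c: float) -> float:
    """Separate multiply and add: the result is rounded twice."""
    product = a * b      # first rounding, to double precision
    return product + c   # second rounding


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Model of a fused multiply-add: exact a * b + c, rounded once."""
    # Every finite double converts to a Fraction exactly, so this sum
    # carries no rounding error until the final float() conversion.
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# Inputs chosen so that the exact product, 1 - 2**-54, is not representable
# as a double and rounds up to exactly 1.0.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

print(unfused_multiply_add(a, b, c))  # 0.0: the 2**-54 term was rounded away
print(fused_multiply_add(a, b, c))    # -2**-54, about -5.55e-17
```

The two results differ because the unfused version discards the low bits of the product before the addition gets a chance to cancel the high bits. An emulator has to reproduce the single-rounding result bit for bit, which is cheap when the host has its own fused instruction and expensive when it has to be simulated in software.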
I don't have insight into how Prism works, but I have wondered whether the right debugger could show the translated ARM code and let us see exactly what was going on.