by galangalalgol 1093 days ago
The optimizations both looked correct. Both told the compiler to target Broadwell. The fastest nbody was Rust, but it was non-portably using x86 intrinsics. Zig has explicit SIMD vectors in the stdlib and so did better than the portable explicit SIMD of the third-place Rust entry. However, Zig is using optimized float mode, equivalent to gcc's -ffast-math, so it is almost certainly getting the wrong answers since it didn't use the iterative sqrt trick. https://github.com/hanabi1224/Programming-Language-Benchmark...
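
For anyone unfamiliar with it: assuming the "iterative sqrt trick" means refining a cheap, approximate reciprocal square root with Newton-Raphson steps (my reading, the comment doesn't spell it out), a rough sketch looks like this. `rough_rsqrt`, `newton_step` and `rel_err` are hypothetical names, and the crude estimate is faked by throwing away mantissa bits rather than calling a real hardware approximation.

```rust
/// Deliberately crude estimate of 1/sqrt(x): compute in f32, then zero the
/// low 12 mantissa bits. A stand-in for a fast hardware approximation.
fn rough_rsqrt(x: f64) -> f64 {
    let y = 1.0_f32 / (x as f32).sqrt();
    f32::from_bits(y.to_bits() & 0xFFFF_F000) as f64
}

/// One Newton-Raphson step for f(y) = 1/y^2 - x.
/// Each step roughly squares the relative error of the estimate.
fn newton_step(x: f64, y: f64) -> f64 {
    y * (1.5 - 0.5 * x * y * y)
}

/// Relative error of `approx` against `exact`.
fn rel_err(approx: f64, exact: f64) -> f64 {
    ((approx - exact) / exact).abs()
}

fn main() {
    let x = 2.0_f64;
    let exact = 1.0 / x.sqrt();

    let mut y = rough_rsqrt(x);
    println!("estimate: rel err = {:e}", rel_err(y, exact));
    for i in 1..=2 {
        y = newton_step(x, y);
        println!("step {}:   rel err = {:e}", i, rel_err(y, exact));
    }
}
```

Skipping the refinement steps leaves you with whatever accuracy the initial estimate had, which is the concern with an unrefined fast-math result.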

Is Zig's optimized float mode also extremely error-prone like gcc's -ffast-math?

Reminder: With gcc/clang, -ffast-math makes it undefined behavior to run a calculation that results in an infinity or NaN. Due to the way UB works, the compiler can end up miscompiling not just the floating-point calculation, but also other code nearby (e.g. deleting array bounds checks).

This is why Rust does not have any fastmath-equivalent: it would allow violating memory safety in safe code.

Seems like there should just be a fastmath mode that doesn't consider it UB. IIUC most of the gains come from being able to assume addition, multiplication etc are associative and commutative.
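
To make the associativity point concrete, here is a small sketch (function names are made up for illustration). Floating-point addition is not associative, so splitting a sum across two accumulators, which is roughly what a vectorizer wants to do, can change the result. A compiler that respects strict IEEE semantics has to keep the sequential order.

```rust
/// Sum left to right, in the order the source code specifies.
fn sum_sequential(xs: &[f32]) -> f32 {
    xs.iter().fold(0.0_f32, |acc, &x| acc + x)
}

/// Sum with two independent accumulators (even and odd positions),
/// mimicking a 2-lane SIMD reduction. Only equivalent to the sequential
/// sum if + is treated as associative.
fn sum_two_lanes(xs: &[f32]) -> f32 {
    let mut lanes = [0.0_f32; 2];
    for pair in xs.chunks(2) {
        for (lane, &x) in lanes.iter_mut().zip(pair) {
            *lane += x;
        }
    }
    lanes[0] + lanes[1]
}

fn main() {
    // The exact sum is 2, but 1e8 + 1 rounds back to 1e8 in f32.
    let xs = [1.0e8_f32, 1.0, -1.0e8, 1.0];
    println!("sequential: {}", sum_sequential(&xs)); // prints 1
    println!("two lanes:  {}", sum_two_lanes(&xs)); // prints 2
}
```

Here the reordered sum happens to be the more accurate one. The point is only that the two orders disagree, so the compiler needs permission to pick.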
> so it is almost certainly getting the wrong answers

Not checking the correctness of the output sounds like a pretty bad oversight for a benchmark