by Retr0id 120 days ago
While it is definitely slop, I think the numbers may be "real" but compiler-dependent. The 20000x "speedup" presumably only happens when the compiler detects that it can optimize the whole algorithm into a nop, because it has no observable side effects. (I have not tested this hypothesis)
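To make that hypothesis concrete, here is a minimal sketch in C. Everything in it is hypothetical (`kernel`, `bench_discarded`, `bench_observed` and `sink` are made-up names, not taken from the benchmark); it only illustrates how a timed loop with no observable side effects can legally vanish, while one that stores its result through a `volatile` object cannot.

```c
#include <stddef.h>

/* Hypothetical stand-in for a benchmarked kernel: sums 0..n-1. */
static unsigned long kernel(size_t n) {
    unsigned long acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += i;
    return acc;
}

/* The result is discarded and nothing else escapes, so at -O1 and above
 * the compiler is allowed to delete both loops entirely. */
void bench_discarded(size_t reps, size_t n) {
    for (size_t r = 0; r < reps; r++)
        (void)kernel(n);
}

/* A volatile store is an observable side effect: one store per repetition
 * has to survive, although the inner loop may still be simplified. */
static volatile unsigned long sink;

unsigned long bench_observed(size_t reps, size_t n) {
    for (size_t r = 0; r < reps; r++)
        sink = kernel(n);
    return sink;
}
```

Timing `bench_discarded` against `bench_observed` at -O0 and at -O2 is one way to check whether a measured "speedup" is really just deleted work.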

On my system, under gcc 15.2.1, the two necessary factors to see the claimed speedup are:

- An optimization level of -O1 or higher
- The -ffinite-math-only flag, or any flag (e.g. -ffast-math) which implies this flag.

The benchmark's default weight value is `#define`d as `__builtin_inf()`, and this value is assigned in multiple places. This, of course, is concerning: under a flag that lets the compiler assume infinities never occur, it gives a very obvious means by which the benchmark might be completely optimized out, though a more careful analysis would be needed to explain why the Dijkstra and (Res) functions don't also get optimized out.
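For illustration, a rough sketch of the pattern being described, not the benchmark's actual code: `WEIGHT_UNSET`, `init_dist`, `relax` and `count_reached` are names invented here. It shows an infinity sentinel in a shortest-path style relaxation, plus a compile-time guard based on `__FINITE_MATH_ONLY__`, a macro that gcc and clang predefine and set to 1 under -ffinite-math-only or -ffast-math. The guard may not catch every flag combination, so treat it as a hint rather than a guarantee.

```c
#include <stddef.h>

/* Assumption: mirrors the benchmark's sentinel; the macro name is made up. */
#define WEIGHT_UNSET __builtin_inf()

/* Warn when the compiler has been told that infinities never occur. */
#if defined(__FINITE_MATH_ONLY__) && __FINITE_MATH_ONLY__
#warning "finite-math-only is on: the infinity sentinel below is unreliable"
#endif

/* Mark every vertex as unreached, except the source. */
void init_dist(double *dist, size_t n, size_t source) {
    for (size_t i = 0; i < n; i++)
        dist[i] = WEIGHT_UNSET;
    dist[source] = 0.0;
}

/* Edge relaxation u -> v with weight w; returns 1 if dist[v] improved. */
int relax(double *dist, size_t u, size_t v, double w) {
    if (dist[u] + w < dist[v]) {
        dist[v] = dist[u] + w;
        return 1;
    }
    return 0;
}

/* Counts vertices whose distance is no longer the sentinel. With
 * finite-math-only in effect, this comparison is exactly the kind of
 * test the compiler is permitted to assume away. */
size_t count_reached(const double *dist, size_t n) {
    size_t reached = 0;
    for (size_t i = 0; i < n; i++)
        if (dist[i] != WEIGHT_UNSET)
            reached++;
    return reached;
}
```

Without the fast-math style flags this behaves as expected; with them, every comparison against the sentinel is something the optimizer may treat as having a known outcome.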

For clang, the equivalent flags are:

- An optimization level of -Og or higher
- The -fno-honor-infinities flag, or any flag (e.g. -ffast-math) which implies this flag.

Notably, while the author enables LTO, and others have compiled with -march=native, neither flag is necessary (for me) to see the huge speedup, which on my machine peaks at over 1.2 million times.

Maybe, but I think the OP is submitting this in bad faith (or got utterly bamboozled by the AI). I tried with recent clang and the specified flags, and the behavior is the same.

(I think it's unlikely that it can be straight-up optimized out because it dirties the workspace thread local, and compilers generally don't optimize out writes to thread local variables.)
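A small sketch of that argument, with hypothetical names (`workspace`, `WORKSPACE_LEN` and `run_kernel` are not from the benchmark). It assumes a C11 compiler for `_Thread_local`. The point is only that stores to a thread-local object with external linkage are hard to prove dead from one translation unit; with LTO the compiler sees more, so this is not a guarantee.

```c
#include <stddef.h>

#define WORKSPACE_LEN 8

/* Hypothetical per-thread scratch buffer. External linkage matters here:
 * another translation unit could read it, so the stores below cannot be
 * proven dead by looking at this file alone. */
_Thread_local double workspace[WORKSPACE_LEN];

/* Accumulates a running sum and records each partial sum in the
 * thread-local workspace. Even if the caller ignores the return value,
 * the workspace writes remain visible to the rest of the thread. */
double run_kernel(size_t n) {
    double acc = 0.0;
    for (size_t i = 0; i < n && i < WORKSPACE_LEN; i++) {
        acc += (double)i;
        workspace[i] = acc;
    }
    return acc;
}
```

If `workspace` were declared `static` and never read, the compiler would have a much easier time dropping the writes, which is why the linkage of the real benchmark's thread local is worth checking.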