| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by zelphirkalt 17 days ago
	I am rather thinking, if one is so much faster, and they are truly equal, why is the compiler too stupid to convert one into the other?

3 comments

br121 17 days ago

It doesn't convert bogosort into heapsort either, despite the second being much faster than the first. I'm guessing that it's not that easy going from one to the other because the only thing they have in common is the output (and only after you have checked the last value), so if the transformation is not hard-coded into the compiler, the odds of it randomly discovering the optimization is close to zero

link

zelphirkalt 17 days ago

Yeah, I would expect such transformations to be implemented as optimizations. Just like maybe (the admitedly simpler):

    (+ ((lambda () 1)) ((lambda () 1))) -> (+ 1 1)

A syntactical transformation, where it is possible as an equivalent transformation.

I may be overlooking special cases, but I thought the compiler is smart enough to infer that the array elements are integers and that `<` will result in a boolean, which is just `0` and `1` and will understand that having only the `if` without `else` branch is equivalent in this case. Guess I was wrong and the compiler is not sophisticated in this specific way.

link

djray 16 days ago

They aren't equal, as the faster version does an unconditional memory write instead of only writing to the array if the condition is satisfied. The compiler is strictly forbidden from turning a conditional write into an unconditional one.

link

peterfirefly 15 days ago

The Linux kernel had problems years ago when gcc started to do exactly that in certain cases (because it screwed things up with task switches, interrupts, and SMP). It fairly quickly afterwards either stopped doing it entirely or got a switch that would stop it from doing it. Don't remember which.

link

flohofwoe 17 days ago

The two code snippets do different things, apples and oranges... e.g. the array modification in the second example needs to move in front of the if for the two snippets to behave identically. I bet then the compiler output is the same with -O1 or higher.

PS: e.g. note how bla() (first code snippet) and blob() (fixed second code snippet) have identical output (both are turned into the same 'branchless' code via a conditional 'setl' instruction), but the blub() function (original second code snippet) differs because that function has different behaviour:

https://www.godbolt.org/z/h9Kfbn5bc

TL;DR: most 'branchless advice' that only tinkers with language features (like "x = a ? b : c" instead of an if) is useless because to the optimizer passes both are the same thing (a condition).

When there's a difference in the generated code then it's usually a bug and the before-after code are not actually equivalent (like in the code examples above).

link

peterfirefly 15 days ago

I played a bit with your link. It depends on the compiler. x86_64 clang trunk indeed compiles the original first and your fixed second form to the exact same code. I tried a couple of msvc and gcc versions and they did not but they all made them both branchless.

link