|
|
|
|
|
by masklinn
1032 days ago
|
|
The go compiler might have a heuristic where a branch with IO is considered cold compared to a branch without. In the original, it essentially faces If <cond>:
Mov
Else:
Noop
Considering these branches unpredictable, it generates a CMOV.With If <cond>:
Mov
Else:
Print
It now considers the first branch hot and the second cold, and thus branch predication valuable, and generates a branch instead.Turns out for the use case choice (1) is a misfiring, as the branch is extremely predictable, so all the conditional move does is create an unnecessary data dependency. It’s not necessarily the wrong choice in general, as for unpredictable branches cmov will usually be a huge gain, they’ll incur a cycle or two of latency but save from 15+ cycles of penalty on a mispredicted branch (which if the prediction only works half the time is an average 7.5 cycles per iteration). You can find older posts which demonstrate that side of the coin e.g. https://owen.cafe/posts/six-times-faster-than-c/ |
|
Happens to be extremely predictable for this data. In general, over all possible inputs, it’s not extremely predictable.
If you assume all inputs are different (not something the compiler can assume, of course) the probability of having to update the max value goes down from 1 for the first iteration to 1/n for the last, so, possibly, the loop should be split into two halves. Go through the start of the sequence assuming the value needs updating more often than not, and switch to one where it doesn’t at some point.
For truly large inputs you could even add heuristics looking at how much room there is above the current maximum (in the limit, if you’ve found MAX_INT, you don’t have to look further)
Sorting programs used to have all kinds of such heuristics (and/or command line arguments) trying to detect whether input data already is mostly sorted/mostly reverse sorted, how many different keys there are, etc. to attempt avoiding hitting worst case behavior, but I think that’s somewhat of a lost art