by ncmncm 1650 days ago
Thank you. It appears GCC won't put two cmov writes to memory in the same basic block. Thus,

  void swap_if(bool c, int& a, int& b) {
    int ta = a, tb = b;
    a = c ? tb : ta;
    b = c ? ta : tb;
  }
is very slow under GCC when c is poorly predicted, as is typical when, e.g., partitioning for quicksort. But how well it is predicted depends on the input data.

[0] https://godbolt.org/z/j5W9dMjYE
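
One way to sidestep the ternaries altogether is to do the conditional swap with a bit mask instead of a select. The following is only a sketch under assumptions: the name swap_if_masked is made up for illustration, it is not what the linked Godbolt example does, and nothing guarantees the compiler keeps it branch-free, so the generated assembly still needs checking.

```cpp
// Hypothetical mask-based variant of swap_if (name chosen for illustration).
// Works on unsigned copies so the bit operations have well-defined results.
inline void swap_if_masked(bool c, int& a, int& b) {
    // mask is all ones when c is true, all zeros when c is false
    const unsigned mask = 0u - static_cast<unsigned>(c);
    const unsigned ua = static_cast<unsigned>(a);
    const unsigned ub = static_cast<unsigned>(b);
    // diff holds the differing bits of a and b, or zero when c is false
    const unsigned diff = (ua ^ ub) & mask;
    // XOR-ing with diff swaps the two values; XOR-ing with zero leaves them alone
    a = static_cast<int>(ua ^ diff);
    b = static_cast<int>(ub ^ diff);
}
```

Both stores happen unconditionally here; only the values stored depend on c, which is the property the ternary version was trying to get from cmov.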


To me it looks related to some other optimization pass (I don't know much about GCC's passes) rather than to writes to memory. Here are two writes, both using cmov, in different code: https://godbolt.org/z/n3dTrPo6e

Edit: compiling your code unmodified, but with `-Os`, also gives two cmovs: https://godbolt.org/z/r86azb7be

In my experience, the key is that you've triggered two decisions off of a single boolean condition. A single ternary will be a cmov, but two ternaries become a branch.

> but two ternaries become a branch

I need to use inline assembly because of this.
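
For reference, an inline-assembly version could look roughly like the following. This is a sketch, not the code the commenter actually uses: the name swap_if_asm is made up, it assumes GCC or Clang extended asm in the default AT&T syntax on x86-64, and it falls back to the plain ternary form elsewhere.

```cpp
// Hypothetical helper (name chosen for illustration): forces two cmov
// instructions via GCC/Clang extended asm on x86-64.
inline void swap_if_asm(bool c, int& a, int& b) {
    const int ta = a, tb = b;  // original values, read-only inputs
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    int na = ta, nb = tb;      // results, conditionally overwritten below
    const int ci = c;          // widen the flag to a 32-bit register
    asm("test %[c], %[c]\n\t"          // ZF = (c == 0)
        "cmovne %[tb], %[na]\n\t"      // if c != 0: na = tb
        "cmovne %[ta], %[nb]"          // if c != 0: nb = ta
        // early-clobber (&) keeps the outputs out of the input registers
        : [na] "+&r"(na), [nb] "+&r"(nb)
        : [c] "r"(ci), [ta] "r"(ta), [tb] "r"(tb)
        : "cc");
    a = na;
    b = nb;
#else
    // Portable fallback: the compiler is free to emit a branch here.
    a = c ? tb : ta;
    b = c ? ta : tb;
#endif
}
```

The early-clobber constraints matter because na is written before ta is read; without them the register allocator could legally place both in the same register.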