Hacker News new | ask | show | jobs
by mmozeiko 1646 days ago
Here are two cmov's with gcc: https://godbolt.org/z/j9bcv639c
2 comments

I'm not a programmer by profession and barely one by hobby. I know nothing about compiler internals but looking at the definition of a basic block from Wikipedia, "In compiler construction, a basic block is a straight-line code sequence with no branches in except to the entry and no branches out except at the exit." it seems that the ternary operators in "return (x ? a : b) + (y ? c : d);" although part of the same programming block would be distinct compiler basic blocks.
Indeed, it seems contradictory to be talking about converting multiple branches within a basic block. Or even a single branch, for that matter.

Perhaps they're talking about what ends up in a single basic block after the conversion to cmov? Still an odd way to describe it.

Basic blocks are how compilers "conceptualize" our code and start thinking about optimization.

Frankly, my recommendation to you is to ignore that stuff. You need like 3 or 4 years of school to reach that level. You should have studied easier graph optimization problems (ex: Finite Automatia) first, before dealing with code-optimization.

The school curriculum is typically Programming 101 -> Data Structures -> Algorithms -> Finite Automata (aka: Regex) -> Pushdown Automata (aka: context-free grammars) -> Turing Machines (aka: general purpose code) -> Compilers (which use Finite Automata, Pushdown Automata, and Turing Machines simultaneously).

Basic Blocks are the graph structure that compilers use to think about code. Its a lot of graph-theory built up on a lot of language theory.

------

That being said: if you're actually interested in this stuff, then feel free to explore. But just beware, you've stumbled upon a very complex subject that is easily 4th year undergrad or even graduate-school level.

With enough effort, you'll understand things. But this is no subject for beginners to go around exploring by themselves.

-------

That being said, I want to encourage you to explore the subject anyway. Just explore the subject with awareness that there's a lot to understand here.

If you want to skip the 3ish years of elementary data-structures / comp. sci theory... I suggest starting with Static Single Assignment and working forward from here.

https://en.wikipedia.org/wiki/Static_single_assignment_form

Thank you. It appears Gcc won't put two cmov writes to memory in a block. Thus,

  void swap_if(bool c, int& a, int& b) {
    int ta = a, tb = b;
    a = c ? tb : ta;
    b = c ? ta : tb;
  }
is very slow, under Gcc, when c is poorly predicted, as is typical when e.g. partitioning for quicksort. But how well it will be predicted depends on input data.

[0] https://godbolt.org/z/j5W9dMjYE

To me it looks like something related to some other optimization pass (I don't know much about gcc passes). But not related to writes to memory. Here are two writes both using cmov (on different code): https://godbolt.org/z/n3dTrPo6e

Edit: compiling your code without modifications, but with `-Os` also gives two cmov's: https://godbolt.org/z/r86azb7be

In my experience, the key is that you've triggered two decisions off of a single boolean condition. A single ternary will be a cmov, but two ternaries become a branch.
> but two ternaries become a branch

I need to use inline assembly because of this.