Hacker News new | ask | show | jobs
by huachimingo 1471 days ago
Which one is faster? (C code) Return: see if abs(num) > x.

/logical comparison/ int greater_abs(int num, int x){ return (num > x) || (num+x < 0); }

/squared approach/ int greater_abs2(int num, int x){ return num*num > x; }

See it by yourself, with (and without) optimizations: https://godbolt.org/

What would happen if x is a compile-time constant?

6 comments

Math & logic are rarely the bottleneck over memory & allocation bottlenecks, right? Does Godbolt assume x86? Does the answer change depending on whether you’re using an AMD or NVIDIA GPU, or an Apple, ARM or Intel processor? Does it depend on which instruction pipelines are full or stalled from the surrounding code, e.g., logic vs math? Hard to say if one of these will always be better. There are also other alternatives, e.g. bitmasking, that might generate fewer instructions… maybe “abs(num) > x” will beat both of those examples?
> Does Godbolt assume x86?

Godbolt uses whatever compilers, targets, and optimisations you ask it to.

It is, in fact, a very useful tool for comparing different compilers, architectures, and optimization settings.

Ah, yes, thank you, it looks awesome! It doesn't have GPU options (which is fair, since this is standard C++). But I see now I could have figured out the answer to my question in just a few seconds. :)
Compiler Explorer does have GPU options, actually. It has well-established CUDA support and I have been meaning to help add HIP support.
Oh sweet, I missed it then! Where do you go to switch to CUDA?
In the language tab, you can just select "CUDA C++" (or OpenCL C for that matter) instead of C++ or Rust.
I don't know which one is faster, but I know that one is not correct (squared approach).
Wouldn't the second one also potentially be incorrect due to overflow?
Yes. Suppose that both numbers are positive, that x>num, and that x+num is bigger than INT_MAX. In that case we hit signed integer overflow, which is undefined behavior. If signed integer overflow happens to wrap around, which it might, then the result could be negative and the function would return the wrong result. Or anything else could happen; undefined behavior is undefined.

In practice, just writing "abs(num) > x" gives quite good machine code, and it does so without introducing hard-to-see bugs.

Depends. In the first it will depend on the branch predictor which will depend on the relative expected magnitudes of num and x

In the 2nd, which I assume should be

    { return num*num > x * x; }
then it depends on the micro-arch, as it's one basic block so no branches and assuming a deep pipeline on x64, one multiplier (pipelined), probably this is faster for 'random-ish' num and x.
Your squared approach is wrong: greater_abs2(3, 4) returns true but should return false.
They probably meant num * num > x * x
Which is still wrong since num = 3, x = -4 should be true.
No idea.

How frequently am I calling `abs`?

If you write math heavy code, probably a lot more than if you're writing web apps. Depends on what kind of software you write.
Got it.

Well if that were the case, I’d use a profiler to see if spending time on optimizing ‘abs’ would realistically be worth it.

Most questions like these have no answer because if any of the parameters is known (which it usually is) it’ll get folded away to nothing.