by taeric 7 days ago
This depends entirely on the sizes needed. For many of the things people would use fixed point for, I would be surprised if you couldn't get good speed with very little effort.

I would be surprised if you could get even one third of the performance of floats. So why bother?
If you don't have to do trig, I'd be surprised if you aren't faster by default, oddly. Indeed, if you are just adding and subtracting, it is just a number. If you are doing multiplication, it is a multiply and shift. So long as you don't try and support massive numbers of different fixed sizes, that shift is almost certainly still cheaper than float hardware. (Indeed, a lot of multiplications wouldn't even need the shift...)
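
To make the "multiply and shift" point concrete, here is a minimal sketch assuming a Q16.16 layout (16 fractional bits). The layout and the names `FRAC_BITS`, `to_fixed`, `fx_add`, and `fx_mul` are made up for illustration, not taken from any particular library.

```python
FRAC_BITS = 16            # assumed Q16.16 layout, chosen for illustration
ONE = 1 << FRAC_BITS      # fixed-point representation of 1.0

def to_fixed(x: float) -> int:
    """Convert to fixed point, rounding to the nearest representable step."""
    return round(x * ONE)

def to_float(f: int) -> float:
    return f / ONE

def fx_add(a: int, b: int) -> int:
    # Addition is a plain integer add: both operands share the same scale.
    return a + b

def fx_mul(a: int, b: int) -> int:
    # The raw product carries 2 * FRAC_BITS fractional bits, so shift back down.
    return (a * b) >> FRAC_BITS

a = to_fixed(1.5)
b = to_fixed(2.25)
print(to_float(fx_add(a, b)))  # 3.75
print(to_float(fx_mul(a, b)))  # 3.375
```

Python integers are unbounded, so this only shows the arithmetic; it says nothing about cycle counts on real hardware.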

Again, I do not mean this as a criticism of floats. For simulations and for numbers where you do have to support completely arbitrary values, there is a reason floats are a thing.

An integer add, sub, or shift is 1 cycle of latency; integer mul is generally 3 cycles; integer div is lol-that-is-slow. Floating-point adds, subs, muls, and fmas are generally 4 cycles, with div being lol-that-is-slow (but generally faster than integer division because your divisor and dividend have fewer bits).

So fixed-point addition and subtraction are definitely faster, multiplication is a wash if you're doing binary-based fixed point (but slower if you're doing decimal-based fixed point), and fixed-point division is definitely slower than floating-point division.
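
A rough sketch of why the binary and decimal cases differ: with a power-of-two scale the rescale after a multiply is a shift, while with a power-of-ten scale it is an integer division (or whatever multiply-and-shift sequence a compiler substitutes for a constant divisor). The scales and function names below are assumptions picked for the example.

```python
BIN_FRAC_BITS = 8      # binary scale: 2**8 = 256 steps per unit (assumed)
DEC_SCALE = 100        # decimal scale: two decimal places (assumed)

def mul_binary(a: int, b: int) -> int:
    # Rescale with a shift.
    return (a * b) >> BIN_FRAC_BITS

def mul_decimal(a: int, b: int) -> int:
    # Rescale with an integer division by the power-of-ten scale.
    return (a * b) // DEC_SCALE

# 1.5 * 2.5 in both representations
print(mul_binary(384, 640) / 256)    # 3.75  (384 = 1.5 * 256, 640 = 2.5 * 256)
print(mul_decimal(150, 250) / 100)   # 3.75  (150 = 1.5 * 100, 250 = 2.5 * 100)
```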

Kudos on providing the numbers. I wasn't confident in the numbers that I remembered and with how pipelined everything is, I didn't know how much to lean on them. Not exactly my standard workflow to care about this level of speed.

My gut would still be that it is typically a wash for most everyone as far as speeds go? If default libraries supported it more directly, I would think it would largely be a win for a lot of reasoning. In particular, silly stuff like 1e32 + 1e1 would not be nearly as surprising to most people. And the entire class of bugs around stuff like doing something until it reaches 0.9 would almost certainly go away if we guaranteed precision to a set number of decimal places.
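
Both surprises are easy to reproduce with ordinary doubles. The sketch below compares them with a decimal fixed-point counter; the one-decimal-place scale and the variable names are assumptions for the example.

```python
# Float absorption: 10 is far below the spacing between doubles near 1e32.
print(1e32 + 1e1 == 1e32)   # True

# Accumulating 0.1 in binary floating point does not land exactly on 0.9.
x = 0.0
for _ in range(9):
    x += 0.1
print(x == 0.9)             # False

# Same loop in decimal fixed point with one decimal place (assumed scale of 10).
step = 1                    # 0.1 stored as the integer 1
target = 9                  # 0.9 stored as the integer 9
y = 0
for _ in range(9):
    y += step
print(y == target)          # True
```

Note that a 64-bit fixed-point type could not hold 1e32 at all, so the first case would show up as a range problem rather than a silent no-op.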

Alas, default libraries do not support this, though. So the above is admittedly wishful thinking on my end. And I could as easily describe a world where people insist on arbitrary numbers of fixed point values and how that would be its own set of landmines.

I very commonly see fixed-point libs be slower than floating-point (assuming floating-point hardware is available --- soft-float is slow as hell compared to fixed-point, of course).

Commonly enough that I think there's some fundamental reason (given available/current hardware as opposed to hypothetical).

Two reasons I can think of, though granted, they only apply in certain niches:

1) Floating-point SIMD extensions are far more common than integer ones. This means you can compute N (often N=4 or 8) float operations in one instruction, vs 1 integer operation.

2) For any GPGPU processing: GPUs far prefer floats, to the point where you didn't even use to have integers available (somewhat ironically, the platforms that prefer float much more strongly to int are mobile/embedded ones nowadays --- which is the exact opposite of the CPU situation). To this day, you have a `mul24` intrinsic for integer multiplication in some languages ... which converts two integers to floats, multiplies them as floats, and then converts back to integers. Yes, that was faster than direct multiplication. I'm sure many GPUs do it directly nowadays though.

It's also worth considering that a typical fixed-point multiply (as opposed to integer) is an integer multiply followed by a shift; often to a 2×-bit intermediate, if you want to preserve precision. That's a cost.
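
A small sketch of the intermediate-width issue, simulating 32-bit registers with a mask. It assumes unsigned Q16.16 values to keep the masking simple; `mul_narrow` and `mul_wide` are hypothetical names.

```python
FRAC_BITS = 16
ONE = 1 << FRAC_BITS
MASK32 = 0xFFFFFFFF

def mul_narrow(a: int, b: int) -> int:
    # Product truncated to 32 bits before the shift, as if no wider register existed.
    return ((a * b) & MASK32) >> FRAC_BITS

def mul_wide(a: int, b: int) -> int:
    # Keep the full 64-bit product, shift, then narrow the result to 32 bits.
    return ((a * b) >> FRAC_BITS) & MASK32

a = 3 * ONE   # 3.0
b = 5 * ONE   # 5.0
print(mul_wide(a, b) / ONE)    # 15.0
print(mul_narrow(a, b) / ONE)  # 0.0
```

The narrow version throws away exactly the bits that hold the integer part of the result, which is why the wider intermediate is usually not optional.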

Apologies for not responding earlier. I am honestly fine with where this conversation ended and was worried about it never ending. That said, I also find it fun to discuss. :D Conflicting feelings! (To that end, fully understood if this is fully dropped, now.)

I do not at all contest this. Would largely expect it. It is a common enough trap that people introduce complications to something expecting it to be faster.

I do expect that the largest reason, in this case, is simply volume/network effects. More people lean on floats than on fixed decimal. Therefore, it is not that surprising that more optimization has happened there. This is exactly why I somewhat lament there is not more fixed point work out there. My assertion is if it was more standard, there would be more standard optimizations.

For that last point, I would still expect you could pick constants for multiplication so that you didn't have to do the shifts as often. You could even prefer classes of numbers for this: if you have a set of X values that you often multiply with Y values, and you can limit yourself so that no Y value has a fractional part, then X*Y is always just an integer multiplication.
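
As a sketch of that idea, assuming Q16.16 for the X values and plain integers for the Y values (the names and the price/quantity framing are invented for the example):

```python
FRAC_BITS = 16
ONE = 1 << FRAC_BITS

def mul_fixed_by_fixed(x: int, y: int) -> int:
    # General case: both operands are scaled, so the product needs a shift.
    return (x * y) >> FRAC_BITS

def mul_fixed_by_int(x: int, n: int) -> int:
    # n is a plain integer (scale 1), so the product already has x's scale.
    return x * n

price = round(2.5 * ONE)   # 2.5 in Q16.16
qty = 4                    # a whole number, stored unscaled

print(mul_fixed_by_int(price, qty) / ONE)          # 10.0
print(mul_fixed_by_fixed(price, qty * ONE) / ONE)  # 10.0
```

Both paths give the same answer; the first one simply never needs the rescale.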

Now, I also fully grant that systems start to crash when someone didn't realize exactly why no Y values had a decimal. They decide they really need one, and then update the code with a massive performance hit that they didn't even pay attention to.

Add/Sub isn't 1 cycle if you want overflow handling. One of the really nice parts about float is that it loses precision gradually.
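
To illustrate the contrast, here is a sketch of a bare 32-bit add, a saturating add that does extra compare-and-clamp work, and a double quietly dropping a low-order unit instead of overflowing. The helper names are hypothetical and the 32-bit width is an assumption.

```python
INT32_MAX = 2**31 - 1
INT32_MIN = -2**31

def wrapping_add(a: int, b: int) -> int:
    # What a bare 32-bit add does on overflow: wrap around in two's complement.
    s = (a + b) & 0xFFFFFFFF
    return s - (1 << 32) if s > INT32_MAX else s

def saturating_add(a: int, b: int) -> int:
    # Overflow handling: clamp to the representable range after the add.
    return max(INT32_MIN, min(INT32_MAX, a + b))

print(wrapping_add(INT32_MAX, 1))     # -2147483648
print(saturating_add(INT32_MAX, 1))   # 2147483647

# A double degrades instead: past 2**53 it can no longer represent every integer.
print(2.0**53 + 1.0 == 2.0**53)       # True
```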