by kccqzy 623 days ago
I have worked on firmware that has plenty of fixed point arithmetic. The firmware usually runs on processors without hardware floating point units. For example, certain Tesla ECUs use 32-bit integers split into four bits of integer part and 28 bits of fractional part, so values are scaled by 2^28.
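
To make the 4.28 layout above concrete, here is a minimal sketch in C. The names (q4_28_t, q4_28_mul, and so on) are made up for illustration, and treating the value as signed (range [-8, 8)) is an assumption; the comment above does not say whether those ECUs use signed or unsigned values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical Q4.28 type: 4 integer bits (sign included, by assumption)
 * and 28 fractional bits, stored in a 32-bit integer. */
typedef int32_t q4_28_t;

#define Q4_28_FRAC_BITS 28
#define Q4_28_ONE ((q4_28_t)1 << Q4_28_FRAC_BITS) /* 1.0 scaled by 2^28 */

/* Convert a real value to Q4.28 by scaling by 2^28 (truncates toward zero). */
static q4_28_t q4_28_from_double(double x) {
    assert(x >= -8.0 && x < 8.0); /* representable range under the signed assumption */
    return (q4_28_t)(x * (double)Q4_28_ONE);
}

/* Convert back by dividing out the 2^28 scale. */
static double q4_28_to_double(q4_28_t v) {
    return (double)v / (double)Q4_28_ONE;
}

/* Multiply two Q4.28 values. The 64-bit product carries 56 fractional bits,
 * so shift right by 28 to return to Q4.28. No saturation is done here, and
 * the right shift of a negative value is assumed to be arithmetic, which is
 * what common compilers do. */
static q4_28_t q4_28_mul(q4_28_t a, q4_28_t b) {
    int64_t wide = (int64_t)a * (int64_t)b;
    return (q4_28_t)(wide >> Q4_28_FRAC_BITS);
}
```

Addition and subtraction are plain integer operations on the scaled values; only multiplication and division need the wider intermediate and the shift.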

>> The firmware usually runs on processors without hardware floating point units.

I'm working on control code on an ARM Cortex-M4F. I wrote it all in fixed point because I didn't trust an FPU to be faster, and I also like having a 32-bit accumulator instead of the 24 bits of precision a single-precision float gives you. I recently converted it all to floating point since we have the M4F part (the F indicates an FPU), and it's a little slower now. I did get to remove some limit checking, since I can rely on the calculations staying inside the limits, but it's still a little slower than my fixed point implementation.
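
The 32-bit versus 24-bit accumulator point can be illustrated with a small sketch. The helper names are hypothetical. It checks whether a small increment is lost when added to a large running total, which is what happens to a float accumulator once the total exceeds what a 24-bit significand can hold exactly.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>

/* Returns 1 if adding inc to acc leaves acc unchanged in single precision.
 * The volatile store forces the sum to be rounded to float even on targets
 * that evaluate expressions in a wider format. */
static int float_absorbs(float acc, float inc) {
    volatile float sum = acc + inc;
    return sum == acc;
}

/* Same check for a 32-bit integer accumulator: as long as the sum does not
 * overflow, a nonzero increment is never lost. */
static int int32_absorbs(int32_t acc, int32_t inc) {
    assert(inc <= 0 || acc <= INT32_MAX - inc); /* no positive overflow */
    assert(inc >= 0 || acc >= INT32_MIN - inc); /* no negative overflow */
    return (int32_t)(acc + inc) == acc;
}
```

With IEEE single precision, float_absorbs(16777216.0f, 1.0f) returns 1 because 2^24 + 1 is not representable and the sum rounds back to 2^24, while int32_absorbs(16777216, 1) returns 0.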

The other great thing about going fixed point is that it doesn't expose you to device-specific floating point bugs, making your embedded code way more portable and easier to test.

32b float on your embedded device doesn't necessarily match 32b float running on your dev machine.

32b float can match your desktop. It really just takes a few compiler flags (like avoiding -funsafe-math-optimizations), setting rounding modes, and not using the 80-bit Intel mode (largely disused after the 64-bit transition).
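
One way to check this in practice is to compare raw bit patterns between the target and the desktop instead of comparing printed decimals. A rough sketch, with hypothetical helper names; FLT_EVAL_METHOD is the standard C99 macro from <float.h> that reports whether intermediates are kept in a wider format.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Return the raw 32-bit pattern of a float so results from two machines
 * can be compared bit for bit (for example by logging them as hex). */
static uint32_t float_bits(float x) {
    uint32_t u;
    assert(sizeof(float) == sizeof(uint32_t)); /* assumes a 32-bit float */
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Returns 1 if float expressions are evaluated in float precision, i.e. no
 * hidden 80-bit or double intermediates (FLT_EVAL_METHOD == 0). */
static int evaluates_in_float(void) {
    return FLT_EVAL_METHOD == 0;
}
```

Running the same inputs through the same routine on both machines and diffing the float_bits output shows quickly whether the two builds really agree.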
I understand what you are saying ...

You aren't guaranteed that your microcontroller's float is going to match your desktop's. Microcontrollers are riddled with bugs. Unless you really need floats, and as long as fixed point is fast enough, my recommendation is still to use fixed point if the application is high reliability.

Especially if your code needs to be portable across ARM, RISC-V, etc.

Many microcontrollers today, including ARM, RISC-V, and Xtensa parts, have IEEE 754-compliant FPUs or libms available. Same numeric format, same rounding, same result.

Fixed point isn't bad at all, just often slower when a compliant FPU is available.

> IEEE compliant FPUs or libms available. Same numeric format, same rounding, same result.

IEEE only mandates results within ½ ULP (= best possible) for basic operations such as addition, subtraction, multiplication, division, and square root.

For many others, such as trigonometric, exponential, and logarithmic functions, results can (and do) vary between conforming implementations.

https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.h...:

“The IEEE standard does not require transcendental functions to be exactly rounded because of the table maker's dilemma. To illustrate, suppose you are making a table of the exponential function to 4 places. Then exp(1.626) = 5.0835. Should this be rounded to 5.083 or 5.084? If exp(1.626) is computed more carefully, it becomes 5.08350. And then 5.083500. And then 5.0835000. Since exp is transcendental, this could go on arbitrarily long before distinguishing whether exp(1.626) is 5.083500...0ddd or 5.0834999...9ddd. Thus it is not practical to specify that the precision of transcendental functions be the same as if they were computed to infinite precision and then rounded. Another approach would be to specify transcendental functions algorithmically. But there does not appear to be a single algorithm that works well across all hardware architectures. Rational approximation, CORDIC, and large tables are three different techniques that are used for computing transcendentals on contemporary machines. Each is appropriate for a different class of hardware, and at present no single algorithm works acceptably over the wide range of current hardware.”

Why did you decide to convert your code to floating point if your fixed point implementation was faster and already written?
Are there any good benchmarks for float vs fixed point, especially for ARM systems?
Just look at the instruction set for your particular CPU. Every CPU is different, but in most architectures I've seen, floating point operations are 2-3 times slower for the same word size.

Single float adds are usually 2 or 3 CPU cycles while single-word integer adds are usually 1 cycle.

Again, this is extremely dependent on the particular CPU you have. Some architectures do have single-cycle FPU operations, but it's not very common in microcontrollers as far as I can tell.
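
Since the answer depends so much on the part, measuring on your own target is probably the most reliable option. Below is a rough sketch of a harness with hypothetical function names, using a Q16.16 multiply-accumulate as a stand-in workload and the standard clock() for timing. On a Cortex-M you would more likely read a hardware cycle counter (for example the DWT cycle counter) than call clock(); that substitution is left out here.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Fixed point kernel: multiply-accumulate over Q16.16 inputs with a 64-bit
 * accumulator, then shift back to Q16.16. */
static int32_t mac_fixed(const int32_t *a, const int32_t *b, size_t n) {
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += (int64_t)a[i] * b[i];
    }
    return (int32_t)(acc >> 16);
}

/* Floating point kernel: the same multiply-accumulate in single precision. */
static float mac_float(const float *a, const float *b, size_t n) {
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++) {
        acc += a[i] * b[i];
    }
    return acc;
}

/* CPU seconds spent in reps calls of mac_fixed. The volatile sink keeps the
 * compiler from discarding the calls. */
static double time_fixed(const int32_t *a, const int32_t *b, size_t n, int reps) {
    volatile int32_t sink = 0;
    clock_t t0 = clock();
    for (int r = 0; r < reps; r++) {
        sink = mac_fixed(a, b, n);
    }
    (void)sink;
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}

/* CPU seconds spent in reps calls of mac_float. */
static double time_float(const float *a, const float *b, size_t n, int reps) {
    volatile float sink = 0.0f;
    clock_t t0 = clock();
    for (int r = 0; r < reps; r++) {
        sink = mac_float(a, b, n);
    }
    (void)sink;
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}
```

Compare time_fixed and time_float on buffers of the same length, built with the same optimization flags you ship with; numbers from a desktop say little about a microcontroller.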

That would vary wildly with the ARM chip you are talking about. I would say figure out which ARM you’re interested in and go down the rabbit hole from there.