| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by kloch 1256 days ago

> if you add very big values to very small values, you can get inaccurate results (the small numbers get lost!)

There is a simple workaround for this:

https://en.wikipedia.org/wiki/Kahan_summation

It's usually only needed when adding billions of values together and the accumulated truncation errors would be at an unacceptable level.

2 comments

phkahler 1256 days ago

It can also come up in simple control systems. A simple low-pass filter can fail to converge to a steady state value if the time constant is long and the sample rate is high.

Y += (X-Y) * alpha * dt

When dt is small and alpha is too, the right hand side can be too small to affect the 24bit mantissa of the left.

I prefer a 16/32bit fixed point version that guarantees convergene to any 16bit steady state. This happened in a power conversion system where dt=1/40000 and I needed a filter in the 10's of Hz.

link

jbay808 1256 days ago

This is a very important application and a tougher problem than most would guess. There is a huge variety of ways to numerically implement even a simple transfer function, and they can have very different consequences in terms of rounding and overflow. Especially if you want to not only guarantee that it converges to a steady-state, but furthermore that the steady-state has no error. I spent a lot of time working on this problem for nanometre-accurate servo controls. Floating and fixed point each have advantages depending on the nature and dynamic range of the variable (eg. location parameter vs scale parameter).

link

kergonath 1256 days ago

> It's usually only needed when adding billions of values together and the accumulated truncation errors would be at an unacceptable level.

OTOH, it’s easy to implement, so I have a couple of functions to do it easily, and I got quite a lot of use out of them. It’s probably overkill sometimes, but sometimes it’s useful.

link