Hacker News new | ask | show | jobs
by crazygringo 1895 days ago
Indeed, and therefore:

  0.1 + 0.2 != 0.3
You can check it in the JavaScript console.

This actually makes me wonder if anyone's ever attempted a floating-point representation that builds in an error range, and correctly propagated/amplified error over operations.

E.g. a simple operation like "1 / 10" (to generate 0.1) would be stored not as a single floating-point value, but really as the range between the closest representation greater than and less than it. The same with "2 / 10", and then when asking if 0.1 + 0.2 == 0.3, it would find an overlap in ranges between the left-hand and right-hand sides and return true. Every floating-point operation would then take and return these ranges.

Then floating point arithmetic could be used to actually reliably test equality without ever generating false negatives. And if you examined the result of calculation of 10,000 operations, you'd also be able to get a sense of how off it might maximally be.

I've search online and can't find anything like it, though maybe I'm missing an important keyword.

6 comments

Interval arithmetic is what you're looking for, and there's an IEEE standard and many implementations.
Thank you!!

Yes, that turns out to be exactly it [1]. Looks like there's even at least one JavaScript library for it [2].

It seems like such a useful and intuitive idea I have to wonder why it isn't a primitive in any of the common programming languages.

[1] https://en.wikipedia.org/wiki/Interval_arithmetic

[2] https://github.com/mauriciopoppe/interval-arithmetic

> It seems like such a useful and intuitive idea I have to wonder why it isn't a primitive in any of the common programming languages.

It is basically useless for numerical computation when you perform iterations. Even good convergent algorithms can diverge with interval arithmetic. As you accumulate operations on the same numbers, their intervals become larger and larger, growing eventually to infinity. It has some applications, but they are quite niche.

It is true that error intervals eventually go to infinity, but the very point is to keep them small enough to be useful throughout the calculation. IA is pretty bad for that but other approaches like affine arithmetic can be much better. (It still doesn't make an approachable interface for general programming languages though.)
But if the intervals are growing to infinity, then should you be trusting your result at all?

Are there really cases where current FP arithmetic gives an accurate result, but where the error bounds of interval arithmetic would grow astronomically?

It seems like you'd have to trust FP rounding to always cancel itself out in the long run instead of potentially accumulating more and more bias with each iteration. Is that the case?

Wouldn't the "niche" case be the opposite -- that interval arithmetic is the general-purpose safe choice, while FP algorithms without it should be reserved for those which have been mathematically proven not to accumulate FP error? (And would ideally output their own bespoke, proven, interval?)

> But if the intervals are growing to infinity, then should you be trusting your result at all?

Most often, yes; the probability distribution of your number inside that interval is not uniform, it is most likely very concentrated around a specific number inside the interval, not necessarily its center. After a few million iterations, the probability of the correct number being close to the boundary of the interval is smaller than the probability of all your atoms suddenly rearranging themselves into an exact copy of Julius Caesar. According to the laws of physics, this probability is strictly larger than zero. Would you think it "unsafe" to ignore the likelihood of this event? I'm sure you wouldn't, yet it is certainly possible. Just like the correct number being near the boundaries of interval arithmetic.

Meanwhile, the computation using classical floating point, typically produces a value that is effectively very close to the exact solution.

> It seems like you'd have to trust FP rounding to always cancel itself out in the long run instead of potentially accumulating more and more bias with each iteration. Is that the case?

The whole subject of numerical analysis deals with this very problem. It is extremely well known which kinds of algorithms can you trust and which are dangerous (the so-called ill-conditioned algorithms).

I suggest adding an example to demonstrate this effect because I don't think it's necessarily obvious for someone who hasn't already seen it.
With large integers - no, ie:

`Math.pow(2, 55)+1 === Math.pow(2, 55)` returning true

There was a time when I thought (like you) that everybody should be using interval arithmetic, but then I came across a counterexample that convinced me I was wrong. I don't remember the precise example, but maybe the following will do the same for you.

Say x = 4.0 ± 1.0. What is x / x?

It should be x / x = 1.0 ± 0.0, but interval arithmetic will give you [3/5, 5/3].

Notice the interval is objectively wrong, as the result cannot be anything other than 1.0. Now imagine what happens if you do this a few more iterations. Your interval will diverge to (0, +∞), becoming useless.

The moral of the story (which may be more obvious in hindsight): interval arithmetic is a local operation; error analysis is a global operation. Naturally the former cannot substitute for the latter.

Your example only proves your point if every instance of x is the same x, with the same objective value. i.e., what if x/x is actually (x=5.0)/(x=3.0) ?
Well, if it's built into the language and the compiler or interpreter knows both x's are the same variable it could optimize it away to 1.0 ± 0.0, so I don't see the problem. Or if the library works on objects passed by reference and can confirm them as the same object, it could do that too.

But if, in your code, you've copied x to y (not by reference), then it seems that x / y would correctly be [3/5, 5/3]. This is a feature, not a bug.

In any case, since intervals are more likely to be more like ± 0.000000000000001 when dealing with basic common non-iterative calculations, it doesn't seem like a problem in practice even if the compiler/interpreter doesn't optimize it away?

> Say x = 4.0 ± 1.0. What is x / x?

> It should be x / x = 1.0 ± 0.0, but interval arithmetic will give you [3/5, 5/3].

This is the “dependency problem” which is eliminated in many cases, and mitigated in others, by rewriting so identical (vs. merely in components) values appear only once; when you might need a numerical answer at one point (if possible) but to use the value in further computations, this can be done by storing and manipulating values symbolically and extracting numerical results as needed.

This feels like the central limit theorem. Accumulating a bunch of random variables is going to produce a normal distribution, and a normal distribution has an infinite interval.
Yeah, but the standard deviation goes by sqrt(n) for many operations, and there is significant autocorrelation. Interval arithmetic will give you worst-case bounds, which will quickly get fantastically pessimistic.

Numerical analysis was a field invented to give realistic error bounds.

Currently, either a programmer understands that floating point numbers are complicated beasts, or they don't, and get surprised.

With interval arithmetic, either a programmer would understand that floating point numbers are not actually numbers but intervals... or they wouldn't, and get surprised.

So I don't really see much upside. If you know that you need interval arithmetic, chances are that you're already using it.

Interval arithmetic certainly has its place. However, you don't find it used more often because a naive implementation results in intervals that are often uselessly huge.

Consider x in [-1, 1], and y in [-1, 1]. x*y is also in [-1,1], and x-y in [-2, 2]. But now consider actually that y=x. That's consistent, but our intervals could be smaller than what we've computed.

Sure, but wouldn't realistic intervals be more like x in [0.29999999999999996, 0.30000000000000004]?

I mean, intervals as large as whole numbers might make sense if your calculations are dealing with values in the trillions and beyond... but isn't the point of interval arithmetic to deal with the usually tiny errors that occur in FP representation?

That is overthinking it horribly.

The problem is that the user wants to write 1/10 and 2/20 and 3/10 but those numbers aren't really in the binary system.

The user gets some numbers (let's call them A, B and C) that aren't the same but they fool people at first because they not only deserialize as 0.1 but the they also serialize from 0.1. Trouble is that A + B != C but some other number.

Excel tries to hide it but the real answer is to keep the exponent in base 10 if you plan to read and write numbers like 137.036 or 9.1E-31. How the mantissa is doesn't matter, it could be base 7 for all I care -- it is just an integer.

Interval math is for much tougher problems like recursion of

  k * x * (1-x)
is easily proven to have periodic orbits of infinitely long period, but if you are using 32-bit floats you can't have a period longer than 4 billion. That kind of qualitatively difference means that there's no scientific value in iterating that function with floats, although you can do accurate grid samples with interval arithmetic.
Another way to get a feeling for the error (simpler than fully fledged interval arithmetic) is to toggle the rounding rules from the usual (towards even, or so) to up and down, and observe the change in result.

https://en.wikipedia.org/wiki/IEEE_754#Rounding_rules

That sounds interesting, but I would imagine it would become very complicated once you start applying nontrivial functions (discontinuous functions, for example). In that case the range of possible values could actually become discontinuous. I would imagine accounting for that is actually more computationally expensive than just using arbitrary precision decimals.
Yeah, you call tan() on that number, and suddenly your interval is like most of the number line. Actually, you don't even have to be that fancy: if the number is close to epsilon, the error bars on 1/x would be huge.
But isn't that a feature, rather than a bug? It prevents you from getting "false accuracy".
Yes, that's a feature. If you're using interval arithmetic and your result is an unreasonably large interval, then there's a good chance the algorithm in use is numerically unstable.
Sure, but what's the use case for mathematics where you don't know what side of an asymptote you're on?
>> if the number is close to epsilon, the error bars on 1/x would be huge.

> Sure, but what's the use case for mathematics where you don't know what side of an asymptote you're on?

Knowing which side of the asymptote you're on does not solve this problem or even ameliorate it.

That's not what I'm saying - it seems to me to be programmer error that such a question is being asked (an exception should be thrown regardless, since the result is not of a usable quality).
> Then floating point arithmetic could be used to actually reliably test equality without ever generating false negatives.

The flip side is that you generate plenty of false positives once your error ranges get large enough. This happens pretty readily if you e.g. perform iterations that are supposed to keep the numbers at roughly the same scale.

But what you would actually get is something like this:

   x---0.1 + 0.2 ---x
      x---0.3---x
That is, the range of 0.1 + 0.2 would be wider than the range of 0.3. And now what do you do? There is overlap, so are they equal? But there are parts that don't overlap, so are they different?
Make equality checks illegal, and instead define specific operations for contains and overlaps.
Well right now you basically can't ever check for equality with floating-point arithmetic and trust that two numbers that should intuitively be equal are reported as equal.

For me, floating-point equality would be if there are any parts that overlap. Basically "=" would mean "to the extent of the floating-point accuracy of this system, these values could be equal".

If you're doing a reasonably limited number of operations with values reasonably larger than the error range, then it would meet a lot of purposes -- you can add 0.5 somewhere in your code, subtract 0.5 elsewhere, and still rely on the value being equal to the original.

You cannot compare floating point numbers like that.

The equality test in floating point numbers is comparing against the epsilon.

    Math.abs(0.3 - (0.1 + 0.2)) < Number.EPSILON
Which is the same you other languages.

Using the epsilon for comparison is not mentioned in the article. Floating point absorption is also not mentioned in the article.

This entire discussion and the fact this is on the front page of HN is pretty disappointing and sad.

Is this really a surprise for you? if it is... have you ever implemented any logic involving currency? You may want to take another look at it.

Well, that was trivially easy to disprove with a little jiggling of numbers in the console:

  Math.abs(1.8 - (0.1 + 0.2 + 0.9 + 0.6)) < Number.EPSILON
returns false.

Also, you generally really shouldn't be implementing any currency logic using floating point numbers, yikes. Stick to integers that represent the value in cents, or tenths of cents, or similar. Or, even better, a DECIMAL data type if your platform supports it.

I genuinely hope you've never written financial software that judges if the results of two calculations are equal via the method you've described.

I never said you should use floating-point numbers for currency. I said that if you are not aware of the shortcomings of floating-point numbers (the only numeric type in JavaScript, not counting workarounds like typed arrays) you should not be working with values representing currencies until you do.

Fixed-point numbers (aka "decimal" type in Java, C# and others) are the preferred way to deal with this problem as I mentioned in this same discussion hours ago.

Now, in the example you presented, you have a round-off error amplication problem where the round-off error grows larger than the epsilon. You can avoid that using the Kahan summation algorithm.

https://en.wikipedia.org/wiki/Kahan_summation_algorithm

    const arr = [0.1, 0.2, 0.9, 0.6];
    let sum = 0.0;
    let c = 0.0;
    for(let i = 0, l = arr.length; i < l; i++) {
        y = arr[i] - c;
        t = sum + y;
        c = (t - sum) - y;
        sum = t;
    }
Now you evaluate sum, and it's 1.8 as expected.
^ And of course a comment mentioning the correct way of doing things got downvoted.
rounding error after multiple operations can be more than just epsilon
That's why there are algorithms to do those multiple operations.
> have you ever implemented any logic involving currency? You may want to take another look at it.

Floating point arithmetic is good enough for science, should be good enough for commerce too, no? Why is commerce special?