| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by dataflow 1895 days ago

There was a time when I thought (like you) that everybody should be using interval arithmetic, but then I came across a counterexample that convinced me I was wrong. I don't remember the precise example, but maybe the following will do the same for you.

Say x = 4.0 ± 1.0. What is x / x?

It should be x / x = 1.0 ± 0.0, but interval arithmetic will give you [3/5, 5/3].

Notice the interval is objectively wrong, as the result cannot be anything other than 1.0. Now imagine what happens if you do this a few more iterations. Your interval will diverge to (0, +∞), becoming useless.

The moral of the story (which may be more obvious in hindsight): interval arithmetic is a local operation; error analysis is a global operation. Naturally the former cannot substitute for the latter.

3 comments

generalizations 1895 days ago

Your example only proves your point if every instance of x is the same x, with the same objective value. i.e., what if x/x is actually (x=5.0)/(x=3.0) ?

link

Dylan16807 1895 days ago

x can't be two values at the same time. It's one instance, not a class.

If you have "scope1.x/scope2.x" then they don't cancel out, but that's not "x/x"

And if you save a value for later, and change x, then that value isn't x anymore.

(I assume you're using = to state the value, not as an assignment operator.)

link

dataflow 1895 days ago

Then you're computing x/y and not x/x...

link

crazygringo 1895 days ago

Well, if it's built into the language and the compiler or interpreter knows both x's are the same variable it could optimize it away to 1.0 ± 0.0, so I don't see the problem. Or if the library works on objects passed by reference and can confirm them as the same object, it could do that too.

But if, in your code, you've copied x to y (not by reference), then it seems that x / y would correctly be [3/5, 5/3]. This is a feature, not a bug.

In any case, since intervals are more likely to be more like ± 0.000000000000001 when dealing with basic common non-iterative calculations, it doesn't seem like a problem in practice even if the compiler/interpreter doesn't optimize it away?

link

dataflow 1895 days ago

> and the compiler or interpreter knows both x's are the same variable it could optimize it away to 1.0 ± 0.0, so I don't see the problem

You're moving the goalposts here. Those are HUGE if's. You went from something you could trivially implement in any language to something that requires a LOT of infrastructure and will severely limit your options.

How are you going to handle (x - z) / (x + z) when z = 0? How are you going to handle f(x) / g(x) when they turn out to compute the same value in different ways?

You go from "I need to change float to interval<float>, give me 15 minutes" to "I need a computer algebra system, let me figure out if I can embed Mathematica/SymPy/Maple/etc. into program so I can do math." And even when you do that you STILL won't be able to handle cases where the symbolic engine can't simplify it for you. Which in general it won't be able to do.

> But if, in your code, you've copied x to y (not by reference), then it seems that x / y would correctly be [3/5, 5/3]. This is a feature, not a bug.

No, that is most definitely a bug. The correct result is 1.0, but you're producing [3/5, 5/3]. If that's not a bug to you then you might as well just output (-∞, +∞) everywhere and call it a day. You can insist on calling it a "feature" if it makes you feel better, but that won't change the fact that it's still just as useless (if not actively harmful) for your intended calculation as it was before.

Contrast this with just leaving it as a float instead of an interval, where you would've gotten the correct answer.

> In any case, since intervals are more likely to be more like ± 0.000000000000001, it doesn't seem like a problem in practice even if the compiler/interpreter doesn't optimize it away?

That's only after 1 iteration. Notice that in my example the error was multiplicative, not additive. Run more iterations and your error will magnify.

link

crazygringo 1895 days ago

I wasn't moving the goalposts, I was simply responding to your specific example.

But there's nothing "wrong" about it, it's correct -- that's how it's designed to work. If you're starting with non-extreme values and just miniscule floating-point errors and iterate 1,000 times with basic arithmetic I still don't see how it's going to cause a problem. E.g.:

  Math.pow((1 + Number.EPSILON), 1000) => 1.000000000000222

The result is the same even if you multiply in a loop 1,000 times rather than call Math.pow().

If you're multiplying a million or billion times then that's where I can now understand you should be an expert an numerical analysis in the first place to have any confidence in your result, and you know whether or not your float result can be trusted or not at all.

But the good thing is that if you don't know what you're doing, started with interval arithmetic and it ballooned to (-∞, +∞), then that's a strong signal you shouldn't necessarily be trusting the algorithm's results at all, and to go talk to someone with a background in numerical analysis, right?

Whereas if you iterate a million times and the interval is still miniscule compared to your values, you have absolute confidence you're fine. Seems useful to me -- not useless or actively harmful at all.

link

dataflow 1895 days ago

No, but I'm out of ideas for how to explain it any more clearly. If you really think this is all correct behavior and IA is going to make your life better, start using it in production and you'll learn it the hard way.

link

enriquto 1895 days ago

> If you really think this is all correct behavior and IA is going to make your life better, start using it in production and you'll learn it the hard way.

Even if this may come out as a bit snarky, this is a very good, honest, and helpful answer. At least, it would be helpful to extremely skeptical people like me. I'm a bit like crazygringo, and I cannot be convinced of anything until I have tried it myself and dirtied my hands with it.

To crazygringo: for a concrete, real-world example, try to implement a Kalman filter (a very simple numerical algorithm) using interval arithmetic. It simply doesn't work. You'll see that you cannot extract any useful result from the output of the computation. The implementation of an "interval Kalman filtering" is a (niche) subject of current research, where various very complicated algorithms are being invented to try to reproduce the nice properties of Kalman filtering--even when using something obviously inappropriate like interval arithmetic. For an intuitive understanding, assume that the input data are quantized to integer values, so that the starting intervals are of length 1.

link

crazygringo 1895 days ago

I guess I just don't see why it's worse than FP. I mean, you said:

> Contrast this with just leaving it as a float instead of an interval, where you would've gotten the correct answer.

But if I type into my console:

  var a = 0.1; var b = a + 0.2; b -= 0.2; b == a;

I get false. That's not correct. Whereas if "==" checked for overlap between intervals, it would be true. Which would be correct.

Heck, to use your own division example:

  (0.1 + 0.2 - 0.2) / 0.1 ==> 1.0000000000000002

That's not 1.0. So the FP you claim is so correct... just isn't at all. Again, while interval arithmetic would correctly detect overlap of intervals between that and 1.0.

Anyways, thanks for trying to explain.

link

dataflow 1895 days ago

Turns out Wikipedia has a decent article on this; you might want to check it out: https://en.wikipedia.org/wiki/Interval_arithmetic#Dependency...

link

Stupulous 1895 days ago

I think you run into the opposite issue, where equality is too likely and nonequal values return as equal. There might be a sweet spot with something like

(0.1 + 0.2 - 0.2) / 0.1 ==[+/- .001] 1.0 => True

link

dragonwriter 1895 days ago

> Say x = 4.0 ± 1.0. What is x / x?

> It should be x / x = 1.0 ± 0.0, but interval arithmetic will give you [3/5, 5/3].

This is the “dependency problem” which is eliminated in many cases, and mitigated in others, by rewriting so identical (vs. merely in components) values appear only once; when you might need a numerical answer at one point (if possible) but to use the value in further computations, this can be done by storing and manipulating values symbolically and extracting numerical results as needed.

link