by Harmohit 717 days ago
This article does a great job at explaining interval arithmetic. However, the introduction says

>Instead of treating each as exactly 7 feet, we can instead say that each is somewhere between a minimum of 6.9 feet and a maximum of 7.1. We can write this as an interval (6.9, 7.1).

Yes we can use an interval to express an uncertainty. However, uncertainties in physical measurements are a little bit more complicated.

When I measure something to be 7 plus or minus 0.1 feet, what I am saying is that the value of the measured variable is not known for sure. It can be represented by a bell curve centred on 7, with 95% of the area under the curve lying between 6.9 and 7.1 (that is, a 95% probability that the true value lies in that range). The value of the measured variable is much more likely to be 7 than 6.9. There is also a small chance that the value lies outside of the 6.9 to 7.1 range.
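To put rough numbers on that picture, here is a minimal sketch using Python's standard library. It assumes the error is exactly normal and that the ±0.1 is a two-sided 95% bound; both are assumptions for illustration, not something the article states.

```python
from statistics import NormalDist

# Assumption: +/-0.1 ft is a two-sided 95% bound on a normally distributed error.
z95 = NormalDist().inv_cdf(0.975)   # about 1.96 standard deviations
sigma = 0.1 / z95                   # standard deviation implied by the bound

measurement = NormalDist(mu=7.0, sigma=sigma)

# Probability that the true value lies inside, and outside, the quoted range.
p_inside = measurement.cdf(7.1) - measurement.cdf(6.9)
p_outside = 1.0 - p_inside

# How much more probable values near 7.0 are than values near 6.9.
density_ratio = measurement.pdf(7.0) / measurement.pdf(6.9)

print(f"P(6.9 <= x <= 7.1) = {p_inside:.3f}")
print(f"P(outside)         = {p_outside:.3f}")
print(f"pdf(7.0)/pdf(6.9)  = {density_ratio:.2f}")
```

Under these assumptions the density at 7.0 is several times the density at 6.9, which is the "much more likely" in the paragraph above.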

In an interval, there is no probability distribution. It is more like an infinite list of numbers.

In practice, interval arithmetic is seldom used for uncertainty analysis for scientific experiments.

7 comments

To close the loop: The connection is called an alpha-cut.

In the Gaussian case it would cut the normal distribution horizontally at a defined height. The height is defined by the sigma or confidence you want to reflect.

The length of the cut, or equivalently the interval on the support, is how you connect probability and intervals.
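A small sketch of that cut, assuming the bell curve is rescaled so its peak has height 1 and is then treated as a membership function. The function names are made up for illustration.

```python
import math

def gaussian_alpha_cut(mu, sigma, alpha):
    """Interval where exp(-(x - mu)^2 / (2 sigma^2)) >= alpha, for 0 < alpha <= 1."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    # Solve exp(-(x - mu)^2 / (2 sigma^2)) = alpha for the distance from mu.
    half_width = sigma * math.sqrt(-2.0 * math.log(alpha))
    return (mu - half_width, mu + half_width)

def alpha_for_k_sigma(k):
    """Cut height (relative to the peak) whose cut is exactly mu +/- k * sigma."""
    return math.exp(-0.5 * k * k)

# Assumption: 7 +/- 0.1 is a 1.96-sigma (roughly 95%) statement.
sigma = 0.1 / 1.96
alpha = alpha_for_k_sigma(1.96)
cut = gaussian_alpha_cut(7.0, sigma, alpha)
print(f"alpha = {alpha:.3f}, cut = ({cut[0]:.3f}, {cut[1]:.3f})")
```

A lower cut height gives a wider interval, so choosing the confidence level and choosing the height are the same decision.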

It's possible to use Gaussian variables with Gaussian error propagation. For an implementation see https://gvar.readthedocs.io/en/latest/ which is critical for the lsqfit library https://lsqfit.readthedocs.io/en/latest/

In gvar everything is normally distributed by default, but you can add other distributions with add_distribution; log-normal is provided, for example. You can also specify the covariance matrix between a set of values, which will be correctly propagated.
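To show why the covariance matters, here is a hand-rolled sketch of first-order Gaussian error propagation for a linear combination. This is plain Python, not gvar's API; the helper name and the 0.9 correlation are invented for the example.

```python
import math

def linear_sdev(coeffs, cov):
    """Standard deviation of sum_i coeffs[i] * x_i, given the covariance matrix of x."""
    n = len(coeffs)
    var = sum(coeffs[i] * cov[i][j] * coeffs[j] for i in range(n) for j in range(n))
    return math.sqrt(max(var, 0.0))  # guard against tiny negative rounding error

sd = 0.1    # standard deviation of each measurement
rho = 0.9   # assumed correlation between the two measurements

independent = [[sd**2, 0.0], [0.0, sd**2]]
correlated = [[sd**2, rho * sd**2], [rho * sd**2, sd**2]]

sum_indep = linear_sdev([1, 1], independent)    # x + y, uncorrelated
diff_indep = linear_sdev([1, -1], independent)  # x - y, uncorrelated
diff_corr = linear_sdev([1, -1], correlated)    # x - y, strongly correlated

print(f"x + y (independent): +/-{sum_indep:.4f}")
print(f"x - y (independent): +/-{diff_indep:.4f}")
print(f"x - y (correlated):  +/-{diff_corr:.4f}")
```

With strongly correlated inputs most of the error cancels in the difference, which plain interval subtraction cannot express.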

It's hard for me to understand the goal of this comment. Nothing in it is incorrect. It's also not really a meaningful critique or response to the article. The article did not attempt to describe "uncertainty analysis for scientific experiments". It blatantly began by describing interval arithmetic and ended by justifying it as being meaningful in two contexts: IEEE floating point numbers and machining tolerances. Neither is an experimental domain, and both have a meaningful inbuilt notion of interval that would not be served by treating intervals as Gaussians.
Gaussian distributions are a horrible choice for representing measurement uncertainty. If the tool is properly calibrated, 100% of the probability mass will be within (6.9, 7.1). A normal distribution would have probability mass in negative numbers!

There's also no motivation for choosing a normal distribution here - why would we expect the error to be normal?

If the error is the sum of many little errors, as it often is in mechanical assemblies, it's approximately normal due to the central limit theorem.
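A quick simulation of that claim, assuming 12 independent part errors, each uniform within ±0.05. The part count, the width, and the seed are arbitrary choices for the sketch.

```python
import math
import random

rng = random.Random(0)   # fixed seed so the run is repeatable
n_parts = 12             # assumed number of stacked parts
half_width = 0.05        # each part's error is uniform in [-0.05, 0.05]
n_trials = 20_000

totals = [
    sum(rng.uniform(-half_width, half_width) for _ in range(n_parts))
    for _ in range(n_trials)
]

# A uniform error on [-a, a] has variance a^2 / 3; variances of independent errors add.
sd_theory = math.sqrt(n_parts * half_width**2 / 3)

# If the total were exactly normal, about 95% would fall within 1.96 sd.
frac_within = sum(abs(t) <= 1.96 * sd_theory for t in totals) / n_trials

# Interval arithmetic would instead report the worst-case stack-up.
worst_case = n_parts * half_width

print(f"sd of total             = {sd_theory:.3f}")
print(f"fraction within 1.96 sd = {frac_within:.3f}")
print(f"worst-case bound        = +/-{worst_case:.2f}")
```

The worst-case bound is several standard deviations wide here, so the interval is never wrong but is rarely approached in practice.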
True, but that’s not how most sensors actually work. For example consider a weighing scale. If it says 10.1kg, why would we use a normal distribution?
What I hear is that similar techniques should/could be used by explicitly modeling it not as an interval (6.9, 7.1) but as a gaussian distribution of 7±0.1, and a computer can do the arithmetic to see what the final distribution is after a set of calculations.
You could use intervals to prove the codomain of a function, given its domain is an interval, using the same arithmetic.

Would actually be useful in programming as proving what outputs a fn can produce for known inputs - rather than use unit tests with fixed numerical values (or random values).
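A minimal sketch of the idea, using a toy Interval class written for this comment. It ignores floating-point rounding, so a real proof would need outward rounding or a proper interval library.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        # The extremes of a product always occur at endpoint combinations.
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))

    def contains(self, x):
        return self.lo <= x <= self.hi

def f(x):
    # Works for plain floats and for Interval, since both support * and -.
    return x * x - x

domain = Interval(0.0, 2.0)
bound = f(domain)  # encloses every value f can take on the domain
print(bound)

# Spot check: sampled outputs all fall inside the computed bound.
samples = [f(2.0 * k / 100) for k in range(101)]
print(all(bound.contains(y) for y in samples))
```

The bound is guaranteed to contain the true range but can be loose: x appears twice in f, and interval arithmetic treats the two occurrences as independent.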

There is no reason to assume a normal distribution. If you have a tool that measures to a precision of 2 decimal places, you have no information about what the distribution of the third decimal place might be.
This is correct, which is why intervals don't choose an interpretation of the region of uncertainty.

If you do have reason to interpret the uncertainty as normally distributed, you can use that interpretation to narrow operations on two intervals based on your acceptable probability of being wrong.

But if the interval might represent, for example, an unknown but systematic bias, then this would be a mistake. You'd want to use other methods to determine that bias if you can, and correct for it.
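A rough sketch of that narrowing for addition. It assumes the two errors are independent and normal, and that both half-widths are quoted at the same confidence level; a shared systematic bias breaks the independence assumption. The function names are made up.

```python
import math

def add_worst_case(a, b):
    """Plain interval addition: guaranteed to contain the true sum."""
    return (a[0] + b[0], a[1] + b[1])

def add_assuming_normal(a, b):
    """Sum at the same confidence level as the inputs, combining half-widths in quadrature."""
    mid = (a[0] + a[1]) / 2 + (b[0] + b[1]) / 2
    half = math.hypot((a[1] - a[0]) / 2, (b[1] - b[0]) / 2)
    return (mid - half, mid + half)

x = (6.9, 7.1)
y = (6.9, 7.1)

wide = add_worst_case(x, y)
narrow = add_assuming_normal(x, y)
print(f"worst case:      ({wide[0]:.3f}, {wide[1]:.3f})")
print(f"assuming normal: ({narrow[0]:.3f}, {narrow[1]:.3f})")
```

The narrower result is only a statement at the chosen confidence level, so it trades the guarantee for tightness.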

> There is no reason to assume a normal distribution.

There absolutely is with sane assumptions about how any useful measurement tool works. Gaussian distributions are going to approximate the actual distribution for any tool that's actually useful, with very few exceptions.

Tools, yes. Processes, no.

When fabricating, we'll often aim for the high end of a spec so you have material remaining to make adjustments. Most of our measurements actually follow double-tail or exponential distributions.

I'm sorry but if I give you a measuring tape that goes to 2 decimal places and you measure a piece of wood at 7.23 cm, when you get a more precise tape you have no information at all about what the third decimal place will turn out to be. It could be anywhere between 7.225 and 7.235, there is no expectation that it should be nearer to the centre. All true lengths between those two points will return you the same 7.23 measurement and none are more likely than any other given what you know.
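A simulation sketch of this point. It assumes the true lengths vary over a range much wider than the tape's resolution (here a made-up normal spread of 0.5 cm around 7.2 cm) and looks only at pieces that read 7.23.

```python
import math
import random

rng = random.Random(1)   # fixed seed so the run is repeatable
resolution = 0.01        # the tape reads to 2 decimal places
reading = 7.23
target = round(reading / resolution)   # the reading in units of the resolution
n_bins = 5
counts = [0] * n_bins

for _ in range(500_000):
    true_length = rng.gauss(7.2, 0.5)  # assumed spread of true lengths
    # Keep only pieces whose rounded measurement is exactly 7.23.
    if math.floor(true_length / resolution + 0.5) != target:
        continue
    # Position of the true length inside [7.225, 7.235), split into equal bins.
    offset = true_length - (reading - resolution / 2)
    bin_index = min(int(offset / resolution * n_bins), n_bins - 1)
    counts[bin_index] += 1

total = sum(counts)
fractions = [c / total for c in counts]
print(total, [f"{frac:.3f}" for frac in fractions])
```

Under that assumption each fifth of the 7.225 to 7.235 range gets roughly the same share, i.e. the residual looks flat rather than bell-shaped.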
I'm not sure why you are being downvoted - this is absolutely true.