Hacker News new | ask | show | jobs
by blackhole 4781 days ago
This is a terrible idea. This is a catastrophically bad idea. How do you compare numbers? How do you figure out that 2.2 is less than exp(65)? You have to represent these as numbers at some point in the calculation, and in order to do that, you're probably going to be using either floating point or fixed point, which means it's still trivially possible to construct an error case that suffers the exact same problems normal floating point numbers have. Observe:

    x=1
    for(i=0;i<n;++i)
      x=0.01+sqrt(x)
You can't simplify this, so it will simply loop until it hits the datatype boundary, and then get rounded into oblivion, because the underlying floating point representation will break in the exact same way. The only way to get rid of this would be to use an arbitrary number of bits for precision, in which case... just use an arbitrary precision library instead! This is EXACTLY WHY WE HAVE THEM.

Most equations that aren't trivial cannot be simplified. This datatype would only be useful for spoon-fed high school math problems. Furthermore, it costs so much CPU to perform you might as well just use an arbitrary precision library, which will probably be faster and be just as effective at preventing edge cases.

https://en.wikipedia.org/wiki/Arbitrary-precision_arithmetic

3 comments

At some point you _do_ have to use floating point (the whole point of numeric computations is to do things you can't do symbolically), however, this project might enable judicous use of symbolic manipulation in coordination with floating point. Holding off the computation until further in the program might have advantages: Automatic reordering of computations to help preserve precision, automatic symbolic manipulation to cancel out large terms, and perhaps automatic calculation of the amount of precision needed (eg if calculating "a < b", generate the digits of a and b until they differ). The programmer could specify exactly when symbolic manipulation is desired vs numeric.

The quadratic equation case he shows is pretty good. Say you want to solve ax^2 + bx + c = 0. In C you might write "(-b + sqrt(b * b - 4 * a * c))/(2 * a)" which has all the precision problems he described. However, in his library that code would be converted to an AST, then symbolically simplified given the variable values, and _then_ evaluted numerically. The intermediate symbolic step would be able to automatically account for the edge cases, eg if c was 0, so that the numeric step wouldn't fail. In this case, the symbolic solver would recognize that (+, (sqrt, (-, ( *, b, b), 0)), b) is exactly zero before the floating point computation occurs.

I often write scientific programs with the issues he describes, and I have also wished for something along these lines but kind of suspect it is impossible. I usually have to play in mathematica/on paper a bit before finding an algorithm that takes care of the numeric issues (yes, even when using multiprecision), and while it would be nice for it to happen automatically, my impression is that the algorithms I find on paper are nontrivial for a computer to come up with. In some cases, however, the computer could have done it for me. But I have a feeling that in the case precision becomes an issue, even if this library gets me 90% of the way, I would still want total manual control of how the calculations occur. But I still want him to try in case it works!

Hey, if he wants to make a symbolic algebra system, that's great! But don't go advertising it as a replacement for floating point, because it isn't.
Actually, it's a pretty interesting idea. He basically wants to keep track of the last n operations with every given number and take advantage of this history to try to minimize the floating point error. To compare 2.2 with exp(6.5) or anything else you just evaluate the expression, but taking advantage of possible reorderings or simplifications.

Your critique really boils down to:

Most equations that aren't trivial cannot be simplified.

but this is not true, just take a look at what Wolfram Alpha or Mathematica can do.

There are also numbers that can't _possibly_ be represented by this system using the elementary functions/operators, i.e. are "non-analytic".

This may actually not be a problem in most instances, however. It really depends on the use case, but outside of scientific computing/engineering design we're not usually doing computations to find roots of degree >= 5 polynomials.

OTOH, as pointed out by GP, it is easy to construct 'unsimplifiable' operations which after some time may grow too large.

My conclusions are, this certainly cannot be implemented on practical systems without some care -- it's not as straightforward as it looks at first -- and certainly not a better default as is right now.

If you have an equation in your program that can be effectively simplified, you should have simplified it before writing in the first place. The thing about equations that can be simplified is that 99.9% of the time, they can be simplified at compile time.

If you want to write a library to do it for you, great! But don't advertise it as a replacement for floating point number representations, because it isn't.

Hey, why have compilers, if you can just write hand-optimized assembly yourself!

Compilers aren't able to do similar optimizations, first of all the flow of data that finally leads to some computation might be non-trivial and I think most compilers deal well only with reasonably local (in space and in time) optimizations. Besides, compilers typically don't know much mathematics. If I have a function for taking a square root and a function for squaring the compiler most likely wont rewrite sqr(sqrt(x)) to just x. Certainly it wont rearrange a quadratic equation to minimize round-off.

You are basically totally dismissing an idea on very little evidence, which is typical of HN lately. It certainly isn't completely hopeless and trying it out will shed much more light on the problem than your know-it-all-in-advance comments not substantiated by any evidence (like the claim about not being able to simplify most expressions and now the claim of 99.9%).

He want more than compile-time simplification. He wants runtime simplification depending on the values encountered at runtime. For example in the quadratic equation.

And, he clearly isn't marketing it as a replacement for floating point. He says: "It is important to note, however, that the goal of the project is to make tools for symbolic calculations, not to create a viable alternative to the floating point." and he talks a lot about estimating numerical error and precision.

He says this about SymPy...
oops you're right. Nevertheless, he clearly intends to extend rather than replace floating point since one of the goals is to estimate the errors in the floating point calculations, and he rounds.
"You can't simplify this"

I have learned to be careful about making such claims.

http://mathworld.wolfram.com/NestedRadical.html

Yea but simplifications like this are probably non-computable, even when possible, and hard to do in any case.
Wrong, that isn't what this is. This isn't a nested radical to infinite terms, it's a nested radical to n terms. Thus, it cannot be simplified.