Hacker News new | ask | show | jobs
by rm999 2651 days ago
Most of the issues discussed in the article are not from limits in precision, they're from overflows that arise from limits in the exponential range. Finding the average of 8e+307 and 8e+307 is a "low precision" problem, and even naive methods to avoid overflow will not hit limits on precision in this case (e.g. 8e+307/2 + 8e+307/2).

You're right that there are issues with precision and floating point math (the article touches on this a little), but the implementation is definitely wrong if it promises to work like in the case of Python's statistics library (which was correctly using the fractions library to store arbitrary numbers, but converted to a fraction in the wrong order).

1 comments

Sorry, I should’ve been more clear: when I said “precision” I meant any of several things related to floating point arithmetic including, but not limited to: actual numerical precision of operations, non-associativity of operations, overflows (which affects the latter case as well), etc. (Which is why I mention dynamic range in the GP.)

The point is that there is an “unwritten contract” between the user and the library: if we want any kind of performance, then we have to promise that the inputs are “well behaved” (in a very specific sense that varies from algorithm to algorithm). The user knows this and so does the developer, because otherwise most algorithms will spend most of the actual compute time converting problems to “fully general forms” that work on every input, but are hundreds if not thousands of times slower than the original.[0]

The point is that most people want “mean” to be fast, and a simple implementation of mean will work on almost every case, without needing to resort to arbitrary precision fractional arithmetic. The claim I make above is that the latter thing is almost never what people need, because if they did then almost every numerical algorithm is doomed from the start, whenever they perform operations that are more complex than “take the mean.” Now, if they really need this to be done for the huge numbers given (or whatever the case is), then there are libraries that will do this at a huge cost in performance. (But, if they really did need it—for example, when a problem cannot be rewritten in a simple way that would work with the available implementations—my bet is that this user would know).

Also apologies for mistakes, I’m currently on my phone walking to a meeting.

——

[0] that isn’t to say that if you can write a fast and general version, that you shouldn’t! Just that it usually incurs some overhead that is often unacceptable (see, e.g. using arbitrary precision fractions for Gaussian elimination).

Great explanation and detail. If one faces a badly conditioned problem, one must think about the error inherent to the inputs. Anything measured from real sensors is already corrupted by that perturbation matrix. Even in some perfect scenario like a model generated by CAD, the corresponding physical object is not going to match.

Numerical stability feels like this black art that programmers are scared of, but I maybe it's less scary if you explain the core problem: nothing in the world is a real number, it's always a probability distribution.

> using arbitrary precision fractions for Gaussian elimination

Here's a way to get exact results from just standard precision, say, integers 32 bits long:

For Gauss elimination, that's for solving a system of linear equations, say, m equations in n unknowns. So, we are given, in floating point, a matrix m x n A, an unknown vector 1 x n x, and a right side constant 1 x m b. So we are looking for x so that

Ax = b

Now, multiply each equation by an integer, say, a power of 2, so that all the numbers in A and b are integers.

Next get a list of positive prime numbers, each, say, stored in an integer 32 bits long. Say we have p such prime numbers; so we have prime number for i = 1, 2, ..., p.

Next for each i, solve Ax = b by using Gauss elimination but in the field of integers modulo prime number i. For division, use the Euclidean greatest common divisor algorithm. Yes for this arithmetic have to be able to form the 64 bit sum or product of two 32 bit whole numbers and then divide by 32 bit integer and keep the 32 bit remainder -- commonly we write a little assembler routine for this.

After doing that work p times (could be done in parallel), we use the Chinese remainder theorem to put together the rational numbers that are quotients of multi-precision whole numbers of the solution. With those, we can get floating point approximations as accurate as we please.

But if want to work in floating point arithmetic anyway, then there are three ways to do better:

(1) To start, in Gauss elimination, look down column 1 of A, find the row with the number of largest absolute value, and swap rows to put that number in row 1 and column 1. More generally after p - 1 rows, will want a pivot element in row p and column p. By swapping rows, use the number largest in absolute value in column p and rows p, p + 1, ..., m.

(2) When form an inner product, say,

u(1)v(1) + u(2)v(2) + ... + u(q)v(q)

form each product in double precision and add the double precision numbers. If want to do a little better, then before adding, sort the numbers so that are adding the smaller numbers first. Or

"The main sin in numerical analysis is subtracting two large numbers whose difference is small.", etc.

(3) Once have x so that Ax ~ b, find difference d = b - Ax and then solve for e in Ae = d and replace x by x + e so that A(x + e) = Ax + Ae = (b - d) + d = b. So, take x + e as the solution. After Gauss elimination, solving Ae = d can go relatively quickly. Can do this step more than once.