Hacker News new | ask | show | jobs
by fgdelcueto 1810 days ago
They are indeed concise and beautiful, but some of the implementations I checked are quite naΓ―ve. They're great for academic purposes and for some simple problems they will work just fine, but I would advise against using them for real-world problems.

For instance, for some of the regression algorithms the pseudo-inverse of the matrix is used to solve the normal equations. Again, for small, well-behaved problems it is OK, but for larger scale, less well-behaved problems, it is actually dangerous (numerical instabilities and other issues). There are better, faster, more stable approaches. (PhD in Applied Math here)

BTW, I have been using Julia for work for several years now and I think it is a fantastic language for scientific computing. For those who use Matlab/Numpy regularly, you should definitely check it out.

2 comments

So what are these better faster and more stable approaches?
It depends a lot of the problem. For a general least squares solution of the Ax=b problem, is your A matrix dense, sparse, structured? Do you need to solve it very accurately or just approximately? Are you going to be using this over and over with the same A but different b's? In that case, you may profit from a one time Cholesky factorization that can be used over and over very quickly. Is your linear operator A explicit (as in you can actually inspect the matrix) or is it a linear operator function in which you can only see the result of A(x) (and its adjoint A_adj(y)) ? In this case, you cannot use common factorizations (QR, cholesky, svd) but will have to use iterative methods (ie. Krylov space methods).

There's a whole branch in applied math that's called numerical linear algebra and deals with all these kinds of situations. One of the most important results is that you should almost never do an explicit computation of the inverse of a matrix. It is expensive and can be highly unstable. If you ever see a A\b or pinv(A)*b in somebody's code, it should raise a big red flag.

A little defense of the pseudoinverse here, it's often a fine tool, far more so than the true inverse.

If the matrix is not full rank the psuedoinverse provides the least squares solution.

It can also be done efficiently and may well already be implemented using whatever numerical method one prefers (SVD, QR).

And the implementation probably allows for regularization via a singular value cutoff.

I'm not sure what matlab's backslash operator does these days, but it looks like a more efficient way to get a least-squares solution (compared to the pseudoinverse), possibly with sparsity imposed slightly.

Instead of

  𝛉 = pinv(𝐗)*y
write

  𝛉 = y\𝐗
It will automatically dispatch to the most appropriate algorithm depending on the type of X (Cholesky, QR, LU/Gauss, …)
As a complete layman the second one, given your explanation, seems to be more generic and declarative. Is that a correct assumption?
Yes, it basically means "solve 𝐗𝛉 = y for 𝛉", for which inverting 𝐗 is one of the many methods that can be used to solve it. MATLAB also has the same backslash operator to mean the same thing.
The former could easily be corrected by a compiler macro, should Julia eventually support that.
I don't know about the pseudoinverse in particular, but at least in the case of normal inversion -- you almost never want to invert a matrix, but sometimes you do, so it should be possible to sensibly express the idea of 'invert' I think.
True, but the specific combination of inversion and multiplication, with a well-known meaning, could be transformed. Standalone inversion would still stay an inversion. Algebraic identities are definitely one thing you might want to exploit with advanced compiler macros.
I guess the only concern I'd have is, is it possible that someone knows what they are doing (so, something is special about their matrix) and wants to do an in-place inversion, multiply, and then use their inverted matrix for something else. I'm don't know Julia, though, so I'm not sure if this would be a weird thing to do. (is invert even in-place?)
Often one is not really interested in solving Ax=b for all possible b but just one particular b. In that case one can iteratively solve for x by algorithms like conjugate gradient which only require multiplying A times the current best guess for x, which can be very fast and avoids numerical instability issues in many cases.
QR, Cholesky
QR and Cholesky factorization (QR is the most versatile and is always stable, so usually a better first choice)
> For those who use Matlab/Numpy regularly, you should definitely check it out.

May I ask how was transition like from Matlab/Numpy to Julia like did you migrate your work or start from the scratch using Julia?