by imtringued 205 days ago
Matrix-free generally refers to using "X-vector product" operators, where X is something like the Jacobian or Hessian, without ever materializing the full Jacobian or Hessian matrix. The big X operator is split into a chain of smaller operators, and you apply it by computing their products with the vector one after another. This doesn't necessarily mean there are no matrices involved: the smaller operators can still be ordinary matrix-vector products.
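
To make that concrete, here is a rough NumPy sketch. The chain J = W2 · diag(d) · W1 is my own made-up example (think of the Jacobian of a two-layer network with an elementwise nonlinearity), and the names `W1`, `d`, `W2` and `jvp` are assumptions for illustration, not anything from a particular library.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical chain: the "big" operator is J = W2 @ diag(d) @ W1.
W1 = rng.standard_normal((64, 32))
d = rng.standard_normal(64)           # stands in for the nonlinearity's derivative
W2 = rng.standard_normal((16, 64))

def jvp(v):
    """Apply J to v one small operator at a time; J itself is never built."""
    u = W1 @ v        # small matrix-vector product
    u = d * u         # diagonal operator applied elementwise, no matrix stored
    return W2 @ u     # small matrix-vector product

v = rng.standard_normal(32)
matrix_free = jvp(v)

# For checking only: materialize J explicitly and compare.
J = W2 @ (d[:, None] * W1)
dense = J @ v
max_abs_diff = float(np.max(np.abs(matrix_free - dense)))
```

The two results should agree up to floating point rounding. The individual steps are still matrix-vector products, but the 16x32 matrix J only exists in the check at the end.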

In fact, one of the big benefits of splitting your big matrix into a series of small matrix-vector products is that some of those products are parameterized and some are not, or at least they share the same parameters across many matrix-vector products. For the shared operators you can stack the vectors and perform a single matrix-matrix multiplication instead of many matrix-vector products. This is particularly evident in batched training of neural networks, where the same weights are applied to every sample in the batch.
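
A small sketch of the batching point, again in NumPy and again with made-up names: `W` is a weight matrix shared by the whole batch, while `D` holds a different diagonal operator per sample (for instance activation derivatives). The shapes and values are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

W = rng.standard_normal((64, 32))   # shared across the batch
X = rng.standard_normal((8, 32))    # 8 input vectors, one per row
D = rng.standard_normal((8, 64))    # one diagonal operator per sample

# Naive version: one matrix-vector product per sample.
per_sample = np.stack([D[i] * (W @ X[i]) for i in range(X.shape[0])])

# Batched version: the shared operator becomes a single matrix-matrix
# product, and the per-sample diagonals become an elementwise multiply.
batched = D * (X @ W.T)

same_result = bool(np.allclose(per_sample, batched))
```

Only the shared operator collapses into a matrix-matrix product here. The per-sample part stays a cheap elementwise operation, which is why it pays to keep the two kinds of operator separate.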