Hacker News new | ask | show | jobs
by mjw 2775 days ago
This is very neat. That said the reason these methods haven't received much attention so far is that relatively few people actually need to compute Jacobeans or Hessians directly.

Often only Hessian-vector products or Jacobean-vector products are required, and these can be computed via more standard autodiff techniques, usually a lot more efficiently than if you were to compute the Hessian or Jacobean directly.

Also for models with lots of parameters, the Jacobean and Hessian are usually impractically large to realise in memory (N^2 in the number of parameters).

Nevertheless the symbolic tensor calculus approach is very appealing to me. For one thing it could make it a lot easier to see in a more readable symbolic notation what the gradient computations look like in standard backprop, and could perhaps make it easier to implement powerful symbolic optimizations.

1 comments

True, for large-scale problems Hessian-vector products are often the way to go (or completely ignoring second order information). However, computing first an expression for the Hessian symbolically and then taking the product with a vector is still more efficient than using autodiff for Hessian-vector products. It is not in the paper though. But the gain is rather small (just a factor of two or so). But true, only for problems involving up to a few thousand parameters computing Hessians or Jacobians is useful.