Hacker News
by juliangoldsmith 504 days ago
What is it that makes higher order derivatives less useful at high dimensionality? Is it related to the Curse of Dimensionality, or maybe something like exploding gradients at higher orders?

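In n dimensions, the first derivative (the gradient) is an n-element vector. The second derivative (the Hessian) is an n x n matrix, symmetric for any reasonably smooth function. As n grows, the computation needed to estimate that matrix grows at least as fast as n^2, and the computation needed to use it can grow faster still: solving a dense linear system with it, as a Newton step does, costs on the order of n^3.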
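
To put rough numbers on that, here is a small sketch. The helper name `derivative_sizes` and the 8 bytes per entry (double precision) are my own assumptions for illustration, nothing more.

```python
def derivative_sizes(n, bytes_per_entry=8):
    """Entry counts for the gradient and Hessian of a function of n variables.

    bytes_per_entry=8 assumes double-precision floats (an assumption).
    """
    hessian_entries = n * n
    return {
        "gradient_entries": n,
        "hessian_entries": hessian_entries,
        # Symmetry means only the upper triangle is distinct.
        "hessian_unique_entries": n * (n + 1) // 2,
        "hessian_bytes": hessian_entries * bytes_per_entry,
    }


for n in (10, 1_000, 1_000_000):
    sizes = derivative_sizes(n)
    print(n, sizes["gradient_entries"], sizes["hessian_entries"], sizes["hessian_bytes"])
```

With a million parameters the dense Hessian has 10^12 entries, about 8 TB at double precision, while the gradient is still only a million numbers.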

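In practice, clever optimisation algorithms that use the 2nd derivative won't actually form this matrix. They tend to work with Hessian-vector products, or with a low-rank approximation built up from successive gradients, instead.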
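
One common way to avoid forming the matrix is to compute only the product of the Hessian with a vector. A minimal sketch, assuming you already have a gradient function: take a central difference of the gradient along the direction v. The test function, the name `hessian_vector_product` and the step size `eps` are illustrative choices of mine, not taken from any particular library.

```python
def grad(x):
    # Gradient of the illustrative function
    # f(x) = sum(x_i**4) / 4 + (sum(x_i))**2 / 2,
    # whose Hessian is dense: diag(3 * x_i**2) plus a matrix of all ones.
    s = sum(x)
    return [xi ** 3 + s for xi in x]


def hessian_vector_product(grad_fn, x, v, eps=1e-5):
    # H v is approximately (g(x + eps*v) - g(x - eps*v)) / (2*eps).
    # This costs two gradient evaluations and O(n) memory;
    # the n x n Hessian is never built.
    x_plus = [xi + eps * vi for xi, vi in zip(x, v)]
    x_minus = [xi - eps * vi for xi, vi in zip(x, v)]
    g_plus = grad_fn(x_plus)
    g_minus = grad_fn(x_minus)
    return [(gp - gm) / (2 * eps) for gp, gm in zip(g_plus, g_minus)]


x = [1.0, 2.0, 3.0]
v = [1.0, 0.0, -1.0]
hv = hessian_vector_product(grad, x, v)

# Exact product for this particular f, for comparison.
exact = [3 * xi ** 2 * vi + sum(v) for xi, vi in zip(x, v)]
print(hv)
print(exact)
```

The finite difference is only approximate and sensitive to the choice of `eps`. Automatic differentiation can give the same product without that error, but the point is the same either way: the cost scales with n rather than n^2.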