by shoyer
1658 days ago
> Neural networks need completely different optimisation methods, and there is no practically useful application of any of the Newton or Quasi-Newton methods for their optimisation.

I don't think this is quite fair. There are several variations of 2nd order methods, notably KFAC and Shampoo, that seem to be quite effective for large-scale neural network training. For an overview, see the intro of this paper: https://openreview.net/forum?id=-t9LPHRYKmi