by ericjang
2782 days ago
Yes, a recent exciting phenomenon of interest to researchers is how and why the spectrum of the Hessian appears to separate into two parts: a "bulk" that changes very slowly and a few "outliers" that change quickly. This suggests that only a few weights in the model actually change during training. If one could determine which weights these are, it might lead to faster and more efficient learning algorithms that don't have to backprop to all the parameters in a large neural network. https://arxiv.org/pdf/1706.04454.pdf https://openreview.net/forum?id=ByeTHsAqtX
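For anyone who wants to poke at a Hessian spectrum directly, here is a minimal sketch. It is not the setup from the linked papers: it assumes a toy softmax regression (no hidden layers) on random data, chosen only because its Hessian has a closed form and is small enough to diagonalize exactly. The names `softmax_regression_hessian` and `hessian_spectrum` are made up for this illustration.

```python
import numpy as np


def softmax(z):
    # Row-wise softmax, shifted by the row max for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def softmax_regression_hessian(W, X):
    """Exact Hessian of the mean cross-entropy loss w.r.t. W.ravel().

    W has shape (k, d) and X has shape (n, d). For this model the Hessian
    does not depend on the labels, only on the predicted probabilities.
    """
    n, d = X.shape
    k = W.shape[0]
    P = softmax(X @ W.T)  # predicted class probabilities, shape (n, k)
    H = np.zeros((k * d, k * d))
    for p, x in zip(P, X):
        A = np.diag(p) - np.outer(p, p)   # Hessian w.r.t. the logits
        H += np.kron(A, np.outer(x, x))   # chain rule through z = W x
    return H / n


def hessian_spectrum(W, X):
    # eigvalsh is for symmetric matrices and returns eigenvalues in ascending order.
    return np.linalg.eigvalsh(softmax_regression_hessian(W, X))


# Toy problem (assumed sizes): 200 random points, 10 features, 3 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
W = rng.normal(size=(3, 10))

eigs = hessian_spectrum(W, X)
print("largest eigenvalues:", eigs[-5:])
print("median eigenvalue:  ", np.median(eigs))
```

For a real network the full Hessian is far too large to build like this, so in practice one would estimate the spectrum from Hessian-vector products (for example with Lanczos or power iteration) rather than materializing the matrix.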