by mark_l_watson
4962 days ago
That is correct. The problem is that the gradients get smaller and smaller as you backpropagate toward the input layer, so learning in the early layers of the net is slow. Hinton has a lot of good material about this in his Coursera lectures.
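To make the shrinking concrete, here is a rough sketch, not anything from the lectures: a toy chain of single sigmoid units, one per layer, all sharing a weight of 1.0. The function name `weight_gradients`, the input value, and the weight are my own assumptions for illustration. Since the sigmoid's derivative never exceeds 0.25, each step back toward the input multiplies the gradient by a factor below 0.25 in this setup.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def weight_gradients(n_layers=10, x=0.5, w=1.0):
    """Return |d(output)/d(w_k)| for each layer k, input side first.

    Toy setup (an assumption for illustration): one sigmoid unit per
    layer, every layer using the same scalar weight w.
    """
    # Forward pass: keep every activation, starting with the input.
    activations = [x]
    for _ in range(n_layers):
        activations.append(sigmoid(w * activations[-1]))

    # Backward pass: delta starts as d(output)/d(output) = 1.
    delta = 1.0
    grads = []
    for k in range(n_layers, 0, -1):
        a_out = activations[k]
        a_in = activations[k - 1]
        delta *= a_out * (1.0 - a_out)   # through the sigmoid: d(output)/d(z_k)
        grads.append(abs(delta * a_in))  # gradient for this layer's weight
        delta *= w                       # through the weight: d(output)/d(a_{k-1})
    return grads[::-1]


if __name__ == "__main__":
    for k, g in enumerate(weight_gradients(), start=1):
        print(f"layer {k:2d}: {g:.3e}")
```

Running it prints one gradient magnitude per layer, with the first layer's value several orders of magnitude below the last layer's. Real nets have many units per layer and learned weights, so the exact rate differs, but the multiplicative shrinkage is the same mechanism.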