by voqv 1396 days ago
Is that why it took so long? I was under the impression it was because of vanishing gradients in backprop once you stack a large number of layers (the "deep" in deep neural networks).
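
For anyone who wants to see the shrinking-gradient effect concretely, here is a rough sketch, not taken from any particular paper or library. It assumes a toy chain of one-unit sigmoid layers with a fixed weight of 1.0; the function name `input_gradient` and its parameters are made up for illustration. Since the sigmoid's derivative never exceeds 0.25, each layer multiplies the backpropagated gradient by at most 0.25 in this setup.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def input_gradient(depth, weight=1.0, x=0.5):
    """d(output)/d(input) for a chain of `depth` one-unit sigmoid layers.

    Hypothetical toy model: every layer computes sigmoid(weight * a).
    """
    # Forward pass: remember each activation for the backward pass.
    activations = []
    a = x
    for _ in range(depth):
        a = sigmoid(weight * a)
        activations.append(a)

    # Backward pass: the chain rule contributes one factor of
    # weight * s * (1 - s) per layer, where s is that layer's output.
    grad = 1.0
    for s in reversed(activations):
        grad *= weight * s * (1.0 - s)
    return grad

# Print the gradient reaching the input for a few depths.
for depth in (1, 5, 10, 20):
    print(depth, input_gradient(depth))
```

With these assumptions the gradient at the input drops roughly geometrically with depth. Real networks have many units per layer and learned weights, so this only shows the basic mechanism, not how badly it bites in practice.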