by tfgg
3560 days ago
It sounds like Tegmark is pointing out a fairly obvious, deliberately designed property of LSTMs: the entire point of them is to avoid exponentially decaying or exploding gradients and to allow information to propagate over longer time scales.
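
To make that concrete, here is a rough toy sketch (not anything from Tegmark's work, and not a real LSTM implementation). It compares how a gradient shrinks through a scalar vanilla RNN against the additive cell-state path of an LSTM. The function names, the weight 0.5, and the forget-gate value 0.99 are all made up for illustration, and the LSTM gates are assumed constant so that only the direct cell-state path is counted.

```python
import math

def vanilla_rnn_grad(w, h0, steps):
    """d h_T / d h_0 for a scalar RNN h_t = tanh(w * h_{t-1})."""
    h, grad = h0, 1.0
    for _ in range(steps):
        h = math.tanh(w * h)
        # Chain rule: each step multiplies by w * tanh'(w * h_{t-1}) = w * (1 - h_t^2).
        grad *= w * (1.0 - h * h)
    return grad

def lstm_cell_grad(forget_gate, steps):
    """d c_T / d c_0 along c_t = f * c_{t-1} + i * g, with the gates held constant.

    Simplifying assumption: the gates do not depend on earlier states,
    so each step contributes only the factor f.
    """
    return forget_gate ** steps

# Hypothetical values chosen only to show the difference in scale.
rnn_grad = vanilla_rnn_grad(w=0.5, h0=0.5, steps=100)
cell_grad = lstm_cell_grad(forget_gate=0.99, steps=100)
print(f"vanilla RNN gradient after 100 steps: {rnn_grad:.3e}")
print(f"LSTM cell-state gradient after 100 steps: {cell_grad:.3f}")
```

With the forget gate near 1, the per-step factor on the cell-state path stays near 1, so the gradient survives many steps; in the vanilla RNN every step multiplies by a factor of at most 0.5 here, so it collapses exponentially.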