by londons_explore
808 days ago
I think the reason this hasn't been done is that you have no way to decide, at train time, how many recursions are necessary. And if you pick a random number, or try many different levels of recursion, you 'blur' the output. I.e. the output of a layer doesn't know whether it should be outputting info important for the final result, or the output that is the best possible input to another round of recursion.
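To make the 'blur' concrete, here is a toy sketch in plain Python. Everything in it is made up for illustration (`shared_layer`, `forward`, the scalar weights `w` and `b`, and `max_depth` are hypothetical names, not from any real model), and it only shows the forward pass, not training. The point is that with shared weights and a randomly sampled depth, the exact same activation is the final answer on one training step and merely the input to another pass on the next.

```python
import math
import random


def shared_layer(h, w, b):
    """One weight-shared block, applied elementwise: x -> tanh(w * x + b)."""
    return [math.tanh(w * x + b) for x in h]


def forward(h, w, b, depth):
    """Apply the same block `depth` times and return every intermediate state."""
    states = []
    for _ in range(depth):
        h = shared_layer(h, w, b)
        states.append(h)
    return states


rng = random.Random(0)      # fixed seed so the sketch is repeatable
x = [0.5, -1.0, 2.0]        # toy input vector
w, b = 0.8, 0.1             # hypothetical shared weights (scalars for brevity)
max_depth = 4

# Train-time choice discussed above: sample how many recursions to run.
depth = rng.randint(1, max_depth)
states = forward(x, w, b, max_depth)
final_output = states[depth - 1]

# With depth 1, the first pass IS the final output.
# With depth 3, that identical vector is only an intermediate input.
as_final = forward(x, w, b, 1)[-1]
as_intermediate = forward(x, w, b, 3)[0]
same_activation = as_final == as_intermediate
```

Since `as_final` and `as_intermediate` come from the same computation, a loss applied at a random depth pulls that one activation toward two different jobs at once, which is the conflict I mean.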