	by murbard2 2108 days ago
	> There is absolutely no training signal that tells an RNN to remember something beyond the BPTT horizon. So why would it?

	Because they generalize. Char-RNNs learn to balance parentheses separated by a longer distance than the BPTT window because they have learned, from parenthetical statements shorter than the BPTT window, that counting parentheses is useful for prediction.
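
	To make the counting argument concrete, here is a rough sketch in plain Python. It is not a trained network: `step` and `run_in_windows` are hypothetical names, the "hidden state" is a hand-written depth counter, and the sketch assumes the forward state is carried across window boundaries while only the gradient is truncated. Under those assumptions, a counting rule that could be learned from short spans keeps working at any distance.

```python
def step(depth, ch):
    """One recurrent update; the 'hidden state' is just the nesting depth."""
    if ch == "(":
        return depth + 1
    if ch == ")":
        return depth - 1
    return depth


def run_in_windows(text, window):
    """Process text in fixed-size chunks, carrying the state across chunks.

    This mimics only the forward pass of truncated BPTT: gradients would stop
    at each chunk boundary, but the state itself is passed on unchanged.
    """
    depth = 0
    depths = []
    for start in range(0, len(text), window):
        for ch in text[start:start + window]:
            depth = step(depth, ch)
            depths.append(depth)
    return depths


# A parenthetical 50 characters long, processed in windows of only 8.
text = "(" + "a" * 50 + ")"
depths = run_in_windows(text, window=8)
print(depths[0], depths[25], depths[-1])  # prints "1 1 0"
```

	The update rule never looks at more than one character at a time, so nothing in it depends on the window size; the window only matters for where gradients would flow during training.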