HN Mirror



	by eli_gottlieb 602 days ago
	> Our key insight is that the diagonal linear recurrent layer can act as a gradient accumulator

	So they're sort of reinventing the discrete-time differentiator from signal processing, but parameterized neurally?

1 comment

Converging slowly on Kalman filters, calling it now.