| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by thesz 1352 days ago

It is extremely relevant for ML.

I am familiarizing myself with recurrent neural networks and getting them trained online is a pain - I get NaNs all the time except for very small learning rates that actually prevent my networks to learn anything.

The deeper network is, the more pronounced accumulation of errors in online training is. Add 20-30 fully connected (not highway or residual) layers before softmax and you'll see wonders there, you won't be able to have anything stable.