Hacker News
by abrichr 4016 days ago
Neural networks are notoriously difficult to train due to the large number of hyper-parameters that need to be tuned. If your network never converged, it's possible your learning rate was too high, so each update kept overshooting the minimum of the loss function instead of settling into it.
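
To make the overshooting concrete, here's a toy sketch. It isn't a neural network, just plain gradient descent on f(x) = x**2, and the function name and both learning rates are made up for illustration:

```python
def gradient_descent(lr, steps=20, x0=1.0):
    """Minimize f(x) = x**2 with plain gradient descent; return all iterates."""
    xs = [x0]
    x = x0
    for _ in range(steps):
        grad = 2.0 * x       # derivative of x**2
        x = x - lr * grad    # standard update: step against the gradient
        xs.append(x)
    return xs


# Hypothetical learning rates chosen to show both regimes.
small = gradient_descent(lr=0.1)  # each step scales x by (1 - 2 * 0.1) = 0.8
large = gradient_descent(lr=1.1)  # each step scales x by (1 - 2 * 1.1) = -1.2

print(abs(small[-1]))  # shrinks toward the minimum at 0
print(abs(large[-1]))  # grows: every step jumps past 0 and lands farther away
```

With the larger rate the iterate flips sign on every step and its magnitude keeps growing, which is the "kept overshooting" behaviour. A real loss surface is far messier, but the same thing can happen along its steepest directions.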

Quite possibly. The classification results were great; the network just didn't do well when I tried to run things back through it repeatedly. I also found that the learning rates reported in some of the original papers didn't match the ones in the released code.