I did a research project on this a while back, and for building intuition about how learning rate, regularization, hidden layers, and activation functions affect a deep network, I don't think anything beats [this little web app](https://playground.tensorflow.org/#activation=tanh&batchSize...).