by agibsonccc 4359 days ago
I would just like to link to my earlier comments for people who may be curious:

https://news.ycombinator.com/item?id=7803101

I will also add that looking into Hessian-free optimization as an alternative to conjugate gradient/LBFGS/SGD for training feed-forward nets has proven to be amazing[1].
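
For anyone who hasn't seen Hessian-free before, here is a rough sketch of the core idea: take a Newton-style step by solving H d = -g with conjugate gradient, using only Hessian-vector products so the Hessian is never built. This is not the method from the paper in [1]; the function names, the finite-difference product, and the toy quadratic are all my own assumptions for illustration, and real implementations typically use exact curvature-vector products and adaptive damping instead.

```python
import numpy as np

def hessian_vector_product(grad_fn, w, v, eps=1e-6):
    # Finite-difference approximation of H @ v (an assumption of this sketch);
    # the Hessian itself is never formed.
    return (grad_fn(w + eps * v) - grad_fn(w)) / eps

def hessian_free_step(grad_fn, w, damping=0.0, cg_iters=50, tol=1e-10):
    # Approximately solve (H + damping * I) d = -g with conjugate gradient,
    # then return the updated parameters w + d.
    g = grad_fn(w)
    d = np.zeros_like(w)
    r = -g                      # residual of the linear system at d = 0
    p = r.copy()                # first search direction
    rs_old = r @ r
    for _ in range(cg_iters):
        if rs_old < tol:        # residual small enough: stop
            break
        Ap = hessian_vector_product(grad_fn, w, p) + damping * p
        alpha = rs_old / (p @ Ap)
        d = d + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return w + d

# Toy problem (hypothetical): f(w) = 0.5 * w^T A w - b^T w, so grad f(w) = A w - b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
w_new = hessian_free_step(lambda w: A @ w - b, np.zeros(2))
print(w_new)
```

On a convex quadratic like this a single step should land close to the minimizer, because the CG solve is then solving the Newton system nearly exactly. On a real net the Hessian can be indefinite, which is why a positive `damping` (or a Gauss-Newton approximation) is usually needed.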

I'm still playing with recursive nets, but Socher's work[2] used LBFGS for them just fine.

[1]: http://www.cs.toronto.edu/~rkiros/papers/shf13.pdf

[2]: http://socher.org/