Deep Learning via Hessian-free Optimization: http://www.cs.toronto.edu/~jmartens/docs/Deep_HessianFree.pd...
Optimizing Neural Networks with Kronecker-factored Approximate Curvature: http://arxiv.org/abs/1503.05671
James Martens' publication list, with links to sample code for the two papers above, slides, condensed conference versions, etc.: http://www.cs.toronto.edu/~jmartens/research.html
Pretty neat stuff.