Hacker News
by wwarner 56 days ago
Sane & interesting enough to have been disproven, by Boaz Barak iirc. Maybe not surprising since simulated annealing never achieved the results of gradient descent + backprop.

You might be trying to be too literal.

What makes statistical mechanics so brilliant is that it takes first-principles ideas (particle energies + ensembles) and uses them to derive the macroscopic thermodynamic rules, all of which were originally established from observation.

What the OP is proposing is that a mathematical analysis of SGD + generic deep learning architectures might be able to derive the rules we have so far only found empirically, from experiments in model training.