by mturmon 901 days ago
Yes.

And beyond this intuition (escape from local optima), annealing matters because you can show that, under certain conditions and with the right annealing schedule (a rather slow one, T ~ 1/log(Nepoch) iirc), you will converge to the global optimum.

I'm not well-versed enough to recall the conditions, but it wouldn't surprise me if they are quite restrictive and/or hard to implement (e.g., with no explicit guidance on how to choose a specific temperature).
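
To make the schedule concrete, here is a rough sketch of Metropolis-style simulated annealing with logarithmic cooling, T_k = c / log(k + 1). Everything named here is an assumption for illustration: the constant `c`, the toy energy landscape, the neighbor move, and the function names are all hypothetical, and this is not a claim about how any particular library does it. As far as I recall, the classical results need `c` to be large relative to the depth of the deepest non-global minimum, which is exactly the kind of thing you rarely know in practice.

```python
import math
import random


def log_temperature(k, c=1.0):
    """Logarithmic cooling: T_k = c / log(k + 1) for step k >= 1."""
    return c / math.log(k + 1)


def anneal(energy, neighbor, x0, n_steps, c=1.0, seed=0):
    """Metropolis simulated annealing using the logarithmic schedule above."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for k in range(1, n_steps + 1):
        t = log_temperature(k, c)
        cand = neighbor(x, rng)
        e_cand = energy(cand)
        delta = e_cand - e
        # Always accept downhill moves; accept uphill moves
        # with probability exp(-delta / t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x, e = cand, e_cand
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


def toy_energy(i):
    # Hypothetical landscape on states 0..20:
    # local minimum at 5 (energy 0), global minimum at 15 (energy -1).
    if i < 10:
        return 0.1 * (i - 5) ** 2
    return 0.1 * (i - 15) ** 2 - 1.0


def toy_neighbor(i, rng):
    # Propose a step of +/-1, clipped to the state range.
    return min(20, max(0, i + rng.choice((-1, 1))))


if __name__ == "__main__":
    # Start in the local minimum and see where the chain ends up.
    best_x, best_e = anneal(toy_energy, toy_neighbor, x0=5, n_steps=20000, c=2.0)
    print(best_x, best_e)
```

Note how slowly the temperature drops: going from step 1,000 to step 1,000,000 only halves it. That is the price of the convergence guarantee, and it is why practical implementations usually fall back on faster (e.g., geometric) schedules that give up the guarantee.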