Hacker News
by hellohello2 43 days ago
Why would this be true for simulated annealing?
1 comment

Because it improves the loss!

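The negative gradient is the direction in which the loss decreases the fastest. Moving in any direction that has a positive dot product with the negative gradient just means that, for a small enough step, you're (locally) improving the loss: to first order the loss changes by the step size times the dot product of the gradient with your direction, and that quantity is negative.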
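
To make that concrete, here is a small sketch in plain Python. Everything in it is made up for illustration: the toy loss `x^2 + 3y^2`, the starting point, the direction `d`, and the helper names are not from the thread. It reads "the gradient" as the descent direction, i.e. the negative of the mathematical gradient, and checks that a direction with a positive dot product against it lowers the loss for a small step but not necessarily for a large one.

```python
def loss(p):
    # Toy quadratic loss, chosen only for illustration.
    x, y = p
    return x * x + 3.0 * y * y

def grad(p):
    # Analytic gradient of the toy loss (points uphill).
    x, y = p
    return (2.0 * x, 6.0 * y)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def step(p, d, eps):
    # Move from p along direction d by step size eps.
    return tuple(pi + eps * di for pi, di in zip(p, d))

p = (1.0, 1.0)
neg_grad = tuple(-g for g in grad(p))  # steepest-descent direction

# Not the steepest-descent direction, but it still has a positive
# dot product with it.
d = (-1.0, 0.2)

print("dot(d, -grad) =", dot(d, neg_grad))
print("loss at p     =", loss(p))
print("small step    =", loss(step(p, d, 0.01)))  # lower than loss(p)
print("large step    =", loss(step(p, d, 2.0)))   # higher than loss(p)
```

The large step is there to show why the "(locally)" matters: the first-order argument only guarantees an improvement when the step is small enough that the curvature term doesn't dominate.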