by sdenton4 36 days ago
Because it improves the loss!

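The gradient is the direction in which the loss increases the fastest, so the negative gradient is the direction in which it improves the fastest. Moving in any direction with a positive dot product with the negative gradient just means that you're (locally) improving the loss, as long as the step is small enough.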
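
To make that concrete, here's a rough numerical sketch. The quadratic loss, the `target` vector, the step size, and the 0.01 cosine margin are all made up for illustration; nothing here comes from a particular model. It samples random directions and compares the sign of the loss change with the sign of the dot product against the negative gradient.

```python
import numpy as np

# Hypothetical toy loss: squared distance to an arbitrary target point.
target = np.array([1.0, -2.0, 3.0])

def loss(w):
    return float(np.sum((w - target) ** 2))

def grad(w):
    # Analytic gradient of the toy loss above.
    return 2.0 * (w - target)

w = np.zeros(3)   # current parameters
g = grad(w)       # gradient: direction of steepest increase
step = 1e-3       # small step, so the first-order picture holds

def loss_change(direction):
    # Change in loss after a small step along the normalized direction.
    d = direction / np.linalg.norm(direction)
    return loss(w + step * d) - loss(w)

def check_directions(n, seed=0):
    # Return (cosine with the negative gradient, loss change) for n random directions.
    rng = np.random.default_rng(seed)
    out = []
    for _ in range(n):
        d = rng.normal(size=w.shape)
        cosine = float(np.dot(d, -g) / (np.linalg.norm(d) * np.linalg.norm(g)))
        out.append((cosine, loss_change(d)))
    return out

results = check_directions(200)

# Directions that are almost orthogonal to the gradient are skipped:
# there the second-order term can outweigh the first-order one.
descending = [change for cosine, change in results if cosine > 0.01]
ascending = [change for cosine, change in results if cosine < -0.01]

print("all positive-dot-product directions lower the loss:",
      all(change < 0 for change in descending))
print("all negative-dot-product directions raise the loss:",
      all(change > 0 for change in ascending))
```

The margin around zero is the "locally" caveat in practice: for a direction nearly orthogonal to the gradient, the first-order gain is tiny and curvature can cancel it unless the step shrinks further.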