Hacker News
by raindeer2 232 days ago
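The first bit is why it is called stochastic gradient descent. At each step you follow the gradient of a randomly chosen minibatch instead of the gradient of the full dataset. Each minibatch gradient is only a noisy estimate of the true one, so you basically "vibrate" your way downhill: the individual steps jitter, but on average they follow the true gradient.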
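
To make the "vibrate" part concrete, here is a minimal sketch of minibatch SGD on a toy linear regression problem. Everything in it is an assumption picked for illustration, not something from the discussion: the function name `minibatch_sgd`, the synthetic data, the squared-error loss, and the learning rate and batch size.

```python
import numpy as np

def minibatch_sgd(X, y, lr=0.05, batch_size=8, steps=500, seed=0):
    """Fit linear weights w by following noisy minibatch gradients of the MSE."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    losses = []
    for _ in range(steps):
        # Pick a random minibatch; this is the "stochastic" part.
        idx = rng.choice(len(X), size=batch_size, replace=False)
        Xb, yb = X[idx], y[idx]
        # Gradient of the mean squared error on the minibatch only,
        # a noisy estimate of the full-data gradient.
        grad = 2.0 * Xb.T @ (Xb @ w - yb) / batch_size
        w -= lr * grad
        # Track the full-data loss so the jitter is visible.
        losses.append(np.mean((X @ w - y) ** 2))
    return w, losses

# Hypothetical toy data: y = 3*x0 - 2*x1 plus a little noise.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(200, 2))
true_w = np.array([3.0, -2.0])
y = X @ true_w + 0.1 * data_rng.normal(size=200)

w, losses = minibatch_sgd(X, y)
print("estimated weights:", w)
print("first and last full-data loss:", losses[0], losses[-1])
```

If you plot `losses`, the curve should trend downward while wobbling from step to step. Raising `batch_size` toward the full dataset should smooth the wobble out, since the minibatch gradient then approaches the true gradient.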