Hacker News
by liuliu 4191 days ago
That's when the learning rate was changed to a smaller value. The graph mainly shows that, with a different initialization scheme, the network descends faster at the start of training.
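
For illustration, a step schedule is one common way the learning rate gets lowered partway through training. This is only a sketch: the function name and the values (base rate, drop interval, factor) are made up here, not taken from the graph being discussed.

```python
def step_lr(epoch, base_lr=0.01, drop_every=30, factor=0.1):
    """Hypothetical step schedule: scale the rate by `factor` every `drop_every` epochs."""
    return base_lr * factor ** (epoch // drop_every)

# The rate is constant within each stage and falls at each stage boundary,
# which is where a loss curve often shows a sudden step down.
schedule = [step_lr(e) for e in (0, 29, 30, 60)]
```

With these assumed defaults, epochs 0 through 29 share one rate and epoch 30 is the first to use the smaller one.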