Hacker News
by T_D_K 3165 days ago
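Based on the limited amount of information, I'm assuming that by "training explodes" you mean that your loss diverges (grows without bound) instead of gradient descent settling into a local minimum. Try lowering your learning rate? If the step size is too large, each update can "step over" the minimum and land farther away than where it started.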
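To illustrate the "stepping over" idea, here is a minimal sketch on a toy problem, f(x) = x**2, which has nothing to do with your actual model. The function name `gradient_descent` and the two learning rates are made up for the example. On this function each update multiplies x by (1 - 2 * lr), so a small rate shrinks x toward the minimum at 0 while a rate above 1.0 makes it grow every step.

```python
def gradient_descent(lr, steps=20, x0=1.0):
    """Minimize the toy function f(x) = x**2 with plain gradient descent.

    Hypothetical helper for illustration only; returns the final x.
    """
    x = x0
    for _ in range(steps):
        grad = 2.0 * x       # derivative of x**2
        x = x - lr * grad    # standard gradient descent update
    return x


small = gradient_descent(lr=0.1)  # each step multiplies x by 0.8, so x shrinks
large = gradient_descent(lr=1.1)  # each step multiplies x by -1.2, so x blows up

print("lr=0.1 ->", small)
print("lr=1.1 ->", large)
```

With the larger rate, x flips sign and grows in magnitude on every step, which is roughly what a diverging loss looks like in a real training run. A real network is not a 1-D quadratic, so treat this as intuition rather than a diagnosis.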