by srean
270 days ago
> It's also not necessarily immediately obvious that the derivatives ARE wrong if the implementation is wrong.

It's neither a full proof nor foolproof, but an absolute must is checking that the loss function is decreasing. It quickly detects a common error: the sign coming out wrong in my gradient call. It's part of the good practice one learns in grad school.
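To make that concrete, here is a minimal sketch of the check in plain Python. The toy 1-D least-squares problem, the learning rate, and the names `loss`, `grad`, `loss_history` and `is_decreasing` are all made up for illustration; the `sign` argument exists only to simulate the flipped-sign bug.

```python
def loss(w, xs, ys):
    """Mean squared error of the 1-D linear model y ~ w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def grad(w, xs, ys, sign=1.0):
    """Analytic derivative of loss with respect to w.

    sign=-1.0 simulates the bug where the gradient comes out negated.
    """
    return sign * sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def loss_history(grad_fn, xs, ys, w=0.0, lr=0.01, steps=20):
    """Run plain gradient descent and record the loss after every step."""
    history = [loss(w, xs, ys)]
    for _ in range(steps):
        w -= lr * grad_fn(w, xs, ys)
        history.append(loss(w, xs, ys))
    return history


def is_decreasing(history):
    """True if every recorded loss is strictly smaller than the previous one."""
    return all(b < a for a, b in zip(history, history[1:]))


# Toy data generated from y = 2x, so the optimum is w = 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

good = loss_history(grad, xs, ys)
bad = loss_history(lambda w, xs, ys: grad(w, xs, ys, sign=-1.0), xs, ys)

print("correct gradient, loss decreasing:", is_decreasing(good))
print("flipped sign, loss decreasing:", is_decreasing(bad))
```

With the correct gradient the loss shrinks at every step; with the sign flipped it grows, so the check fails immediately. A strict step-by-step decrease is only a fair expectation for full-batch descent with a small enough step size. With minibatches or a large learning rate you would look at the trend instead. And as said, passing this check does not prove the derivative is right, since a gradient that is merely mis-scaled can still drive the loss down.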