by goosedragons
278 days ago
That's true! But any program dealing with floats can run into numerical instability if care isn't taken to avoid it, no? It's also not necessarily obvious that the derivatives ARE wrong when the implementation is wrong.
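To illustrate how a wrong derivative can hide in plain sight, here is a minimal sketch of a finite-difference gradient check. Everything in it is made up for illustration: the toy quadratic `loss`, the deliberately buggy `grad_buggy` (a dropped factor of 2), and the `eps` and `tol` values are assumptions, not anything from a particular library.

```python
def loss(w):
    # Hypothetical toy loss: a quadratic with its minimum at w = 3.
    return (w - 3.0) ** 2


def grad_correct(w):
    # Analytic derivative of the toy loss.
    return 2.0 * (w - 3.0)


def grad_buggy(w):
    # Plausible-looking but wrong: the factor of 2 was dropped.
    # It still has the right sign, so training would still "work".
    return w - 3.0


def numerical_grad(f, w, eps=1e-6):
    # Central difference approximation of df/dw.
    return (f(w + eps) - f(w - eps)) / (2.0 * eps)


def gradient_matches(f, g, w, tol=1e-4):
    # Compare analytic and numerical gradients with a relative tolerance.
    num = numerical_grad(f, w)
    ana = g(w)
    scale = max(1.0, abs(num), abs(ana))
    return abs(num - ana) / scale < tol


print(gradient_matches(loss, grad_correct, 0.5))
print(gradient_matches(loss, grad_buggy, 0.5))
```

The buggy gradient still points downhill, so nothing obviously breaks; only the comparison against the numerical estimate flags it. The choice of `eps` is itself a float trade-off: too large and the approximation is poor, too small and rounding error dominates.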
|
It's neither a full proof nor foolproof, but an absolute must is a check that the loss function is decreasing. It quickly catches a common error: the sign coming out wrong in my gradient computation. It's part of the good practice one learns in grad school.
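A rough sketch of that check, assuming plain gradient descent on a toy quadratic (the `loss`, `grad`, learning rate, and step count are all hypothetical choices for illustration):

```python
def loss(w):
    # Hypothetical toy loss with its minimum at w = 3.
    return (w - 3.0) ** 2


def grad(w):
    # Correct analytic gradient of the toy loss.
    return 2.0 * (w - 3.0)


def run_descent(grad_fn, w0=0.0, lr=0.1, steps=20):
    # Plain gradient descent; records the loss before and after each step.
    w = w0
    history = [loss(w)]
    for _ in range(steps):
        w = w - lr * grad_fn(w)
        history.append(loss(w))
    return history


def loss_is_decreasing(history):
    # True if the loss never goes up from one step to the next.
    return all(b <= a for a, b in zip(history, history[1:]))


good = run_descent(grad)
bad = run_descent(lambda w: -grad(w))  # simulated sign error

print(loss_is_decreasing(good))
print(loss_is_decreasing(bad))
```

With the sign flipped, the loss grows on every step, so the check fails immediately. With noisy minibatch losses a strict step-by-step comparison would be too harsh; comparing a smoothed or averaged loss is probably the more realistic version.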