by superdimwit, 2198 days ago
In the same way that the ReLU derivative is not defined at x=0. In practice this mostly doesn't matter, and you can still get gradient descent to work in a useful way.
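
To make that concrete, here is a rough sketch in plain Python. The names (`relu_grad`, `fit`, the `at_zero` parameter) and the toy problem are made up for illustration, not taken from any library: it fits a single weight through a ReLU, and the value returned for the "derivative" at exactly x=0 is just a convention you pick.

```python
def relu(x: float) -> float:
    return x if x > 0.0 else 0.0


def relu_grad(x: float, at_zero: float = 0.0) -> float:
    """Derivative of ReLU away from 0; at x == 0 return a chosen convention.

    Any value in [0, 1] is a valid subgradient at 0. The default of 0.0
    here is an assumption made for this sketch.
    """
    if x > 0.0:
        return 1.0
    if x < 0.0:
        return 0.0
    return at_zero


def fit(w: float, steps: int = 200, lr: float = 0.05, at_zero: float = 0.0) -> float:
    """Gradient descent on the toy loss (relu(w * x) - y)^2 with x=1, y=2."""
    x, y = 1.0, 2.0
    for _ in range(steps):
        pre = w * x
        # Chain rule, with the ReLU "derivative" supplied by our convention.
        grad = 2.0 * (relu(pre) - y) * relu_grad(pre, at_zero) * x
        w -= lr * grad
    return w


if __name__ == "__main__":
    print(fit(0.5))               # generic start: the kink is never hit
    print(fit(0.5, at_zero=1.0))  # same trajectory, convention is irrelevant
    print(fit(0.0))               # starts exactly on the kink with convention 0
```

From a generic starting point the iterates never land exactly on x=0, so the choice of `at_zero` has no effect and the weight converges toward 2. The one case where it shows up is starting exactly at w=0 with the convention 0, where the gradient is zero and nothing moves; that is the rare situation "most of the time" is glossing over.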