Hacker News
by
wsmoses
1957 days ago
You don't always need the input to compute the gradient. For example, the gradient of a sum function doesn't require the original input values; it just sets the derivative with respect to every input element to 1.
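A minimal sketch of the sum case in plain Python, assuming a hand-written backward function rather than any particular autodiff library (`sum_forward` and `backward` are hypothetical names used only for illustration). The backward function keeps the number of inputs but none of their values:

```python
def sum_forward(xs):
    """Return sum(xs) and a backward function for reverse-mode autodiff."""
    n = len(xs)  # only the length is kept, not the input values

    def backward(grad_out):
        # d(sum)/d(x_i) = 1 for every i, so each input receives grad_out * 1
        return [grad_out] * n

    return sum(xs), backward


total, backward = sum_forward([3.0, -1.5, 7.0])
grads = backward(1.0)  # gradient of the sum with respect to each input
```

Seeding the backward function with 1.0 gives the gradient of the sum itself, which is 1 for each input regardless of what the inputs were.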
1 comment
shoyer
1957 days ago
To be more precise, in reverse-mode auto-diff, an operation's inputs only need to be saved for the backward pass if the operation uses them in a non-linear way.
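To illustrate the linear versus non-linear distinction, here is a rough sketch in plain Python with hand-written backward functions (the names `scale_forward` and `square_forward` are hypothetical, not taken from any library). Scaling by a constant is linear in `x`, so its backward function never refers to `x`; squaring is non-linear, so its backward function has to hold on to `x`:

```python
def scale_forward(x, c):
    """y = c * x is linear in x: backward needs only the constant c."""
    def backward(grad_out):
        # dy/dx = c, which does not depend on x
        return c * grad_out

    return c * x, backward


def square_forward(x):
    """y = x * x is non-linear in x: backward must keep x."""
    saved_x = x  # the input has to be saved for the backward pass

    def backward(grad_out):
        # dy/dx = 2 * x, which depends on the saved input
        return 2.0 * saved_x * grad_out

    return x * x, backward


y1, scale_backward = scale_forward(3.0, 4.0)
y2, square_backward = square_forward(3.0)
dx_linear = scale_backward(1.0)
dx_nonlinear = square_backward(1.0)
```

In a real autodiff system the saved value would typically be a large array, which is why skipping the save for linear operations can matter for memory.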