by p1esk 2258 days ago
Not sure what you mean by “automatically extracted from a program”; all DL frameworks rely on a manually written backward pass for each op.

I mean the tracing operation that produces a structure appropriate for AD computation. I agree with you that there's work needed to specify the node derivatives.
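
To make the split concrete, here is a minimal sketch of what I mean, assuming a toy reverse-mode setup. The names `Var`, `sin` and `backward` are made up for illustration and are not any framework's API. The per-op local derivatives are written by hand, but the graph itself falls out of simply running the program on traced values.

```python
import math

class Var:
    """A traced value: records its parents and the local derivative w.r.t. each."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # tuple of (parent Var, local derivative)
        self.grad = 0.0

    # The local derivatives below are specified by hand, one rule per op.
    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

def sin(x):
    return Var(math.sin(x.value), ((x, math.cos(x.value)),))

def backward(out):
    """Accumulate d(out)/d(node) into .grad for every node reachable from out."""
    order, seen = [], set()

    def visit(v):
        # Depth-first walk that yields a topological order of the traced graph.
        if id(v) in seen:
            return
        seen.add(id(v))
        for parent, _ in v.parents:
            visit(parent)
        order.append(v)

    visit(out)
    out.grad = 1.0
    # Visit consumers before producers so each node's grad is complete when used.
    for v in reversed(order):
        for parent, local in v.parents:
            parent.grad += local * v.grad

# Running ordinary-looking code on Vars is the "tracing" step.
x, y = Var(2.0), Var(3.0)
z = x * y + sin(x)
backward(z)
print(x.grad, y.grad)  # dz/dx = y + cos(x), dz/dy = x
```

Nobody wrote down the derivative of `z` as a whole; only the rules for `+`, `*` and `sin` were supplied.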

Although, honestly, I misspoke. The difference between AD and symbolic differentiation is more subtle. AD really wins because it keeps a graph of intermediate values (built from the program's AST or a trace of it) and reuses them, while a symbolic method that writes the derivative out as one expression can blow up exponentially (or needs clever, hard-to-generalize tricks to reconstruct that graph).
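
A rough illustration of the blow-up, assuming a deliberately naive symbolic differentiator over nested tuples with no simplification and no sharing of common subexpressions (real computer algebra systems do better than this). The helpers `build`, `diff` and `size` are hypothetical names for this sketch. The program being differentiated repeats `f = sin(f) * f`, which costs 2 ops per step, so an AD trace would hold about 2n intermediate values.

```python
def build(n):
    """Expression for n steps of f <- sin(f) * f, starting from f = x."""
    f = ('x',)
    for _ in range(n):
        f = ('mul', ('sin', f), f)
    return f

def diff(e):
    """Naive symbolic derivative w.r.t. x: no simplification, no sharing."""
    op = e[0]
    if op == 'x':
        return ('const', 1.0)
    if op == 'const':
        return ('const', 0.0)
    if op == 'add':
        return ('add', diff(e[1]), diff(e[2]))
    if op == 'mul':  # product rule
        return ('add', ('mul', diff(e[1]), e[2]), ('mul', e[1], diff(e[2])))
    if op == 'sin':  # chain rule
        return ('mul', ('cos', e[1]), diff(e[1]))
    # 'cos' only appears in outputs here, so it is never differentiated.
    raise ValueError(f"no derivative rule for {op!r}")

def size(e):
    """Node count of the expression when written out in full as a tree."""
    return 1 + sum(size(c) for c in e[1:] if isinstance(c, tuple))

for n in (2, 4, 6, 8, 10):
    # columns: steps, ops in the original program, nodes in the derivative
    print(n, 2 * n, size(diff(build(n))))
```

The middle column grows linearly while the last one more than doubles with every extra step. Recovering the linear-size version from the written-out expression means finding the shared subexpressions again, which is exactly the graph AD never threw away.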