by 6d65
1958 days ago
You might be right; I seem to have confused the differentiation approach used by PyTorch (and TF eager mode) with numerical methods. I'm building a PyTorch-inspired ML framework, and indeed each op node also defines a backward pass, which is a hand-written definition of that op's derivative. Walking backwards over the op graph and combining the per-op derivatives via the chain rule to get the final gradient does look like an analytical method evaluated at runtime rather than a numerical one. The advantage of a fully automatic AD would be not having to define the backward pass for each op, with the function that computes the derivative being generated at compile time. I've let the project marinate for a bit, so the little knowledge I had is fading away.
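As a rough illustration of the per-op backward pass described above, here is a minimal scalar sketch in plain Python. It is not the actual framework code or PyTorch's implementation; the names `Value`, `backward_fn`, and `backward` are made up for this example, and only addition and multiplication are covered.

```python
class Value:
    """A scalar node in the op graph (hypothetical name, for illustration)."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self.parents = parents
        self.backward_fn = None  # hand-written derivative, set by each op

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def backward_fn():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out.backward_fn = backward_fn
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def backward_fn():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out.backward_fn = backward_fn
        return out

    def backward(self):
        # Build a topological order so each node's gradient is complete
        # before it is propagated to its parents.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node.parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0  # dz/dz
        for node in reversed(order):
            if node.backward_fn is not None:
                node.backward_fn()  # chain rule, one op at a time


x = Value(3.0)
y = Value(4.0)
z = x * y + x
z.backward()
print(z.data, x.grad, y.grad)
```

Here `z = x*y + x`, so the analytical derivatives are `dz/dx = y + 1` and `dz/dy = x`. The gradients come out exact because every op carries its own derivative rule, with no finite-difference step involved.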