by 6d65 1958 days ago

You might be right; I seem to have confused the differentiation approach used by PyTorch (and TF eager mode) with numerical methods.

I'm making a PyTorch-inspired ML framework, and indeed, each op node also defines a backward pass, which is a hand-written definition of that op's derivative. Going backwards over the op graph and combining the derivatives of each op via the chain rule to get the final gradient does indeed look like a runtime analytical method rather than a numerical one.
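A minimal sketch of that idea, assuming scalar values and only two ops. This is not the framework's actual code; the `Value` class, the function `f`, and the step size `h` are hypothetical names made up for illustration. Each op stores a hand-written backward rule, the graph is walked in reverse to apply the chain rule, and the result is compared against a finite-difference estimate, which is what a numerical method would do instead.

```python
class Value:
    """Scalar node in an op graph; each op records its own backward rule."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = backward
        return out

    def backward(self):
        # Build a topological order, then apply each op's rule in reverse.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


def f(x, y):
    # Works on Value nodes and on plain floats alike.
    return x * y + x * x


# Analytical gradient, computed at runtime by walking the op graph.
x, y = Value(3.0), Value(4.0)
z = f(x, y)
z.backward()

# Numerical estimate of df/dx via central differences, for comparison.
h = 1e-6
numeric_dx = (f(3.0 + h, 4.0) - f(3.0 - h, 4.0)) / (2 * h)
```

Here `x.grad` comes out of the chain rule exactly (df/dx = y + 2x), while `numeric_dx` only approximates it by evaluating `f` at two nearby points, which is the distinction between the two approaches.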

The advantage of a fully automatic AD is not having to define the backward pass for each op by hand, since the function that calculates the derivative is generated at compile time.

I've let the project marinate for a while, so the little knowledge I had is fading away.