by jrevels
2773 days ago
Author here; the arXiv version can be found at https://arxiv.org/abs/1810.08297. Not much different from OP's linked version, but it includes citations to other interesting Julia AD/TPU-related papers that utilize this technique. Happy to answer any questions, at least until I turn in for the night :)

AD is basically a code transformation method.
What's the most notable way the GPU in particular comes into play?
How does caching come into play? What about intrinsic condensing functions?