by nestorD
2313 days ago
I have not read the paper yet but, reading the abstract, the idea of using continuations to store the differentiation information is reminiscent of the technique used in Zygote [0]. Is there some kinship between the two ideas?

[0]: https://arxiv.org/abs/1810.07951v4
> The implementation proposed by Pearlmutter and Siskind returns a pair of a value and a backpropagator ... Doing this correctly requires a non-local program transformation ... Further tweaks are required if a lambda uses variables from an outer scope ... In contrast to Pearlmutter and Siskind [2008], using delimited continuations enables reverse-mode AD with only local transformations. Any underlying non-local transformations are implicitly resolved by shift and reset.
I'll have to look more carefully to understand how the CPS version avoids non-local program transformations.
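
For what it's worth, here is how I currently picture it: a minimal Python sketch with the continuations written out by hand, not the paper's code and not Zygote's. The names `Num`, `add`, `mul` and `grad` are made up for illustration. Each operator computes its result, calls the continuation `k` (the rest of the forward pass), and only after `k` returns does it update the gradients of its own inputs. As far as I can tell from the quote, shift/reset would hide the explicit `k` argument, which I am passing manually here.

```python
class Num:
    """A value paired with a mutable gradient accumulator (hypothetical name)."""
    def __init__(self, x, d=0.0):
        self.x = x  # primal value
        self.d = d  # accumulated gradient


def add(a, b, k):
    y = Num(a.x + b.x)
    k(y)         # run the rest of the computation first
    a.d += y.d   # backward step runs once the continuation has returned
    b.d += y.d


def mul(a, b, k):
    y = Num(a.x * b.x)
    k(y)
    a.d += b.x * y.d
    b.d += a.x * y.d


def grad(f):
    """Derivative of a one-argument function written in CPS."""
    def df(x0):
        x = Num(x0)

        def seed(y):
            y.d = 1.0  # final continuation: seed the output gradient

        f(x, seed)
        return x.d
    return df


# f(x) = x*x + x, written in explicit continuation-passing style
def f(x, k):
    mul(x, x, lambda t: add(t, x, k))


print(grad(f)(3.0))  # derivative is 2*x + 1
```

Each operator only touches its own arguments and result, so the rewrite stays local; the ordering of the backward pass falls out of the call stack unwinding.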