Probably the main justification is that the analysis and transformation steps needed to compute the vjp and jvp pullbacks of a function (which correspond to reverse- and forward-mode automatic differentiation) require enough of the other machinery of a compiler that they are best done WITHIN a compiler. Then other things become quite natural, too, like producing the tangent vector versions of data structures like tuples and maps!
Moreover, a statically typed language like Swift is a much better starting point for this kind of effort than Python. Array shapes and dimensions are already a type system - you might as well go the whole distance and get all the other safety, readability, and efficiency benefits!
PS shout out for named array axes as the future of array-based (and hence differentiable) programming... see http://nlp.seas.harvard.edu/NamedTensor for a good rationale
that’s basically the conversation that is being had in the linked thread:
should this be done as a special case in the compiler
or
shouldn’t this be a more general meta programming based feature
with those in favor of meta-programming also bringing up the potential complexity/speed loss added to the compiler for this feature
the problem for swift is, there hasn’t been a coherent meta-programming story (something like a meta-programming mega proposal) so it seems hard to push for that instead right now....
Moreover, a statically typed language like Swift is a much better starting point for this kind of effort than Python. Array shapes and dimensions are already a type system - you might as well go the whole distance and get all the other safety, readability, and efficiency benefits!
PS shout out for named array axes as the future of array-based (and hence differentiable) programming... see http://nlp.seas.harvard.edu/NamedTensor for a good rationale