by awaythrowact
1792 days ago
If the next machine learning killer-app model requires autodiff'ed dynamic control flow, do you think Google/Facebook will build that capability into XLA/TorchScript? It seems like if NLP SOTA requires dynamic control flow, Google will build it. Maybe they would let you declare some subgraph as "dynamic" to avoid static optimizations? Or maybe the static graph assumption is so deeply embedded in the XLA architecture that they'd be better off just adopting Julia? (I honestly don't know the answer, I'm asking for your opinion!)
To get the more flexible form, you really would want to do it in a way that uses a full programming language's IR as its target. I think trying to use the IR of a fully dynamic programming language (Python, R, etc.) directly would be pretty insane, because it would be hard to enforce rules and get performance. So some language that has a front end over an optimizing compiler (LLVM) would probably make the most sense. Zygote and Diffractor use Julia's IR, but there are other ways to do this as well. Enzyme (https://github.com/wsmoses/Enzyme.jl) works on the LLVM IR directly to do source-to-source translations. Using an MLIR dialect (MLIR is part of the LLVM project) might be an interesting place to write a more ML-focused flexible AD system. Swift for TensorFlow used the Swift IR. Seen this way, it starts to become clear why those tools were chosen.
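To make "autodiff'ed dynamic control flow" concrete, here is a minimal sketch in plain Python. It is a toy tape-based reverse-mode AD, not how Zygote, Diffractor, or Enzyme work (those transform an IR rather than recording a tape at run time), and the names `Var`, `backward`, `f`, and `value_and_grad` are made up for illustration. The point is only that the number of loop iterations in `f` depends on the input value, so there is no single fixed graph to compile ahead of time.

```python
class Var:
    """Scalar that remembers how it was computed (a tiny tape)."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # tuple of (parent Var, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))


def backward(out):
    """Propagate d(out)/d(node) to every node that out depends on."""
    order, seen = [], set()

    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)  # parents are appended before their children

    visit(out)
    out.grad = 1.0
    for node in reversed(order):  # children first, then their parents
        for parent, local in node.parents:
            parent.grad += local * node.grad


def f(x):
    # Data-dependent loop: the trip count depends on the value of x,
    # so f is x**2 for some inputs and x**5 for others.
    y = x
    while y.value < 100.0:
        y = y * x
    return y


def value_and_grad(x0):
    x = Var(x0)
    y = f(x)
    backward(y)
    return y.value, x.grad


print(value_and_grad(3.0))   # loop runs 4 times, so f(x) = x**5 here
print(value_and_grad(20.0))  # loop runs once, so f(x) = x**2 here
```

The tape handles the loop for free because it only ever sees the operations that actually ran, but it also re-records everything on every call, which is the performance cost a compiler-level approach on a real language IR is trying to avoid.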