by amkkma 1773 days ago
Some specific steps that will push Julia past JAX/PyTorch for chunky, array-heavy GPU code (it can already match or beat OpenBLAS/MKL for kernels written in scalar form).

1. Better compile-time memory management (https://github.com/aviatesk/EscapeAnalysis.jl)

2. Linalg passes built on a generic, composable compiler ecosystem: https://youtu.be/IlFVwabDh6Q?t=818

3. Metatheory.jl e-graph-based symbolic optimization interleaved with the abstract interpreter: https://github.com/0x0f0f0f/Metatheory.jl

4. Partial evaluation mixing concrete and abstract interpretation

5. Compiler-based automatic parallelization with Dagger.jl

6. New compiler-integrated AD (as a package) that isn't based on an accidental lispy compiler hack like Zygote: https://github.com/JuliaDiff/Diffractor.jl

7. Changes to array semantics, which will include generic immutability/ownership concepts.
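For anyone unfamiliar with item 4, here's a rough toy of what partial evaluation means, in Python rather than Julia so it stays self-contained. It has nothing to do with Julia's actual compiler internals: the tuple-based expression format and the `specialize` function are made up for illustration. The idea is just that anything computable from inputs known ahead of time gets folded, and only the rest is left as a residual program.

```python
# Hypothetical toy partial evaluator (illustration only, not Julia's machinery).
# Expressions are nested tuples:
#   ("const", 3), ("var", "x"), ("add", a, b), ("mul", a, b)
import operator

OPS = {"add": operator.add, "mul": operator.mul}

def specialize(expr, known):
    """Fold subexpressions whose inputs are all in `known`; keep the rest symbolic."""
    tag = expr[0]
    if tag == "const":
        return expr
    if tag == "var":
        name = expr[1]
        return ("const", known[name]) if name in known else expr
    left = specialize(expr[1], known)
    right = specialize(expr[2], known)
    if left[0] == "const" and right[0] == "const":
        # Both sides are concrete, so evaluate now instead of at run time.
        return ("const", OPS[tag](left[1], right[1]))
    return (tag, left, right)

# (n * 2 + 1) * x, with n known ahead of time and x only known at run time.
program = ("mul",
           ("add", ("mul", ("var", "n"), ("const", 2)), ("const", 1)),
           ("var", "x"))
residual = specialize(program, {"n": 4})
print(residual)  # ('mul', ('const', 9), ('var', 'x'))
```

The residual program only does the one multiplication that actually depends on `x`.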

And many more. The key is that all the initial groundwork, which favored fundamental flexibility over speed on specific workloads, will then feed back into making the ML use case faster than if Julia had focused on it from the start. People can do all kinds of crazy yet composable things in pure Julia, without modifying the base compiler.

Bonus: Being able to modify the type lattice to track custom program properties. This means you don't have to be locked into the global tradeoffs of a static type system, and can do things like opt in to tracking array shapes at compile time, per module: https://twitter.com/KenoFischer/status/1407810981338796035 Other packages, such as those for quantum computing, are planning to do their own analyses. It's generic, and the use cases and compositions aren't frozen at the outset (unlike, for example, the Swift "Tensors Fitting Perfectly" proposal).
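To give a flavor of what tracking shapes in a lattice looks like, here's a minimal Python sketch. It is an assumption-laden toy, not the mechanism from the linked tweet: `UNKNOWN`, `join`, and `matmul_shape` are hypothetical names, and the lattice has only two levels (a known shape tuple, or "unknown" on top).

```python
# Hypothetical shape lattice (illustration only):
# a tuple of ints is a known shape, UNKNOWN is the top element.
UNKNOWN = None

def join(a, b):
    """Least upper bound: keep the shape only if both control-flow branches agree."""
    return a if a == b else UNKNOWN

def matmul_shape(a, b):
    """Abstract transfer function for a 2-D matrix product."""
    if a is UNKNOWN or b is UNKNOWN:
        return UNKNOWN  # no shape information, so nothing can be checked
    if a[1] != b[0]:
        raise TypeError(f"shape mismatch: {a} @ {b}")
    return (a[0], b[1])

w = (128, 64)
x = (64, 32)
print(matmul_shape(w, x))          # (128, 32)
print(join((128, 32), (128, 32)))  # (128, 32)
print(join((128, 32), (64, 32)))   # None
```

The analysis runs over shapes alone, so a mismatch is reported before any array is ever allocated, and code without shape information simply falls back to `UNKNOWN` instead of being rejected.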