by catgary 503 days ago
Julia has really lost the differentiable programming mindshare to JAX. I’ve spent weeks or months getting tricky gradients to work in Julia, only to have everything “just work” in JAX. The quality of the autograd is night and day, and goes down to the basic design decisions of the respective “languages” (in the sense that JAX jit compiles a subset of Python) and their intermediate representations.

Fundamentally, when you keep a tight, purely functional core representation of your language (e.g. jaxprs) and decompose your autograd into two steps (forward mode and a compiler-level transpose operation), you get a system in which it is substantially easier to guarantee correct gradients, that is much more composable, and that even makes it easier to define custom gradients.
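To make the two-step decomposition concrete, here is a minimal sketch in JAX. The function `f` and the input values are hypothetical stand-ins chosen for illustration, not anything from a real project. It prints the jaxpr for `f`, linearizes `f` with forward mode, transposes the resulting linear map, and checks that the result agrees with the built-in reverse mode.

```python
import jax
import jax.numpy as jnp


def f(x):
    # Hypothetical example function R^3 -> R^3 (elementwise x * sin(x)).
    return jnp.sin(x) * x


x = jnp.array([0.5, 1.0, 2.0])

# The small, purely functional IR that JAX traces f into.
print(jax.make_jaxpr(f)(x))

# Step 1: forward mode. linearize returns f(x) and the linear map v -> J(x) @ v.
y, jvp_at_x = jax.linearize(f, x)

# Step 2: transpose that linear map to obtain u -> J(x)^T @ u.
# The second argument only supplies the shape and dtype of the input.
vjp_by_transpose = jax.linear_transpose(jvp_at_x, x)

u = jnp.ones_like(y)
(cotangent_via_transpose,) = vjp_by_transpose(u)

# Reference: the built-in reverse mode should give the same cotangent.
_, vjp_fn = jax.vjp(f, x)
(cotangent_via_vjp,) = vjp_fn(u)

print(cotangent_via_transpose)
print(cotangent_via_vjp)
```

The point of the sketch is that reverse mode does not need its own differentiation rules here: each primitive only needs a forward-mode rule and a transpose rule for its linear part.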

Unfortunately, Julia didn’t actually have any proper PLT or compilers people involved at the outset. This is the original sin I see as someone with an interest in autograd. I’m sure someone more focused on type theory has a more cogent criticism of their design decisions in that domain and would identify a different “original sin”.

In the end, I think they’ve made a nice MATLAB alternative but there’s a hard upper bound on what they can reach.


> Julia didn’t actually have any proper PLT or compilers people involved at the outset.

While I don't disagree that JAX currently outshines Julia's autodiff options in many ways, I think comments like this are (1) false, (2) rude, and (3) unnecessary to make your point.

Julia was a scientific computing language made by scientific computing experts. They did a great job on some things, but whiffed a few major decisions early on.

It's a general-purpose language made by experts in a myriad of subjects.

I’m sorry, but I’m going to disagree with you on that. Can you point to any of the language designers who had a background in programming language theory? The closest thing I see is Bezanson’s work on technical computing, which seems laser-focused on array programming. I don’t really see anything related to types or program transformations.

> whiffed a few major decisions early on

Anything particular in mind?

The always-on JIT was a big misstep (IMO, the opt-in TorchScript model is much better). I tried Julia a few times and it was just too slow to be usable for anything remotely exploratory. Every year or so, I'd read "TTFP has been improved", so I'd try again and it was still slow as molasses in Siberia. I suspect a lot of people had that experience and will be hard pressed to give Julia a real shot at this point, even if it does fix (or has fixed) the problem.
In general, I’d say there’s too much superficial flexibility but not enough control.

- I wrote this elsewhere: I find their approach to memory management/mutable arrays really hits the worst of both worlds (manual memory management and garbage collection). You end up trying to preallocate memory but don’t actually have control over memory allocations. I find the dynamic type system exacerbates this.

- It’s a very big language, even in the IR. So proper program transforms like mapping functions or autograd are quite difficult to implement.

- Static compilation is really hard, which makes Julia a non-starter for a lot of domains where it could have made inroads (robotics, games, etc).
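For contrast with the second point above, this is roughly what "mapping functions" and "autograd" look like as composable program transforms over a small IR. It is only a sketch in JAX: the `loss` function, `w`, and `xs` are made up for illustration. `jax.grad` turns `loss` into its gradient function, and `jax.vmap` maps that gradient over a batch without `loss` being rewritten.

```python
import jax
import jax.numpy as jnp


def loss(w, x):
    # Hypothetical scalar function of a weight vector and a single input.
    return jnp.tanh(jnp.dot(w, x)) ** 2


w = jnp.array([0.1, -0.2, 0.3])
xs = jnp.arange(12.0).reshape(4, 3) / 10.0  # a batch of 4 inputs

# grad: differentiate with respect to the first argument (w).
# vmap: map over the leading axis of xs while holding w fixed.
per_example_grad = jax.vmap(jax.grad(loss), in_axes=(None, 0))
grads = per_example_grad(w, xs)

# The composed transform is still just a jaxpr over the same primitives.
print(jax.make_jaxpr(per_example_grad)(w, xs))
print(grads.shape)
```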