by Kalanos 500 days ago
With some serious repositioning, I think there is still an opportunity for Julia to displace Python tools like polars/pandas/numpy, airflow, and pytorch -- with a unified ecosystem that makes it easy to transition to GPU and lead a differentiable programming revolution. They have the brain power to do it.

The future of Python's main open source data science ecosystem, numfocus, does not seem bright. Despite performance improvements, Python will always be a glue language. Python succeeds because the language and its tools are *EASY TO USE*. It has nothing to do with computer science sophistication or academic prowess - it humbly gets the job done and responds to feedback.

In comparison to mojo/max/modular, the julia community doesn't seem to be concerned with capturing share from python or picking off its use cases. That's the real problem. There is room for more than one winner here. However, have the people that wanted to give julia a shot already done so? I hope not because there is so much richness to their community under the hood.

2 comments

Julia has really lost the differentiable programming mindshare to JAX. I’ve spent weeks or months getting tricky gradients to work in Julia, only to have everything “just work” in JAX. The quality of the autograd is night and day, and goes down to the basic design decisions of the respective “languages” (in the sense that JAX jit compiles a subset of Python) and their intermediate representations.

Fundamentally, when you keep a tight, purely functional core representation of your language (e.g. jaxprs) and decompose your autograd into two steps (forward mode and a compiler-level transpose operation), you get a system that makes correct gradients substantially easier to guarantee, is much more composable, and even makes it easier to define custom gradients.
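To make the two-step decomposition concrete, here is a minimal sketch in JAX. The function `f` and the input value are made up for illustration; the only library calls used are `jax.make_jaxpr`, `jax.jvp`, `jax.linear_transpose`, and `jax.grad`. It is a toy demonstration of the idea, not a description of how JAX implements `grad` internally.

```python
import jax
import jax.numpy as jnp


def f(x):
    # Hypothetical example function; any differentiable function would do.
    return jnp.sin(x) * x


x = jnp.array(2.0)

# The small functional IR (a jaxpr) that f is traced into.
print(jax.make_jaxpr(f)(x))


# Step 1: forward mode. For a fixed x, the map from an input tangent t
# to the output tangent is linear in t.
def f_jvp_at_x(t):
    _, out_tangent = jax.jvp(f, (x,), (t,))
    return out_tangent


# Step 2: transpose that linear map. The result maps an output cotangent
# back to an input cotangent, which is reverse mode.
f_vjp_at_x = jax.linear_transpose(f_jvp_at_x, x)
(grad_via_transpose,) = f_vjp_at_x(jnp.ones_like(x))

# Reference value from the built-in reverse-mode transform.
grad_direct = jax.grad(f)(x)
```

Under these assumptions, `grad_via_transpose` and `grad_direct` should agree up to floating-point error, since both compute the derivative of `f` at `x`.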

Unfortunately, Julia didn’t actually have any proper PLT or compilers people involved at the outset. This is the original sin I see as someone with an interest in autograd. I’m sure someone more focused on type theory has a more cogent criticism of their design decisions in that domain and would identify a different “original sin”.

In the end, I think they’ve made a nice MATLAB alternative but there’s a hard upper bound on what they can reach.

> Julia didn’t actually have any proper PLT or compilers people involved at the outset.

While I don't disagree that currently JAX outshines Julia's autodiff options in many ways, I think comments like this are 1. false, 2. rude, and 3. unnecessary to make your point.

Julia was a scientific computing language made by scientific computing experts. They did a great job on some things, but whiffed a few major decisions early on.

It's a general-purpose language made by experts in a myriad of subjects.

I’m sorry, but I’m going to disagree with you on that. Can you point to any of the language designers who had a background in programming language theory? The closest thing I see is Bezanson’s work on technical computing, which seems laser-focused on array programming. I don’t really see anything related to types or program transformations.

> whiffed a few major decisions early on

Anything particular in mind?

The always-on JIT was a big misstep (IMO, the opt-in TorchScript model is much better). I tried Julia a few times and it was just too slow to be usable for anything remotely exploratory. Every year or so, I'd read "TTFP has been improved", so I'd try again and it was still slow as molasses in Siberia. I suspect a lot of people had that experience and will be hard pressed to give Julia a real shot at this point, even if it has since fixed the problem.

In general, I’d say there’s too much superficial flexibility but not enough control.

- I wrote this elsewhere: I find their approach to memory management/mutable arrays really hits the worst of both worlds (manual memory management and garbage collection). You end up trying to preallocate memory but don’t actually have control over memory allocations. I find the dynamic type system exacerbates this.

- It’s a very big language, even in the IR. So proper program transforms like mapping functions or autograd are quite difficult to implement.

- Static compilation is really hard, which makes it a non-starter for a lot of domains where it could have made inroads (robotics, games, etc).
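For readers unfamiliar with what "program transforms like mapping functions or autograd" look like when they compose, here is a small JAX sketch. The `loss` function, weights, and inputs are invented for illustration. It shows only the JAX side of the comparison and makes no claim about how hard the equivalent is in Julia.

```python
import jax
import jax.numpy as jnp


def loss(w, x):
    # Hypothetical per-example loss, written for a single input vector x.
    return jnp.sum(jnp.tanh(w * x) ** 2)


w = jnp.array([0.5, -1.0, 2.0])
xs = jnp.arange(12.0).reshape(4, 3)  # a batch of 4 example vectors

# Compose two transforms: differentiate with respect to w (autograd),
# then map the resulting function over the leading axis of xs.
per_example_grad = jax.vmap(jax.grad(loss), in_axes=(None, 0))
batched = per_example_grad(w, xs)

# The same quantity computed with an explicit Python loop, for comparison.
looped = jnp.stack([jax.grad(loss)(w, x) for x in xs])
```

Here `batched` has one gradient row per example and should match `looped` up to floating-point error; the point is that neither transform needed to know about the other.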

> The future of Python's main open source data science ecosystem, numfocus, does not seem bright. Despite performance improvements, Python will always be a glue language.

Your first sentence is a scorching hot take, but I don't see how it's justified by your second sentence.

The community always understood that Python is a glue language, which is why the bottleneck interfaces (with IO or between array types) are implemented in lower-level languages or ABIs. The former was originally C but is now often Rust, and Apache Arrow is a great example of the latter.
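As a rough illustration of a shared low-level interface between array types, here is a sketch using Python's buffer protocol as a stand-in. It does not use Apache Arrow; the stdlib `array` module and NumPy are chosen only because they make the zero-copy handoff easy to see.

```python
from array import array

import numpy as np

# A block of C doubles owned by the standard library's array type.
buf = array("d", [1.0, 2.0, 3.0])

# NumPy wraps the same memory through the buffer protocol; no copy is made.
view = np.frombuffer(buf, dtype=np.float64)

# A write through the NumPy view is visible through the original object,
# because both are just Python-level handles on one C-level buffer.
view[0] = 42.0
print(buf[0], view.sum())
```

Python itself only passes the handle around here; the memory layout contract lives below the language, which is the sense in which it acts as glue.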

The strength of using Python is that when you want to do anything beyond pure computation (e.g. networking), the rest of the world has already built a package for that.

So without the two-lang problem, I think all of these low-level optimization efforts across dataframes, tensors, and distributed computing would be part of a unified ecosystem based on shared compatibility.

For example, the reason why numfocus is so great is that everything was designed to work with numpy as its underlying data structure.