Hacker News
by civilized 1657 days ago
Awesome! One question immediately comes to mind. Any interest in doing this stuff with Julia? You're basically the epitome of their target audience: a scientific computing expert who does HPC with differentiable programs.
2 comments

A bit late to the party, but here are some reasons:

- When we started Veros (~4 years ago) Julia was very new on our radar and we didn't know whether it would stick. And to be frank, I'm still not convinced whether it will stick. Yes it seems like a fantastic language, but we all know how long it took Python to gain traction.

- Climate scientists and students already do their post-processing in Python. Having the whole stack in the same language makes things a lot easier for domain experts whose first priority is physics, not coding.

- Python skills translate better to other jobs, which I think is important for young academics.

- The Python library ecosystem is so good. Need to use PETSc? `import petsc4py`. Simplify postprocessing? Export your model state as an `xarray` dataset. Julia is great for bleeding-edge autodiff-through-everything stuff, but the bread-and-butter libraries are just so polished and battle-tested in Python.

- I don't know Julia :)

Those are very good reasons!
There's an earlier blog post by the same author where they discuss three possible ways of moving away from the Fortran/C status quo towards higher-level models. They mention Julia as one of the routes, but not the one they decided to take: https://dionhaefner.github.io/2021/04/higher-level-geophysic...
"On the other hand, Julia’s focus on scientific applications is both blessing and curse. In this day and age, a lot of the progress in computing is driven by applications outside academia (mostly through machine learning)." This seems like a crazy mis-read to me. Julia is probably the language that has the best integration of differential equations and machine learning. Jax closes the gap a little, but is still way behind.

For example, https://gist.github.com/ChrisRackauckas/62a063f23cccf3a55a4a... shows a pretty simple case where DifferentialEquations.jl is 6x faster at gradient calculations than Jax.

I was mostly referring to the millions (billions?) of dollars getting poured into Python library development by tech companies. With the effect that Python stays relevant and has a thriving library ecosystem. Maybe I'm wrong and Julia is just that good that it doesn't matter - I guess time will tell.
Jax is just a tool to generate XLA, which produces extremely high-performance computational graphs that can map to arbitrarily fast hardware, so I'm very skeptical of the utility of the conclusions of the link you provided (which seems to be comparing single-process CPU linear algebra?)
Single-threaded CPU linear algebra is the bottleneck of most small systems, so if you can't do that right, you are going to have problems. If you don't believe the benchmark, feel free to run it yourself.

That said, Jax also has bigger issues in its handling of higher derivatives. Currently, it only supports a few types of Jacobians, and the ones it is missing include all the sparse methods that can make your code orders of magnitude faster. https://jax.readthedocs.io/en/latest/notebooks/autodiff_cook.... DifferentialEquations.jl, on the other hand, can do automatic sparsity detection: https://diffeq.sciml.ai/stable/tutorials/advanced_ode_exampl....
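
To make the sparse-Jacobian point concrete, here is a minimal pure-Python sketch. It is not Jax or DifferentialEquations.jl code; the system `f` and the helper names `dense_jacobian` and `colored_jacobian` are hypothetical, and the tridiagonal sparsity pattern is assumed to be known rather than detected. The idea is column coloring: columns that never touch the same row can be perturbed in a single evaluation of `f`.

```python
def f(x):
    """Hypothetical tridiagonal system: out[i] depends on x[i-1], x[i], x[i+1]."""
    n = len(x)
    out = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        out.append(left - 2.0 * x[i] ** 2 + right)
    return out


def dense_jacobian(f, x, eps=1e-6):
    """Forward differences, one column per call: n extra evaluations of f."""
    n = len(x)
    f0 = f(x)
    calls = 1
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        xp = list(x)
        xp[j] += eps
        fj = f(xp)
        calls += 1
        for i in range(n):
            jac[i][j] = (fj[i] - f0[i]) / eps
    return jac, calls


def colored_jacobian(f, x, eps=1e-6):
    """Assumes a tridiagonal pattern: columns j and j+3 share no rows,
    so every column with the same j % 3 is perturbed in one call."""
    n = len(x)
    f0 = f(x)
    calls = 1
    jac = [[0.0] * n for _ in range(n)]
    for color in range(3):
        xp = list(x)
        for j in range(color, n, 3):
            xp[j] += eps
        fc = f(xp)
        calls += 1
        for j in range(color, n, 3):
            # Only rows j-1, j, j+1 can depend on column j.
            for i in range(max(0, j - 1), min(n, j + 2)):
                jac[i][j] = (fc[i] - f0[i]) / eps
    return jac, calls
```

With this pattern the colored version needs 3 perturbed evaluations regardless of `n`, while the dense version needs `n`. Real sparsity detection and coloring are more general than this hand-coded special case.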

This post is about Big Simulations, not small systems. Like, hundreds to thousands of cores with parameters that don't fit in RAM on a single machine.

I am sure the benchmark produces the numbers the author says, but it's not measuring something useful to the posters of this simulation.

XLA only optimizes quasi-static code, which does not include adaptive numerical solvers like those for ODEs. It's a generally good assumption for ML though, but there are ways to break it. I wrote a piece showcasing some ideas around that: https://www.stochasticlifestyle.com/useful-algorithms-that-a...
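
A rough pure-Python sketch of why an adaptive ODE solver is not quasi-static: the number of loop iterations and each step size depend on error estimates computed at runtime. The function `adaptive_heun`, its embedded Euler/Heun pair, and its simple step-size controller are illustrative assumptions, not how any particular library implements adaptivity.

```python
def adaptive_heun(f, t0, y0, t_end, tol=1e-6, h=0.1):
    """Integrate y' = f(t, y) for scalar y with an embedded Euler/Heun pair.

    Returns the final value plus counts of accepted and rejected steps.
    The loop's trip count is decided by the data, not known in advance.
    """
    t, y = t0, y0
    accepted = rejected = 0
    while t < t_end:
        h = min(h, t_end - t)              # do not step past the end
        k1 = f(t, y)
        k2 = f(t + h, y + h * k1)
        y_euler = y + h * k1               # first-order estimate
        y_heun = y + 0.5 * h * (k1 + k2)   # second-order estimate
        err = abs(y_heun - y_euler)        # local error estimate
        if err <= tol:
            t, y = t + h, y_heun           # accept the step
            accepted += 1
        else:
            rejected += 1                  # retry from the same t with smaller h
        # Grow or shrink h based on the error estimate, within safe bounds.
        factor = 0.9 * (tol / err) ** 0.5 if err > 0 else 2.0
        h *= min(2.0, max(0.2, factor))
    return y, accepted, rejected
```

Running this on `y' = -y` and on `y' = -50 * y` over the same interval gives different step counts, which is the kind of value-dependent control flow a static graph cannot fix ahead of time.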
IIUC people have already run MD (which is the field I used to work in) on XLA: https://twitter.com/sschoenholz/status/1334997741185814530. In these cases it's almost always better (unless you are a numerical genius) to port to the engine than to try to make a better algorithm that runs on a smaller engine.
Yes, that has nothing to do with what I just said though. Of course MD is fine because symplectic ODE solvers cannot generally have adaptivity (without tricky and very expensive handling of `t` inside of the Hamiltonian which nobody does because it's still an active research topic how to make it computationally viable). So MD gets a quasi-static code which XLA is fine with optimizing. I was explicitly talking about the non-quasi-static cases.
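
For illustration, a fixed-step velocity Verlet (leapfrog) loop is a typical quasi-static code: the trip count is known before the loop starts and every iteration does the same work. This is a minimal pure-Python sketch for a harmonic oscillator; the function names and the choice of Hamiltonian are assumptions made for the example, not taken from any MD package.

```python
def velocity_verlet(q, p, dt, n_steps, k=1.0, m=1.0):
    """Fixed-step velocity Verlet for H = p^2 / (2 m) + k q^2 / 2."""
    force = -k * q
    for _ in range(n_steps):       # trip count fixed before the loop starts
        p += 0.5 * dt * force      # half kick
        q += dt * p / m            # drift
        force = -k * q             # force at the new position
        p += 0.5 * dt * force      # second half kick
    return q, p


def energy(q, p, k=1.0, m=1.0):
    """Total energy of the harmonic oscillator."""
    return p * p / (2.0 * m) + 0.5 * k * q * q
```

Because the step size never changes, the energy error stays bounded over long runs, and the whole loop has a shape that a static-graph compiler can unroll or scan over.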
I'm also surprised that XLA.jl doesn't seem to have had continued development: https://github.com/FluxML/XLA.jl

When in doubt, piggybacking on (or at least interoperating with) what the large technology companies are investing in is probably savvy, sort of what the OP did.

XLA.jl was kind of a solution looking for a problem. If you want fast code in Julia, you can just write Julia.
That's incorrect. If you work with mid-sized neural networks and MCMC sampling, allocations start to play a significant role (and Flux.jl is bad at preallocation). Prealloc.jl does not work properly. Zygote.jl adds even more allocations to the mix...

Jax/XLA completely solves this problem. Yes, it's annoying that you have to work with a static graph but if your problem fits the description... it's great.

I read that as being about what language industry uses to write ML applications, not about technical feasibility of integrating machine learning methods into a codebase. Put differently: industry most often uses Python (especially in ML), therefore the author wants to target Python in order to maximize uptake outside of academia. They even admit that doing it in Python is technically harder than doing it in Julia ("Unfortunately, this type [Type III] is also the hardest to get right"), but consider it worth the trouble for the broader accessibility.

(That's more or less the direction I've been going with research code lately too, so I can sympathize, although I'm not entirely happy with the situation and definitely also sympathize with the Julia folks being unhappy about it.)

Somehow I don't think an ocean simulation needs to be in Python so some startup can use it to... what, sell ads or something?

Anyone interesting enough to be looking at your ocean simulation code can probably handle it being in Julia, and may even prefer it, since the language is so much better designed for this kind of thing than Python.

Hopefully, if enough people are unhappy about it && see a future in the alternative (i.e. critical mass), we can collectively have a "phase transition".
It may take a while, however. 15-20 years ago, you kind of had to use Python on the sly in the scientific setting vs the incumbents (MATLAB, C++, Fortran). Julia seems to be in a similar phase.

That being said, Python does have some structural advantages since it positions itself as a universal glue. It's much easier to gain a critical mass in that regard vs a niche area like scientific or numerical computing. Still, Julia is probably underrated in general-purpose usage.

I think Julia has a much better path to wide adoption for numerical computing/HPC. It is a much better language for package developers (you pretty much never have to go to a lower level language and everything can compose together with much less work). If you look at Julia and Python packages with similar functionality, the Julia one will typically be much more general and 1/10th the lines of code. This is a pretty powerful incentive for on-boarding package devs.
That's an old example. It will now default to Enzyme and should be quite a bit faster. I should update that.