by moelf 1657 days ago
Like some other commenters here, https://github.com/CliMA/Oceananigans.jl immediately comes to mind. It would be fun to compare projects on this scale between JAX and Julia.

> JAX offers more than just a JIT compiler: JAX functions are also differentiable

Only if the downstream library is implemented entirely in the JAX (Numba) ecosystem. The same holds for Julia, except that writing fast code in Julia is natural and doesn't involve debugging three compilers (CPython, Numba, JAX). Many Python libraries are differentiable only because 100x more effort was put into writing a C/C++ backend, binding it to Python, and writing chain rules for the foreign functions.

I would imagine Julia to be a good fit for this direction in the future!

3 comments

Oh, another thing: in the chaotic regime, which is the one of interest, standard automatic differentiation schemes don't even apply. You need shadow adjoints, because differentiating along the computed trajectory leads to inaccurate gradient calculations. Julia's system is the only one I know of that has shadow adjoints for differentiating ergodic properties.

https://frankschae.github.io/post/shadowing/

So unless the purpose is to only differentiate the simulator for short time periods or in the absence of chaos, I cannot see differentiation as a good justification because AD will not give a stable algorithm on that type of problem.
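
To make the instability concrete, here is a minimal sketch in plain Python (not JAX or Julia). It uses the logistic map as a toy stand-in for a chaotic simulator, which is an assumption for illustration only and has nothing to do with an ocean model. The function name `logistic_with_sensitivity` is hypothetical. It propagates the exact derivative of the final state with respect to the initial state by the chain rule, which is what forward-mode AD would compute, and shows that this derivative grows roughly exponentially with the number of steps. It does not implement shadowing.

```python
def logistic_with_sensitivity(x0, n_steps, r=4.0):
    """Iterate x <- r*x*(1-x) and carry d(x_n)/d(x_0) along by the chain rule.

    r = 4.0 puts the map in its chaotic regime (toy example, assumed here).
    Returns (x_n, d x_n / d x_0).
    """
    x, dx = x0, 1.0
    for _ in range(n_steps):
        # The derivative update must use x from *before* the state update.
        dx = r * (1.0 - 2.0 * x) * dx
        x = r * x * (1.0 - x)
    return x, dx


if __name__ == "__main__":
    for n in (5, 10, 20, 50):
        x_n, dx_n = logistic_with_sensitivity(0.2, n)
        print(f"steps={n:3d}  x_n={x_n:.6f}  |dx_n/dx_0|={abs(dx_n):.3e}")
```

The derivative is correct for each individual trajectory, but its magnitude blows up with the horizon, so a gradient of a long-time average computed this way tells you very little. That is the situation where shadowing-type methods are meant to help.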

> https://github.com/CliMA/Oceananigans.jl

Off-topic, but I have to say that is one of my favourite package names in Julia. (A more recent one is [MATDaemon](https://github.com/jondeuce/MATDaemon.jl))

The real problem with the JAX code is that the non-composable programming language setup put it into a corner where it's using an extremely inefficient time stepping method that it has "optimized". But how is it optimized if you're doing 100 times more function calls than you have to? Algorithms matter, and "optimizing Adams-Bashforth 2" is a pretty silly idea.
I agree with your point regarding non-composability and "algorithm lock-in" (which may or may not be solvable with better abstractions), but explicit time stepping schemes are still the main workhorse of global ocean modelling, so I'm not sure whether "silly" is the right label here.
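
For readers who haven't met it, here is a rough sketch of fixed-step Adams-Bashforth 2 in plain Python, with a counter for right-hand-side evaluations. The function name `ab2_integrate`, the forward Euler bootstrap step, and the test problem are all assumptions made for illustration; this is not the code from the project being discussed.

```python
import math


def ab2_integrate(f, y0, t0, t1, n_steps):
    """Integrate dy/dt = f(t, y) from t0 to t1 with fixed-step AB2.

    Returns (y at t1, number of calls to f).
    """
    h = (t1 - t0) / n_steps
    calls = 0

    f_prev = f(t0, y0)
    calls += 1
    # AB2 needs two past slopes, so bootstrap with one forward Euler step.
    y = y0 + h * f_prev
    t = t0 + h

    for _ in range(n_steps - 1):
        f_curr = f(t, y)
        calls += 1
        # AB2 update: y_{n+1} = y_n + h * (3/2 f_n - 1/2 f_{n-1})
        y = y + h * (1.5 * f_curr - 0.5 * f_prev)
        f_prev = f_curr
        t += h

    return y, calls


if __name__ == "__main__":
    y, calls = ab2_integrate(lambda t, y: -y, 1.0, 0.0, 1.0, 1000)
    print(f"y(1) = {y:.8f}, exact = {math.exp(-1.0):.8f}, f calls = {calls}")
```

Each step costs one call to `f`, so the total cost is set by how many steps the method's accuracy and stability limits force on you, not by how fast a single step runs.
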
Why are explicit time stepping schemes the main tool used? Is it because the languages that these models are written in aren't flexible enough, or is there a math reason why dynamic time-stepping isn't better?
Climate models are vastly complex, and you need to bring together many experts from many disciplines to write and maintain one, and analyze the output. This seems to lead to the simplest methods coming out on top. Perhaps it could be solved with better abstractions (a lot of very smart people are trying).
That's precisely what composability solves. We're seeing in CLIMA that using more general, highly optimized solvers can decrease the number of `f` calls far more than focusing on really low-level optimizations. Especially in things like the land model, where you can have many stability issues (such as large complex eigenvalues, which happen to work very poorly with multistep methods, even BDF), the ability to split the development of the time stepping across a huge community of hundreds of developers without losing performance means that more optimal methods for a domain arise and are found. Yes, the standard is to use something simpler. No, it's not even close to optimal, and that is being made very clear.
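
As a small illustration of the eigenvalue point, the sketch below checks the linear stability of Adams-Bashforth 2 on the test equation y' = λy. Substituting y_n = ζ^n into the AB2 update gives the characteristic polynomial ζ² − (1 + 1.5z)ζ + 0.5z = 0 with z = hλ, and the step is stable only if both roots satisfy |ζ| ≤ 1. The function name `ab2_spectral_radius` and the sample values of z are assumptions chosen for illustration. It covers AB2 only and says nothing about BDF or about any particular climate model.

```python
import cmath


def ab2_spectral_radius(z):
    """Largest root magnitude of the AB2 characteristic polynomial.

    z = h * lambda for the test equation y' = lambda * y.
    Polynomial: zeta^2 - (1 + 1.5*z) * zeta + 0.5*z = 0.
    """
    b = 1.0 + 1.5 * z
    c = 0.5 * z
    disc = cmath.sqrt(b * b - 4.0 * c)
    return max(abs((b + disc) / 2.0), abs((b - disc) / 2.0))


if __name__ == "__main__":
    # Real decaying mode, purely oscillatory mode, and a too-large real step.
    for z in (-0.5, 0.5j, -1.5):
        rho = ab2_spectral_radius(z)
        verdict = "stable" if rho <= 1.0 else "unstable"
        print(f"z = {z!s:>6}  spectral radius = {rho:.4f}  ({verdict})")
```

With the same |z|, the real decaying mode is damped while the purely imaginary one is amplified, so oscillatory modes push an explicit multistep method toward much smaller steps, and therefore more `f` calls.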