Hacker News new | ask | show | jobs
by MiroF 2090 days ago
I'm surprised re: blas - that is closer than I thought! Still a huge gap on the GPU, but still impressive.

> in general JIT languages have more opportunities for optimizations than AoT languages

I'm not sure I agree with this. I would say the opportunities for any non-GC mature AoT language (C++, Rust, etc.) is going to be pretty much the same, since you can just attach a JIT to most AoT langs.

2 comments

The GPU gap is only if written in the high level index or loop style. There is little to no gap if done either using array abstractions (broadcast, map etc) or at a level similar to Cuda C (though with nicer Julia abstractions and syntax): https://juliagpu.org/cuda/

The Julialab at MIT is working on making the higher level codegen faster

I guess that makes sense to me.. you can just automatically convert the C in BLAS to Julia and then if they're both being converted to llvm ir by clang anyways than i guess it'll be about as fast!
That's not at all what Julia is doing. It's much more sophisticated in that it has very low level intrinsic primitives that can compose and it optimizes the IR to make it fast and then compiles it to CUDA. These all map to Julia constructs.
Sure, if you give static language a JIT they'll be able to get the advantages of having JIT, though language semantics still matter. A language built for JITs like Julia or Common Lisp have native ways of interfacing with the compiler, and programs are built without worry of exponential explosion of implementations during method monomorphization (as you'll only compile the optimal versions that you'll actually use, based on runtime information, without having to be pessimist as any overspecialization can be fixed on demand). AoT languages would probably need a compiler pragma or type similar to a dynamic boxing but for delayed monomorphization/compilation for methods when you want to avoid compiling all paths AoT (which might be a way to allow for example tensor specialization on sizes, similar to StaticArrays on Julia).
I don't quite follow.

I am not too experienced with Julia, but my understanding was that it uses LLVM to jit itself. Since the LLVM jit compiler is also an API available to C++, anything that can be done in Julia can be done with jit to LLVM api in C++.

Then you just compile the methods that you'll actually use with LLVM right before using them.

Sorry, you're right in that Julia is written in C/C++, so everything Julia does can be solved in those by writing a language (like Julia itself, and not unlike Tensorflow original interface) and compiling it on demand and finding a way to eval the new code and recover the results. I was talking along the ways of how to make it sort of convenient (at least viable to implent unlike the former), as an extension to the C++ compiler itself where you can just tell the compiler what stays AoT and what is JIT'd but otherwise keep the same C++ syntax.

Not to mention if you want to reimplement Julia's logic in C++ you'll have to develop it's sophisticated type inference, since Julia compiler is so aggressive that it will compile at once entire blocks of program (the entire program if it can) as long as it can infer what types are used downstream, which is why it can compete with AoT compiled languages (it's basicaly a "Just Ahead of Time Compiler")

The crux is that which are "the methods that you'll actually use" is very difficult to answer. A lot of effort is put into this on the Julia "Package compiler" and "Compiler" projects.