It's a bit more nuanced than that. It's "as fast" without having to write any C.
I tried to recreate something like AlphaGo in Python using Keras. I never got the learning to work (probably because I was impatient and training on a laptop CPU), but a lot of the CPU time was simply being spent on manipulating the board state.
So I ported my "Board" object to Rust, and operations like counting liberties or removing dead stones became a lot faster, which was important.
Then I rewrote the whole thing in Julia and it was just as fast as my Python / Rust combo.
So I saw for myself that Julia does solve the two-language problem. It is as pleasant to write as Python (I actually like it better), and it performed as well as Rust in my informal benchmarks.
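For a sense of what "manipulating the board state" means here, this is a minimal sketch of counting liberties with a flood fill. It is not the code from the project described above; the board encoding (0 = empty, 1 = black, 2 = white) and the name `count_liberties` are assumptions made for illustration.

```julia
# Hypothetical board encoding: 0 = empty, 1 = black, 2 = white.
# Counts the distinct empty points adjacent to the group containing (row, col).
function count_liberties(board::Matrix{Int8}, row::Int, col::Int)
    color = board[row, col]
    color == 0 && return 0              # an empty point has no group
    visited = falses(size(board))       # stones of the group already explored
    liberties = falses(size(board))     # empty neighbours, marked once each
    stack = [(row, col)]
    visited[row, col] = true
    while !isempty(stack)
        r, c = pop!(stack)
        for (dr, dc) in ((-1, 0), (1, 0), (0, -1), (0, 1))
            nr, nc = r + dr, c + dc
            checkbounds(Bool, board, nr, nc) || continue   # skip off-board points
            if board[nr, nc] == 0
                liberties[nr, nc] = true
            elseif board[nr, nc] == color && !visited[nr, nc]
                visited[nr, nc] = true
                push!(stack, (nr, nc))
            end
        end
    end
    return count(liberties)
end

board = zeros(Int8, 5, 5)
board[3, 3] = 1
board[3, 4] = 1
count_liberties(board, 3, 3)   # 6: a two-stone group in the open
```

Loops like this one run once per move per candidate position, which is why they tend to dominate CPU time when written in plain Python.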
Since neither of the others mentioned it: the nuance is the type system plus multimethods. It's a gradually typed system that fully specializes code where it can (aided by the expressive power of multimethods). Hence, with some careful (or overkill) placement of types and multimethods, it's easy to get large performance boosts from minor edits to one's code, rather than porting the whole thing to C, which is the Python strategy. But the first pass can still look like Python code, with no type annotations at all (and sometimes you will get lucky, or be very smart, and it will be fast through specialization without extra work).
As a brief demonstration I can write:
foo(a, b, c) = (a + b) * c
When I call it on integers, it emits only the necessary integer assembly; when I call it on floats, only the necessary float assembly; and when I broadcast it across vectors, it emits SSE assembly. It's only when it can't prove the incoming types that it emits any sort of dynamically typed code. The calling function can be ignorant of the types too, and so on up the call chain, until a user decides to pass in an integer or a float, at which point all of the code is specialized to be as fast as possible.
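To see this for yourself, here is a small sketch that calls `foo` with different argument types and captures the LLVM IR of two specializations using `code_llvm` from the `InteractiveUtils` standard library. The variable names `int_ir` and `float_ir` are just for illustration, and the exact IR text depends on your Julia version and CPU.

```julia
using InteractiveUtils  # provides code_llvm; loaded automatically in the REPL

foo(a, b, c) = (a + b) * c

# Each distinct combination of argument types gets its own compiled specialization.
foo(1, 2, 3)                                # 9, integer arithmetic
foo(1.0, 2.0, 3.0)                          # 9.0, floating-point arithmetic
foo.([1.0, 2.0], [3.0, 4.0], [5.0, 6.0])    # [20.0, 36.0], broadcast elementwise

# Capture the generated IR as strings so the two specializations can be compared.
int_ir = sprint(io -> code_llvm(io, foo, (Int, Int, Int)))
float_ir = sprint(io -> code_llvm(io, foo, (Float64, Float64, Float64)))

println(int_ir)     # integer add and mul instructions
println(float_ir)   # floating-point fadd and fmul instructions
```

In the REPL, `@code_llvm foo(1, 2, 3)` and `@code_native foo(1, 2, 3)` show the same thing interactively.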
I wouldn’t say that Julia is gradually typed in the same way that Cython or Numba are. In my experience you usually improve performance by ensuring that your functions handle different types generically. One example of that is making sure the compiler can infer the types of all your variables to something more specialized than the Any type. Another example is being careful to avoid accidental type changes, e.g. by summing a Float32 with a Float64 literal.
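A minimal sketch of both pitfalls, with made-up function names (`sum_unstable`, `sum_stable`) purely for illustration:

```julia
# Unstable: `acc` starts as an Int and may become a Float64 inside the loop,
# so the compiler has to account for both types.
function sum_unstable(xs)
    acc = 0
    for x in xs
        acc += x
    end
    return acc
end

# Stable: the accumulator starts out with the element type of the input.
function sum_stable(xs)
    acc = zero(eltype(xs))
    for x in xs
        acc += x
    end
    return acc
end

typeof(sum_unstable(Float64[]))   # Int64: the return type depends on the data
typeof(sum_stable(Float64[]))     # Float64: the return type depends only on the input type

# Accidental promotion: a Float64 literal widens Float32 arithmetic.
x = 1.5f0
widened = x + 2.0      # Float64
kept = x + 2.0f0       # Float32
```

Running `@code_warntype sum_unstable([1.0, 2.0])` in the REPL highlights the non-concrete type of `acc`, which is a quick way to spot this kind of issue.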
As I’ve learned the language it’s become pretty easy to avoid those pitfalls even on initial implementations. That said, providing types in function signatures is still very useful for multiple dispatch and providing a more usable API in libraries.
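As a rough illustration of types in signatures being about dispatch and API clarity rather than speed, here is a sketch with a hypothetical `scale` function (not from any package) that has three methods:

```julia
# Hypothetical API: the annotations choose which method runs; they are not
# what makes the compiled code fast.
scale(x::Number, k::Number) = x * k
scale(xs::AbstractVector, k::Number) = [scale(x, k) for x in xs]
scale(s::AbstractString, k::Integer) = repeat(s, k)

scale(2, 1.5)       # 3.0
scale([1, 2], 3)    # [3, 6]
scale("ab", 2)      # "abab"

# scale("ab", 1.5) throws a MethodError, because no method accepts a string
# together with a non-integer factor.
```

The MethodError at the call site is arguably the "more usable API" part: a caller finds out about a bad argument immediately instead of hitting an obscure failure deeper inside the function.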
The nuance is that someone who mainly just calls functions from packages probably won’t notice any real speed difference, since performance-sensitive packages in Python and R are typically written in C or C++. Additionally, there are various tools like Numba for accelerating Python code that will make certain restricted subsets of Python just as fast as (or sometimes faster than) Julia.
However, as soon as you try to do something a bit more complicated, you’ll notice the speed and flexibility differences.
I dunno about this. At work, we have an exhaustive model fitting procedure that takes a looooonnnng time.
I prototyped a quick Julia implementation of a simple GLM (almost identical code in Julia and R), and the Julia code was approximately 10-20 times faster depending on the model.
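The comment doesn't say which GLM was fitted, so purely as a sketch, assuming a logistic regression fitted by Newton's method (which is equivalent to IRLS for this model), hand-written Julia for it might look something like this. `fit_logistic` is a hypothetical name, not the commenter's code.

```julia
using LinearAlgebra  # for norm

# Inverse link for logistic regression.
logistic(z) = 1 / (1 + exp(-z))

# Hypothetical helper: fit coefficients by Newton's method.
function fit_logistic(X::AbstractMatrix, y::AbstractVector; iters=25, tol=1e-8)
    beta = zeros(size(X, 2))
    for _ in 1:iters
        mu = logistic.(X * beta)     # fitted probabilities
        w = mu .* (1 .- mu)          # variance weights, one per observation
        # Newton step: solve (X' W X) delta = X' (y - mu)
        delta = (X' * (w .* X)) \ (X' * (y .- mu))
        beta += delta
        norm(delta) < tol && break
    end
    return beta
end

# Intercept-only check: three successes out of four gives log-odds of log(3).
X = ones(4, 1)
y = [1, 1, 1, 0]
beta = fit_logistic(X, y)
```

The loop is written the same way one would write it in R, which fits the "almost identical code" remark, but nothing here reproduces the 10-20x figure; that would need the original models and data.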
This is definitely worth looking at (mind you, the cost of redeveloping our code in Julia is probably prohibitive). That being said, this would encourage me to call out to Julia from R for some of my more computationally heavy workloads.
Not making any claims about Julia, but “gets compiled native via LLVM” doesn’t imply “is about as fast as other natively compiled languages”.
For example, a straightforward Python-to-LLVM compiler would generate code with every variable being a PyObject (https://docs.python.org/3/c-api/structures.html) instance, plus “switch(obj.ob_type)” equivalents, and it would take a “sufficiently advanced compiler” to get that to the same speed as, say, C.
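To make that concrete without claiming anything about how a real Python compiler works, here is a rough Julia sketch of the difference. `Boxed`, `add_boxed` and `add_native` are invented names; `Boxed` only loosely mimics a PyObject by hiding the payload type until run time.

```julia
# A tagged value, loosely analogous to a PyObject: the payload type is only
# known at run time.
struct Boxed
    value::Any
end

# Naive addition: inspect the runtime type on every operation, which is the
# hand-written equivalent of a switch on the object's type tag.
function add_boxed(a::Boxed, b::Boxed)
    x, y = a.value, b.value
    if x isa Int && y isa Int
        return Boxed(x + y)
    elseif x isa Float64 && y isa Float64
        return Boxed(x + y)
    else
        error("unsupported operand types")
    end
end

# Native addition: the types are known at compile time, so this can compile
# down to a single floating-point add with no checks.
add_native(a::Float64, b::Float64) = a + b

add_boxed(Boxed(1), Boxed(2)).value        # 3
add_boxed(Boxed(1.5), Boxed(2.0)).value    # 3.5
add_native(1.5, 2.0)                       # 3.5
```

Both versions go through LLVM, but the boxed one still pays for the type checks and for allocating a result wrapper on every call unless the compiler can prove them away.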