by codebje
32 days ago
The article talks about inlining a two-arity call to clojure.core/max into an explicit call to cpp/jank.runtime.max, which eliminates the unnecessary argument-count matching and recursion portions of the Clojure function. It also mentions that Clang will itself inline the runtime max function, so that's something LLVM ("the LLVM project", anyway) is still doing. Beyond that, as written this IR is likely to leave behind plenty of opportunities for LLVM to do the things it's good at: DCE, load/store optimisation, constant propagation, etc. And register allocation.
The jank::runtime::max call is itself complex: it has to type check its arguments and work out what to actually do based on the two types. If parts of these tests are done before the inlined call to max, there's a fair chance that LLVM will be able to eliminate their repetition and slim it all down a long way. In the Fibonacci example, the fact that a previous test will likely have identified whether the argument is an int or something else should hopefully carry over to ::lte, ::sub, and ::add and simplify those down to just the single operator call. Sadly, I suspect it won't, at least for the addition, because the recursive call loses the information that the return value is always a tagged integer when the function is called with a tagged integer.
A future optimisation might be to specialise for unboxed types: far more potential speed improvement over pointer tagging, and IMO quite amenable to analysis with the Jank IR (:metadata tag functions as specialised for <type> with the new entry point; if a function only calls specialised functions (and itself) it too can be specialised; and a heuristic to determine if specialisation gains enough to sacrifice space for it).
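To make the repeated type-check point concrete, here is a minimal C++ sketch. Everything in it is hypothetical: the `value` layout, `is_int`, `generic_max`, `slow_max` and `at_least_zero` are illustrative names, not jank's actual runtime, and it assumes a 64-bit target with the low bit used as an integer tag.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tagged value (not jank's real representation):
// low bit 1 = integer stored in the upper 63 bits, low bit 0 = boxed object pointer.
struct value { std::uintptr_t bits; };

inline bool is_int(value v) { return (v.bits & 1u) != 0; }

inline value make_int(std::int64_t i) {
  return value{ (static_cast<std::uintptr_t>(i) << 1) | 1u };
}

// Relies on arithmetic right shift for negative values (guaranteed since C++20).
inline std::int64_t as_int(value v) {
  return static_cast<std::int64_t>(v.bits) >> 1;
}

// Placeholder for the boxed path (floats, ratios, big integers, ...).
inline value slow_max(value a, value /*b*/) { return a; }

// Generic max: checks both tags, then dispatches on them.
inline value generic_max(value a, value b) {
  if (is_int(a) && is_int(b)) {
    return as_int(a) < as_int(b) ? b : a;
  }
  return slow_max(a, b);
}

// A caller that has already tested its argument before calling max.
inline value at_least_zero(value n) {
  if (!is_int(n)) {
    return slow_max(n, make_int(0));
  }
  // If generic_max is inlined here, both is_int checks are already known to
  // be true on this path, so an optimiser is free to remove them.
  return generic_max(n, make_int(0));
}

// Small self-check of the sketch.
inline void self_check() {
  assert(as_int(make_int(-5)) == -5);
  assert(as_int(at_least_zero(make_int(-4))) == 0);
}
```

Whether a given compiler actually removes the second check depends on inlining decisions and optimisation level; the sketch only shows why the information is available on the fast path and why it is lost across an opaque recursive call that returns a plain `value`.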
> A future optimisation might be to specialise for unboxed types: far more potential speed improvement over pointer tagging, and IMO quite amenable to analysis with the Jank IR
All of these math functions are templates with four specific categories:
1. Object and object
2. Primitive and primitive
3. Primitive and object
4. Object and primitive
We handle the difference between typed objects (like integer_ref) and type-erased objects (object_ref) as well. This template then gets inlined, which is exactly what the last step of the benchmark optimizations (adding annotations) ensured. The return type of these functions will prefer primitive types, rather than automatically boxing. jank's analyzer tracks all types used, at compile time, and supports automatic boxing. This means that we're already using the most efficient primitive math whenever we can, and that it will indeed inline to just an operator call when working on two primitives, or two typed objects, or a combination thereof.
You can see the code for this here: https://github.com/jank-lang/jank/blob/29c2adb344526d26c8e82...
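As a rough illustration of the four categories (this is not the linked jank code), here is a minimal C++ sketch. `boxed_integer`, `unbox` and `add` are hypothetical stand-ins, and type-erased objects such as object_ref are not modelled at all.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for a typed, boxed integer object.
struct boxed_integer { std::int64_t data; };

// Unwrap an operand: primitives pass through, typed objects expose their data.
template <typename T>
constexpr auto unbox(T const &v) {
  if constexpr (std::is_arithmetic_v<T>) {
    return v;
  } else {
    return v.data;
  }
}

// One template covers all four categories:
//   1. object and object        add(boxed_integer, boxed_integer)
//   2. primitive and primitive  add(int64_t, int64_t)
//   3. primitive and object     add(int64_t, boxed_integer)
//   4. object and primitive     add(boxed_integer, int64_t)
// The result is always a primitive; boxing is left to the caller.
template <typename L, typename R>
constexpr auto add(L const &l, R const &r) {
  return unbox(l) + unbox(r);
}

// The return type stays primitive even when both inputs are boxed.
static_assert(std::is_same_v<decltype(add(boxed_integer{1}, boxed_integer{2})),
                             std::int64_t>);
static_assert(add(std::int64_t{40}, boxed_integer{2}) == 42);
```

Because the operand types are known at compile time, each instantiation reduces to a field load (if any) plus a single `+`, which is the shape that inlines down to one operator call.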