by Blaisorblade0
3643 days ago
Are you the OP author, or working on Pyston?
I have basically two questions/curiosities — I'm not asking adversarially:
1) For which code is the C runtime most expensive? Typical Python code tries to leave the heavy lifting to libraries, but what if you write your inner loop in Python? Enabling that is (arguably) one goal of JIT compilation, so that you don't need to write code in C.
2) What about using Python ports of performance-sensitive libraries? In more detail:
I arrived at https://lwn.net/Articles/691243/, but I'm not sure I'm convinced. Or rather: with a JIT compiler you probably want to rewrite (parts of) the C runtime code in Python so you can JIT it with the rest (PyPy has already replaced C code in their implementation, so maybe there's work to reuse). For instance, an optimizing compiler should ideally remove abstractions from here:
  import itertools
  sum(itertools.repeat(1.0, 100000000))
Optimizing that code is not so easy, especially if that involves inlining C code (I wouldn't try, if possible), but an easier step is to optimize the same code written as a plain while loop. Does Pyston achieve that? I guess the question applies to the LLVM-based tier, not otherwise.
Yes, Python semantics allow for lots of introspection, and that's expensive, but so did Smalltalk to a large extent. Yet people managed, for instance, to not allocate stack frames on the heap unless needed (I'm pointing vaguely in the direction of JIT compilers for Smalltalk and Self, though by now I've forgotten those details).
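To make the comparison concrete, here is a minimal sketch of the itertools one-liner next to the plain while loop it is contrasted with. The function names are made up for illustration, and n is much smaller than the 100000000 in the comment so that it runs quickly. It only shows that the two forms compute the same value; it says nothing about how Pyston or any other JIT compiles them.

```python
import itertools


def sum_with_itertools(n: int) -> float:
    # Library version: in CPython both sum() and itertools.repeat() are
    # implemented in C, so a JIT would have to see through that C code.
    return sum(itertools.repeat(1.0, n))


def sum_with_while_loop(n: int) -> float:
    # The same computation written as a plain while loop, which runs
    # entirely as Python bytecode and is an easier target to optimize.
    total = 0.0
    i = 0
    while i < n:
        total += 1.0
        i += 1
    return total


if __name__ == "__main__":
    n = 1_000_000  # illustrative size, smaller than in the comment
    print(sum_with_itertools(n), sum_with_while_loop(n))
```

Timing the two functions (for example with the timeit module) on different interpreters is one way to see where the C runtime helps and where it gets in the way.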
|
Inlining through C code probably isn't really an option[0], but the optimisation itself shouldn't be much of an issue. The Rust version and the equivalent imperative loop compile to the exact same code: https://godbolt.org/g/OJHIwc[1]
[0] unless you interpret (and can JIT) the C code with the same underlying machinery, as Truffle does
[1] used iter_arith for sum(), but you can replace sum() by an explicit fold for no difference: https://godbolt.org/g/R1BgQQ
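For readers following along in Python rather than Rust, the "sum() versus explicit fold" substitution from [1] can be sketched as below. This is only a Python analogue with hypothetical function names, not the code behind the godbolt links, and it shows only that the results agree, not what any compiler generates.

```python
import functools
import itertools
import operator


def sum_builtin(n: int) -> float:
    # Built-in sum with a float start value.
    return sum(itertools.repeat(1.0, n), 0.0)


def sum_as_fold(n: int) -> float:
    # Explicit left fold: start at 0.0 and combine each element with +.
    return functools.reduce(operator.add, itertools.repeat(1.0, n), 0.0)
```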