by Blaisorblade0 3643 days ago
Are you the OP author, or working on Pyston? I have basically two questions/curiosities (I'm not asking adversarially): 1) For which code is the C runtime most expensive? Typical Python code tries to leave the heavy lifting to libraries, but what if you write your inner loop in Python? Enabling that is (arguably) one goal of JIT compilation, so that you don't need to write code in C. 2) What about using Python ports of performance-sensitive libraries?

In more detail: I arrived at https://lwn.net/Articles/691243/, but I'm not sure I'm convinced. Or rather: with a JIT compiler you probably want to rewrite (parts of) C runtime code into Python so you can JIT it with the rest (PyPy has already replaced C code in their implementation, so maybe there's work to reuse). For instance, an optimizing compiler should ideally remove abstractions from here:

  import itertools
  sum(itertools.repeat(1.0, 100000000))
Optimizing that code is not so easy, especially if that involves inlining C code (I wouldn't try, if possible), but an easier step is to optimize the same code written as a plain while loop. Does Pyston achieve that? I guess the question applies to the LLVM-based tier, not otherwise.
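
For concreteness, here is a minimal sketch of the two variants being compared: the itertools version and the same computation as a plain while loop. The function names and the parameter n are illustrative choices of mine, not anything taken from Pyston or the linked article.

```python
import itertools


def sum_with_itertools(n):
    # Abstraction-heavy version: both sum() and itertools.repeat()
    # are implemented in C in CPython, so a JIT sees opaque calls.
    return sum(itertools.repeat(1.0, n))


def sum_with_while(n):
    # Same computation as a plain while loop: every operation is
    # visible as Python bytecode, with no C iterator in between.
    total = 0.0
    i = 0
    while i < n:
        total += 1.0
        i += 1
    return total
```

Both functions return the same float for a given n. The second is the form a JIT can reason about without looking inside C code.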

Yes, Python semantics allow for lots of introspection, and that's expensive — but so did Smalltalk to a large extent. Yet people managed, for instance, to not allocate stack frames on the heap unless needed (I'm pointing vaguely in the direction of JIT compilers for Smalltalk and Self, though by now I forgot those details).
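
As an illustration of the kind of introspection meant here, the sketch below reads a caller's local variables through its frame object. It assumes CPython, where sys._getframe is available as an implementation detail, and the helper names are hypothetical.

```python
import sys


def caller_locals():
    # Assumption: CPython. sys._getframe(1) returns the frame of the
    # function that called us; f_locals exposes its local variables.
    frame = sys._getframe(1)
    return dict(frame.f_locals)


def compute():
    x = 41
    y = x + 1
    # Any callee may inspect this frame, so an implementation must be
    # able to materialize it on demand even if it normally avoids
    # allocating it on the heap.
    return caller_locals()
```

Calling compute() returns a dict with the values of x and y. Features like this are why a compiler cannot simply drop frames without a fallback.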

3 comments

> Optimizing that code is not so easy, especially if that involves inlining C code (I wouldn't try, if possible)

Inlining through C code probably isn't really an option[0], but the optimisation itself shouldn't be much of an issue. The Rust version and the equivalent imperative loop compile to exactly the same code: https://godbolt.org/g/OJHIwc [1]

[0] unless you interpret — and can JIT — the C code with the same underlying machinery as Truffle does

[1] I used iter_arith for sum(), but replacing sum() with an explicit fold makes no difference: https://godbolt.org/g/R1BgQQ

> [0] unless you interpret — and can JIT — the C code with the same underlying machinery as Truffle does

Pyston can easily JIT the C code because it uses LLVM for its main JIT tier.

Nope, I have nothing to do with the blog or Pyston. Sorry if I gave that impression.
You may be interested in an optimizing Python compiler called Pythran.[1][2]

[1] - https://github.com/serge-sans-paille/pythran

[2] - https://www.youtube.com/watch?v=Af8B30mXZ7E