Hacker News new | ask | show | jobs
by klmr 3778 days ago
Python and similar dynamic languages suffer from the fact that every name access (variable, function, etc) incurs a dynamic lookup of that name in a (nested) dictionary. Statically compiled languages don’t have this. There are fairly recent, clever optimisations that can avoid many of these lookups but (a) they are not implemented in any of the common implementations of Python, R, etc (JavaScript has them though). But even with these optimisations in place we cannot get rid of such lookups altogether, and they kill cache locality and branch prediction.

There are other reasons for slowdown (automatically managed garbage collection is a big one, and so is any kind of indirection, e.g. callbacks). But usually the big one is name lookup.

3 comments

As a compiler writer, I can tell you that in JS, local variable lookups do not incur any kind of dynamic overhead. The performance of modern JS engines is much closer to C than you might think. Dynamic language optimization is also not so recent. Most of the techniques implemented by modern JS engines were invented for the Smalltalk and Self projects. See this paper from 1991, for example: http://bibliography.selflanguage.org/_static/implementation....

Python is just inexcusably non-optimized. It's a bytecode interpreter, with each instruction requiring dynamic dispatch. Integers are represented using actual objects, with pointer indirection. The most naive, non-optimizing JIT implementation might get you a 10x speedup over CPython. I think that eventually, as better-optimised dynamic languages gain popularity, people will come to accept that there is no excuse for dynamic language implementations to perform this poorly.

I haven’t followed recent development of JavaScript all that closely so my knowledge is somewhat outdated. However, the optimisations that make JS performance close to C in some cases are really recent. Some of the tricks are old, such as the paper you cited. But these tricks only go so far, and in particular even modern GCs simply work badly in memory-constrained environments, which puts a hard upper limit on the amount of memory that JavaScript can handle efficiently. One of the better articles on this subject is [1].

That said, my comment already mentioned that local variable lookup isn’t a problem in JavaScript. It is in R, however; see my example in [2]. Beyond that, both R and Python execution have obvious optimisation potential, which is made hard by the fact that existing libraries rely extensively on implementation details of the current interpreters.

[1] http://sealedabstract.com/rants/why-mobile-web-apps-are-slow...

[2] https://news.ycombinator.com/item?id=11117070

The lookup thing only happens during compilation to byte code or intermediate code, I believe. Once in byte code, there are no variable names, only addresses.
No, unfortunately that is not the case. Lookup happens at execution of the byte code, because variables cannot be looked up at byte compilation. Consider the following case:

    x = 1
    local({
        assign(user_input, 2)
        print(x)
    }, envir = new.env(parent = environment()))
If `user_input` is “x”, the lookup of `x` in the local scope finds a different variable, in a different scope. Hence this lookup needs to take place every time this piece of code is executed.

I’m not sure if Python suffers from similar problems.

The lookup thing only happens during compilation to byte code or intermediate code, I believe. Once in VM, there are no variable names. Only addresses.