by dagw 3643 days ago
It used to be common practice to cache frequently used routines in local names (e.g. start with "os_path_join = os.path.join" before the loop and call "os_path_join" instead of "os.path.join"), thus avoiding the repeated name and attribute lookups on each iteration.

I'll admit to being skeptical about how big an effect this would have, but it was bigger than I thought:

  %%timeit
  s=0
  for i in itertools.repeat(1.0, 1000000):
      s+=math.sin(i)

  10 loops, best of 3: 124 ms per loop
vs

  %%timeit
  s=0
  math_sin=math.sin
  for i in itertools.repeat(1.0, 1000000):
      s+=math_sin(i)

  10 loops, best of 3: 89.1 ms per loop
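
For anyone who wants to repeat the comparison outside IPython, here is a rough sketch using the standard timeit module. The function names direct_lookup and cached_lookup are made up for this example, and the numbers will differ by machine and Python version, so none are quoted here.

```python
import itertools
import math
import timeit

N = 1_000_000  # same iteration count as the %%timeit cells above


def direct_lookup(n=N):
    """Look up math.sin (global name, then attribute) on every iteration."""
    s = 0
    for i in itertools.repeat(1.0, n):
        s += math.sin(i)
    return s


def cached_lookup(n=N):
    """Resolve math.sin once and call it through a local name."""
    s = 0
    math_sin = math.sin
    for i in itertools.repeat(1.0, n):
        s += math_sin(i)
    return s


if __name__ == "__main__":
    for fn in (direct_lookup, cached_lookup):
        # Best of three single runs, reported in milliseconds.
        best = min(timeit.repeat(fn, number=1, repeat=3))
        print(f"{fn.__name__}: {best * 1e3:.1f} ms per loop")
```

Both functions compute the same sum; only the way the callable is found on each iteration differs.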

I'm not sure how this timeit is being run, but there are two things going on. Local name lookups are done by array indexing, whereas global lookups are cascading dictionary lookups (module globals first, then builtins). The attribute lookup is also a dictionary search. On instances, looking up a method defined in a superclass involves failed lookups in the instance and in every class that precedes the defining one in the method resolution order. In a hot inner loop, it is definitely worth cutting these out.
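
One way to see the difference between the two kinds of lookup is to inspect the bytecode with the standard dis module. This is only a sketch: the helper load_ops and the two functions are invented for illustration, and the exact opcode names vary between CPython versions (for example LOAD_METHOD versus LOAD_ATTR, or fused LOAD_FAST variants), so the helper only reports what it finds.

```python
import dis
import math


def direct(values):
    total = 0.0
    for v in values:
        total += math.sin(v)   # global lookup of "math", then attribute "sin"
    return total


def cached(values):
    math_sin = math.sin        # resolved once, stored in a local slot
    total = 0.0
    for v in values:
        total += math_sin(v)   # found through the local variable array
    return total


def load_ops(func):
    """Return (opname, argrepr) for every LOAD_* instruction except constants."""
    return [
        (ins.opname, ins.argrepr)
        for ins in dis.get_instructions(func)
        if ins.opname.startswith("LOAD_") and ins.opname != "LOAD_CONST"
    ]


if __name__ == "__main__":
    for fn in (direct, cached):
        print(fn.__name__)
        for opname, argrepr in load_ops(fn):
            print(f"    {opname:<20} {argrepr}")
```

In direct, the call inside the loop goes through a LOAD_GLOBAL for math followed by an attribute load for sin; in cached, the loop body reaches math_sin through a LOAD_FAST-style instruction instead.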
But why does the programmer need to do this? Shouldn't the interpreter be able to perform this operation implicitly?

Isn't that what the .pyc files are for, so that it doesn't need to perform lookups like this at runtime?

There is a difference in semantics. Someone else might actually want os.path to change between calls. The only problem is that the behavior of the idiomatic version is hard (almost impossible) to optimize, even though the more optimizable semantics are actually what most users want.

The language was designed with expressiveness in mind, and it often comes at the expense of speed. Lua and Nim seem to strike a much better balance, and even JavaScript if you avoid the performance killers like "with".

> Someone else might actually want os.path to change between calls.

Who? Why?

The os.path is not a great example, but imagine a loop where inside the body you mutate some state of an instance and then directly access it. Compare the following with the os.path example:

  for animal in circus.animals:
      circus.next_free_clown.assign(animal)
      # circus.next_free_clown changes in every iteration

Monkeypatching for debugging, mocking, etc.

Most of those are rare, but feasible. Monkeypatching a global as a side effect is even rarer, but I think I've done it at some point.

No, the .pyc files are just the source code translated to bytecode when you import the .py file (notice that running "python somefile.py" doesn't generate a somefile.pyc; the .pyc is created when you import the module, from inside Python or from another script).

.pyc files are created for "faster importing", and they remove having to parse and "compile to bytecode" the .py files.
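
To check that a .pyc really does keep the lookups for runtime, here is a small sketch that compiles a throwaway module and reads the bytecode back. The module name example_mod and the helper names_stored_in_pyc are made up, and skipping a 16-byte header assumes the .pyc layout used by CPython 3.7 and later.

```python
import marshal
import os
import py_compile
import tempfile
import types

# Hypothetical module source, used only for this demonstration.
SOURCE = "import os\n\ndef build(a, b):\n    return os.path.join(a, b)\n"


def names_stored_in_pyc(source=SOURCE):
    """Compile source to a .pyc and return the names the function looks up."""
    with tempfile.TemporaryDirectory() as tmp:
        py_path = os.path.join(tmp, "example_mod.py")
        with open(py_path, "w") as f:
            f.write(source)
        pyc_path = py_compile.compile(py_path, cfile=py_path + "c", doraise=True)
        with open(pyc_path, "rb") as f:
            f.read(16)  # assumption: 16-byte header (CPython 3.7+)
            module_code = marshal.load(f)
    # The function body is stored as a nested code object in the constants.
    func_code = next(
        c for c in module_code.co_consts if isinstance(c, types.CodeType)
    )
    return func_code.co_names


if __name__ == "__main__":
    print(names_stored_in_pyc())
```

The names os, path and join are stored as plain strings in the compiled code, so they still have to be resolved each time the function runs.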

The exact issue is that it cannot. The reason is that names can be rebound at runtime, often depending on input. Say I have: if blah: os.path.join = lambda ...

and then the pre-looked-up version of the method is wrong.
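
A toy version of that situation, as a sketch only: a SimpleNamespace called paths stands in for a module like os.path so that nothing real gets patched, and the function demo and both join implementations are invented for the example.

```python
from types import SimpleNamespace


def demo(pairs):
    """Compare a cached reference with a fresh lookup when the name is rebound."""
    # Hypothetical stand-in for a module such as os.path.
    paths = SimpleNamespace(join=lambda a, b: a + "/" + b)

    join = paths.join  # cached before the loop, as in the optimization
    cached_results, live_results = [], []
    for index, (a, b) in enumerate(pairs):
        if index == 1:
            # Someone rebinds the attribute halfway through the loop.
            paths.join = lambda a, b: a + "\\" + b
        cached_results.append(join(a, b))        # still the old function
        live_results.append(paths.join(a, b))    # sees the replacement
    return cached_results, live_results


if __name__ == "__main__":
    cached_results, live_results = demo([("a", "b"), ("c", "d")])
    print(cached_results)
    print(live_results)
```

The cached name keeps calling the original function after the rebinding, which is exactly the behavior change an interpreter would have to rule out before caching the lookup on its own.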