Hacker News new | ask | show | jobs
by bootwoot 1508 days ago
I was reading this as an undetailed description of state available WITHIN the interpreter. Probably there is a table of globals that you can simply check last modification on or something like this. Whether you hit it with eval or some other tricky code, you can't modify a global without the interpreter knowing about it.
1 comments

If that's what they mean, how would that be any faster than what's going on right now? I thought normally when you hit a callable, the interpreter would just look up its name, check to see if it's a built-in, and then call the built-in if so... whereas in this case you'd still have to look up the name of the callable (is the idea to bypass this somehow? what do they do currently?), check to see if it's different than the built-in you'd expect from the name (i.e. if it's ever been reassigned to), then call that expected built-in if it's not... which seems like the same thing? At best it would seem to convert 1 indirect call to a direct call, which would be negligible for something like Python. Is the current implementation somehow much slower than I'm imagining? What am I missing?
You could do something like primitive inline cache. Store "version" of the globals in another variable. Each time globals are modified - bump the version. For each call-site and/or keep what the global name is resolved to + version of "globals object" in a static variable. Now you can avoid name resolution if version hasn't changed between two executions of the line. Now in fast-path you just pay the price of (easily predicted, because globals almost never change) single compare and jump vs full hash-table lookup.
I think the core of the optimization you're mentioning hinges on a normal lookup being a slow hashtable lookup (of a string?)... whereas I imagined the first thing the interpreter would do would be to intern each name and assign it a unique ID (as soon as during parsing, say) and use that thereafter whenever they're not forced to use a string (like with globals()). That integer could literally be a global integer index into a table of interned strings, so you could either avoid hashing entirely (if the table isn't too big) or reduce it to hashing an int, both of which are much faster than hashing a string. Do they not do that already? Any idea why? I feel like that's the real optimization you'd need if checking a key in a hashtable is the slow part (and it's independent of whether the value is being modified).