PyPy performs better, but when you perform 2-3x better than something ~40x slower than C, you still don't end up with a "fast implementation", just a "not as slow" one. If you've got Python code in hand and you want it to go faster, PyPy can offer great bang for the buck, but if you want it to legitimately approach the limits of what the hardware can do, you'll need a different approach.
But let me once again underline that if you have Python code in hand, and you want it to be faster, PyPy is a great option. I'm not being critical of PyPy.
A common mantra I've heard dozens of times in the last ~20 years is that there's no such thing as a slow language, only slow implementations. But after witnessing the effort to create "fast" implementations for a lot of slow languages over the past 10 years, and seeing so many of them plateau out at about 10x slower than C, I no longer believe this. Or at least, I no longer believe it is practically true. If there is an implementation of Python somewhere in theoretical program space that is as fast as C, it does not appear to me that it will be possible for humans to produce it.
I agree, though the speed of the JVM, for instance, is not quite as bad. C comes at a development cost; Rust makes this better, but memory management is still something you have to get comfortable with.
The question that really nags at me is why do people want interpreted languages in all of these cases? When you're deploying code, you inevitably go through a series of steps in deployment where throwing in a compile wouldn't destroy the workflow.
I think that in many of these cases the language has over-optimized for development at the cost of its runtime; the GIL is a great example of this.
I'm guessing that people don't usually really want an interpreted language; usually what they want is a language they like, and the one they like happens to be interpreted.
I can imagine reasons to want an interpreted language. As a matter of fact, over the past 20 years I've written several implementations of a hobby language, some of them compiled and some interpreted. In some of the later ones I consciously chose interpretation because I wanted dynamic runtime introspection and the ability to see (expanded) source code directly in the runtime during a breakloop. I was experimenting with runtime semantics, and I wanted to see them directly at runtime with the minimum possible change from the source code as written.
I've also written interpreters sometimes because I'm actually interested in interpreters per se.
But most people probably don't want interpreters for those kinds of reasons. As I said, I think it's more likely that most of the time when someone wants "an interpreted language", what they really wanted is some particular language whose most prominent implementation happens to be interpreted.
That raises the question of why implementations are interpreted, of course. The answers, I think, are some combination of the answers I gave above and the fact that interpreters are really easy to write, especially if you choose the right source language. Simple compilers are not much harder, but easier is easier. I'm generally inclined to start with an easy interpreter myself (unless what I'm interested in is compilation strategies), because I get from zero to experimenting with semantics that much quicker, and experimenting with semantics is usually where the fun is.
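To make the "easy to write" point concrete, here is a minimal sketch of a tree-walking interpreter in Python. The expression format (nested tuples with `let` and `if` forms) and the names `evaluate` and `OPS` are made up for illustration; they are not taken from any implementation mentioned in this thread.

```python
import operator

# Binary operators supported by this toy language (an illustrative choice).
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def evaluate(expr, env=None):
    """Evaluate a nested-tuple expression in the given environment."""
    env = {} if env is None else env

    # Numbers evaluate to themselves.
    if isinstance(expr, (int, float)):
        return expr

    # Bare strings are variable references.
    if isinstance(expr, str):
        return env[expr]

    head, *args = expr

    # ("let", name, value_expr, body): bind a name, then evaluate the body.
    if head == "let":
        name, value_expr, body = args
        new_env = dict(env)
        new_env[name] = evaluate(value_expr, env)
        return evaluate(body, new_env)

    # ("if", cond, then_branch, else_branch): evaluate only the chosen branch.
    if head == "if":
        cond, then_branch, else_branch = args
        chosen = then_branch if evaluate(cond, env) else else_branch
        return evaluate(chosen, env)

    # Anything else is treated as a binary operator application.
    left, right = (evaluate(arg, env) for arg in args)
    return OPS[head](left, right)


if __name__ == "__main__":
    program = ("let", "x", 4, ("*", "x", ("+", 1, 2)))
    print(evaluate(program))
```

The whole thing is a single recursive function over the syntax tree, which is also why it is slow: every operation pays for type checks, dictionary lookups, and function calls that a compiler would resolve ahead of time.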
To be fair, the GIL was included (AFAIK) because it was simple to implement. Also, Guido said in 2007 [1] that he would welcome a patch to remove the GIL, on one condition:
> "... I'd welcome a set of patches into Py3k only if the performance for a single-threaded program (and for a multi-threaded but I/O-bound program) does not decrease."
Unfortunately, experiments thus far have not succeeded in meeting these requirements.
The Gilectomy project is currently working to meet this bar, along with some other requirements [2], but it is grappling with the previously discovered performance issues that come with removing the GIL.
Also, at PyCon 2016, Guido himself mentioned the Gilectomy project and its potential consideration (if it works) for Python 3.6+ [3].
> no such thing as slow language, only slow implementations
My interpretation of that (and perhaps I'm reading too much into it) was that for modern-day 'retail' tasks (taking an HTTP request, querying a database, dispatching workers, running business logic, and returning a styled webpage), choosing a low-level language over a high-level one does not yield much improvement, especially when weighed against development time and code flexibility.
No, I'm fairly sure this saying is used in the context of the implementations actually being slow. "You're always in IO" is a different defense of the slow languages, and one that I have also found to be overstated, for what it's worth. In practice it isn't that hard to build even a simple-looking REST interface that spends non-trivial amounts of time in non-IO code in Perl or Python. I think a lot of people would be shocked if they sat down and really worked out just how little Python/Perl/etc. code they can run before their webserver starts noticeably slowing down, even at small scales.
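One rough way to check this for your own code is to compare CPU time against wall-clock time around a handler. The sketch below assumes a hypothetical `handle_request` function, with `time.sleep` standing in for a database call; the names and sizes are illustrative, not measurements from a real service.

```python
import time


def handle_request(row_count):
    """Hypothetical handler: a fake I/O wait followed by pure-Python work."""
    time.sleep(0.01)  # stands in for a database round trip (assumed 10 ms)
    rows = [{"id": i, "name": f"user{i}"} for i in range(row_count)]
    return sorted(rows, key=lambda row: row["name"])


def profile(handler, row_count):
    """Return (result, cpu_seconds, wall_seconds) for one handler call."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # CPU time of this process; excludes sleep
    result = handler(row_count)
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    return result, cpu_seconds, wall_seconds


if __name__ == "__main__":
    _, cpu, wall = profile(handle_request, 50_000)
    print(f"wall: {wall * 1000:.1f} ms, cpu: {cpu * 1000:.1f} ms")
```

If the CPU figure is a large share of the wall figure, the request is not "always in IO", and that CPU time is what caps requests per second per core.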
I spent a little time working with Guido on cache design for the App Engine data access API (ndb). In Java-land, it's a big win to cache datastore entities in memcache because fetch latency is about an order of magnitude lower. In Python-land, caching is a mixed bag: real-world profiling found marginal benefit even with a 100% hit rate, and a significant performance disadvantage if the hit rate drops off. This is primarily attributed to pickling overhead.
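For anyone who wants to gauge the pickling cost on their own data, a minimal sketch with the standard `pickle` and `timeit` modules follows. The entity shape in `make_entity` is invented for illustration and is not the ndb entity format; real numbers will depend heavily on entity size and Python version.

```python
import pickle
import timeit


def make_entity(n_fields=50):
    """Build a hypothetical datastore-like entity as a plain dict."""
    return {
        f"field_{i}": {"value": i, "tags": ["a", "b", "c"], "score": i * 1.5}
        for i in range(n_fields)
    }


def roundtrip(entity):
    """Serialize then deserialize, as a cache write plus a cache hit would."""
    data = pickle.dumps(entity, protocol=pickle.HIGHEST_PROTOCOL)
    return pickle.loads(data)


def seconds_per_roundtrip(entity, repeats=1000):
    """Average wall-clock seconds for one pickle round trip."""
    total = timeit.timeit(lambda: roundtrip(entity), number=repeats)
    return total / repeats


if __name__ == "__main__":
    entity = make_entity()
    per_call = seconds_per_roundtrip(entity)
    print(f"pickle round trip: {per_call * 1e6:.1f} microseconds per entity")
```

Comparing that per-entity figure against the latency difference between the cache and the backing store gives a rough idea of whether a cache hit can pay for itself.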
That's sort of my point, if I understand you correctly.
There are two responses to "Hey, the site is slow!" One is "Well, that's because PHP/Python/Java is slow, we should start coding in C/etc". Another is "The codebase or architecture is probably unoptimized and/or poorly structured in some areas; let's identify, prioritize and start fixing those areas."
The slow languages are "good enough", in an engineering sense, for today's retail features. The payoff of using a fast language doesn't provide enough value for its cost. It doesn't matter that searching a PHP array is slow; that's not where the problems are in today's software.