Hacker News
by catblast 2224 days ago
> "Make Python faster" is just a losing game imo. It is fundamentally never going to be as fast as other languages, it's far too dynamic (and that's a huge appeal of the language).

This cannot be overstated. Unfortunately, Python is especially perversely dynamic. After all, JavaScript and Ruby are highly dynamic too, but for a number of reasons they have avenues that allow more meaningful optimization gains without changing language semantics. (And although it is true that Google had tremendous resources to pour into V8, the point stands even without that kind of funding: see LuaJIT as an example.)

Python imho took the performance/productivity trade-off way too far. You can get some very effective dynamism with a few minimal restrictions that won't shoot you in the foot so badly when it comes to performance opportunities later. Frankly, for mainstream development, Kotlin or C# gives very pleasant, productive languages with strong ecosystems without paying such a penalty. Swift is good. Go isn't personally my cup of tea, but sure.

Python sort of got a foothold in science, but that's never been known for being a field of quality software engineering.


Python isn't perversely dynamic, no more so than Self, which was running within a factor of 2 of C like 30 years ago, and PyPy optimizes Python just fine. The major stumbling block is that the C API unintentionally sucks and requires lots of work to support: https://morepypy.blogspot.com/2018/09/inside-cpyext-why-emul...

LuaJIT is a great example of how experimentation and the right tradeoffs can give you a faster language runtime without a huge effort, but it's not at all a great example of a highly optimizing compiler.

There's no "map" or "hidden class" optimization in LuaJIT, for example, and getting good performance means avoiding repeated table lookups.

What exactly is known for being a field of quality software engineering?

100% of Python users find it to be fast enough for their use case. Let's stop beating this horse into a black hole; it doesn't need to be fast at converting integers to Unicode 100,000 times to allow quality software engineering.

To compare speed of CPython with quality of anything is such a narrow view that it is self-defeating.

> 100% of Python users find it to be fast enough for their use case.

This is unlikely, and an odd thing to say, especially in the context of a thread about people rewriting their software in a different stack in part because of Python performance issues. There are plenty of people using Python who feel the pain and need to spend resources on performance improvements.

The fact is that very many line-of-business apps will require more, and more complicated, hardware resources than they would using "boring shit" like .NET or Java, or Go (which is actually pretty boring, startup hype aside). I'm no huge fan of Java, but I don't feel any less productive in Kotlin than I do in Python; for many things it is even better. Python, aside from a few things, still looks like a 1989 language with a few newer features. Other than basically binding everything lazily, the language doesn't have many amazing tricks up its sleeve. Meanwhile 30 years of progress has been rolled into the mainstream of C#, Swift, and Kotlin.

> To compare speed of CPython with quality of anything is such a narrow view

I'm not saying you can't write quality software with Python. What I was saying is that the only niche where Python has maybe picked up significant mindshare compared to alternatives is scientific computing. And I say this with no insult, but as a former grad student in the sciences who has written quite a bit of monstrous Python: it's not a field of quality software engineering.

I've seen badly written scientific Python code, and I've seen very badly written enterprise Java code. I've also written probably more than a hundred web apps using Django that never needed to be written in Java. And Django is a great work of software engineering, having not much to do with scientific software.

Some apps need to be rewritten or optimized for performance reasons; that is, however, not the most common case.

For all the companies that are complaining about the performance issues, I suspect they are really complaining about a change in their incentives or circumstances, unless they made an uninformed choice in choosing Python to begin with.

Which is the likely case?

The missing piece of the discussion here seems to be the socio-political (human) aspect of computer language usage. When choosing a "slow" language, many other factors besides performance and ecosystem are considered. The primary usage of a computer language is actually communicating with fellow human beings, and there is a huge cost and overhead associated with all the software writing and interpretation. That is why it baffles me when Python is dismissed entirely on a pure performance basis.

I would wager that in a large percentage of cases, switching to better algorithms would give you the same or larger gains as switching to a different language. Your example of scientific computing is a good one (and your point about software engineering is well taken), because in scientific computing people tend to research and use the fastest algorithms, and once you do, Python (+NumPy) is often fast enough.

NumPy is fast because it just uses the forever-optimized native code bundled with it and runs not much Python, while of course still offering a handy interface.
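To make the "better algorithms" point concrete, here is a minimal sketch in plain Python (no NumPy). The function names and the pair-sum task are made up for illustration; the only claim is that the second version does asymptotically less work, which no change of language gives you for free.

```python
from itertools import combinations


def has_pair_quadratic(values, target):
    """Check every pair of values: O(n^2) comparisons."""
    return any(a + b == target for a, b in combinations(values, 2))


def has_pair_linear(values, target):
    """Single pass, remembering values seen so far: O(n) on average."""
    seen = set()
    for v in values:
        if target - v in seen:
            return True
        seen.add(v)
    return False
```

Both functions answer the same question; on large inputs the linear version should win by a wide margin regardless of which runtime executes it.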

Python suffers from the same problems as C in this regard. It's very powerful and easy to get going with, because it doesn't force you into some "better" paradigm/architecture/ideology/thinking. (Like, let's say, mypy or Rust do.)

And that's okay. 80+% of Python scripts/programs are fine without that rigor; there are more important problems to worry about. (Like making the company profitable sooner instead of spending a few more months on figuring out the types/bindings for mypy/Rust.)

> 100% of Python users find it to be fast enough for their use case.

As someone who uses Python because it tends to be installed on my targets and comes with a decent standard library, I would like the hours I spend optimizing my code back. Especially those hours I lost before I ended up porting the problematic code to C++ or Java.

> 100% of Python users find it to be fast enough for their use case.

This is absolutely false. Lots of people outgrow Python and use another language.

That's the intended humor: they stop being Python users if they outgrow it. The language and ecosystem are growing, regardless. I've been using it professionally since 2007, and there are some tasks I know it is not good at, and that's when I know to use another tool. The point is that it is a tool that has its uses, and I consider it the best tool for certain categories. It doesn't need to be the best tool for all categories. That seems pretty "absolutist".

I'm not anti-Python, for what it's worth. I love Python, and part of that is because of the features that make it so slow.

This is always mentioned as a motivating factor, yet Self, Smalltalk, Dylan, Julia, Common Lisp, JavaScript are just as dynamic and manage to have a good set of JITs to choose from.

For example in Smalltalk you can completely change the structure of a given class across the whole process space, thus invalidating everything that the JIT tried to optimize.

So no, Python isn't any special snowflake, rather there isn't the willingness to actually spend the resources improving its JIT story.

This is a common view, but I've never heard it from someone who has tried to optimize Python. Personally I think that Python is as much more dynamic than JavaScript as JavaScript is than C. (I can't speak for the other languages.)

Just look at the small example in this post:
- converting things to strings invokes several layers of dynamic dispatches
- Python has an extremely expressive calling convention that can cause a lot of overhead
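For anyone who hasn't looked at these two points closely, a rough sketch of what they mean in practice. The `Point` class and `call_me` function are hypothetical, made up for illustration and not taken from the blog post: `str()` resolves through hooks on the type that can be swapped at runtime, and a single signature can mix several parameter kinds that every call has to be matched against.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"Point({self.x}, {self.y})"


p = Point(1, 2)

# No __str__ is defined, so str() falls through object.__str__ to __repr__.
before = str(p)

# The class can be patched at runtime; the same call now takes a different path.
Point.__str__ = lambda self: f"<{self.x}, {self.y}>"
after = str(p)

# f-strings go through yet another hook (__format__), which by default ends up at __str__.
formatted = f"{p}"


def call_me(a, b=2, *args, key=None, **extra):
    """One signature mixing positional, default, variadic, keyword-only and catch-all parameters."""
    return a, b, args, key, extra


# The call site unpacks a list and a dict, so arguments are matched to parameters at call time.
bound = call_me(1, *[10, 20, 30], key="k", **{"z": 0})
```

None of this is exotic Python, which is the point: a runtime has to be ready for all of it at every call site.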

Another example that I like to give is that in Python you can inspect the stack variables of exited frames! The Python-on-mature-VM projects have also not resulted in great performance.
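A small sketch of the exited-frames point, assuming CPython and using only documented attributes (`__traceback__`, `tb_next`, `tb_frame`, `f_locals`). The function and variable names are hypothetical. By the time the locals are read, `inner()` has long since returned control, yet its variables are still reachable through the traceback.

```python
def inner():
    secret = 42  # local variable of a frame that is about to exit
    raise ValueError("boom")


def capture_traceback():
    try:
        inner()
    except ValueError as exc:
        # The traceback keeps references to every frame the exception passed through.
        return exc.__traceback__


tb = capture_traceback()

# Walk to the innermost traceback entry, which belongs to inner().
while tb.tb_next is not None:
    tb = tb.tb_next

frame = tb.tb_frame
func_name = frame.f_code.co_name
leaked = frame.f_locals["secret"]
```

This is the kind of feature that makes it hard for a JIT to keep locals purely in registers: something may ask for them as a mapping later.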

I hear your argument a lot, and I disagree; I also believe it misguides optimization efforts ("let's just apply JS techniques to Python"), including our early ones with Pyston. It's part of what I'm trying to get at with this blog post, but maybe I'll write another one specifically about Python vs other languages.

Smalltalk can do all of that and Python doesn't even have something like become:.

Is Julia really just as dynamic? Isn't the problem with Python that the low-level C API still needs to be respected during JIT, which just kills performance? (That's why PyPy largely doesn't support it, right?)

Even JS doesn't suffer from this, because folks can't just load V8 extensions in their browser, and Node went through quite a few NAPI versions - forced by V8 changes.

That said, of course it'd be possible to speed up Python/CPython by pouring a lot more money into it. But ... the CPython codebase is already old and has many problems, a lot of backward compatibility gotchas, and relatively few people willing to work on it, because there's not a big reason to do so. Whereas with JS, Google (and thus Mozilla too) was very incentivized to make it fast.

Julia (and likely Dylan and CL, which are all similar languages) is not nearly as dynamic as Python. Or more accurately, they are nearly as dynamic, but writing code like that is not idiomatic and it will lead to performance similar to Python.

The most important factor is that those languages were designed with JIT in mind, with a clear separation of compile time and runtime. That covers not only the macros, which are low-cost abstractions, but also how all the dynamic parts of the language can actually be resolved during this period, like type inference and static/multiple dispatch, which are enabled by its carefully crafted type system that bridges the dynamic world and the static world. Idiomatic Julia is not strictly a dynamic language but a superposition of many static programs, one of which will be selected for each runtime pass.

So changing a variable's type in Python has basically no effect, but in Julia, while allowed, it causes what's called type instability, and the compiler will be pessimistic and create a dynamic box for the type that can't be optimized. This is also why global variables are so damaging to performance in the language, since they can't be inferred. Defining or redefining a function (not a lambda) or types, or importing a library dynamically during runtime, are other features that are allowed but avoided in practice, since they'll invalidate the JIT cache and force a recompilation. The culture of performance-aware programming is the second key factor in their speed.

Julia, I am not sure how far its dynamism goes, but Dylan, Common Lisp and Smalltalk certainly.

You can even at any point in time stop execution in the debugger, rewrite part of the world and just press continue as if that was how the code was originally written to start with.

> You can even at any point in time stop execution in the debugger, rewrite part of the world and just press continue as if that was how the code was originally written to start with.

Bit of a straw man, because you wouldn't do this regularly in code. Python, on the other hand, is a relatively large language at this point, and plain idiomatic Python code leans relatively more heavily on JIT-unfriendly constructs compared to the other languages mentioned. Meanwhile, CL has a whole concept of "compile-time" that doesn't really exist in Python. Hence the "perversely" part.

PyPy has used similar tricks to Smalltalk, Self, and JS/V8, many of which were old hat in the 90s, but PyPy demonstrates that writing a performant JIT with reasonable memory requirements for real-world code is much harder for Python.

For me the only thing that PyPy sadly demonstrates is that the Python community doesn't care about JIT support and anyone that wishes to use languages that embrace JITs should look elsewhere instead of having a continuous disappointment.

> there isn't the willingness to actually spend the resources improving its JIT story.

We should admit that funding is really important here. The OpenSmalltalk-VM is currently sitting on a known optimization (speculative inlining) that has the potential to improve performance 2-3x. But those guys don't have the funding to implement it. Things like JIT/VM work are specialized and complex, and small FOSS-style free-time contributions won't cut it.

JS has less dynamism in various places (operators especially, which allows you to statically type more, and therefore elide guards).
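On the Python side of the operator comparison, a minimal sketch of why `a + b` is hard to type statically. The `Meters` class is hypothetical, made up for illustration. Either operand's type can define what `+` means, and the interpreter tries the left operand's `__add__` first, then falls back to the right operand's `__radd__`.

```python
class Meters:
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        return NotImplemented  # let Python try the right-hand operand

    def __radd__(self, other):
        # Reached for `0 + Meters(...)`, after int.__add__ returns NotImplemented.
        if other == 0:
            return self
        return NotImplemented


# sum() starts from the int 0, so the first addition dispatches to Meters.__radd__.
total = sum([Meters(1), Meters(2), Meters(3)])
```

So a guard on the left operand alone isn't enough: the result of `0 + x` depends on the type of `x` too.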

But yeah, I very much agree: Python isn't special here, it's just a willingness to spend resources (though a lot of that is tied to the CPython C API: either you constrain your VM design choices to be compatible with it directly, or have to add overheads to any/many C API calls).

> Frankly, for mainstream development, Kotlin or C# gives very pleasant, productive languages with strong ecosystems without paying such a penalty

If dynamic is important, Groovy is an excellent option - it preserves almost all of the desirable dynamic properties of Python but is by default 3-5x as fast and with trivial effort 10-20x as fast, approaching Java speeds essentially, and with all of the advantages of that runtime (full concurrency, etc).