by viraptor 3796 days ago
It depends what "using python" means though. Cython is pretty good at optimizing basic code. Numpy will process your matrices and vectors using specialised libraries faster than most manual C approaches. Shedskin will give you a nice code framework which you can optimise in parts that matter. (insert other specialised examples)

CPython is slow as an interpreter, true. "Programming in Python" may or may not be many times slower than compiling comparable code in another language. It depends on what you're doing and how you're doing it.
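As a rough illustration of the "how you're doing it" part: the sketch below computes the same total twice, once with an explicit loop run by the interpreter and once by handing the loop to the built-in `sum`, which CPython implements in C. The function names and input size are made up for the example, and the built-in is only a stdlib stand-in for the kind of work Numpy pushes into compiled code. Timings vary by machine, so none are quoted here.

```python
import timeit

N = 100_000  # hypothetical input size, chosen only for the demo


def total_loop(n):
    """Add 0..n-1 with every iteration executed by the interpreter."""
    total = 0
    for i in range(n):
        total += i
    return total


def total_builtin(n):
    """Same result, but the iteration happens inside the C-implemented sum()."""
    return sum(range(n))


if __name__ == "__main__":
    # Both versions must agree before their timings mean anything.
    assert total_loop(N) == total_builtin(N)
    for fn in (total_loop, total_builtin):
        seconds = timeit.timeit(lambda: fn(N), number=20)
        print(f"{fn.__name__}: {seconds:.4f}s for 20 runs")
```

Both functions are "Python", yet only the first one pays the interpreter overhead on every iteration.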

Also, I care about performance in any language to some extent. If I can write a backup bash script that takes 2h, or write one that takes 20min, I do care about performance and will choose the second one. Why shouldn't I?


Even Cython will be several times slower than a carefully tuned C/C++, Java, C# or Go program for most practical problems. And compared with C/C++ at least, it will likely also use several times more RAM. Now for a company like Uber it may not matter if something is 3x slower and uses 3x the RAM, just throw more hardware at the problem, but if you're going to introduce typing into Python, you might as well go with a language where typing is not optional, and which, more importantly, has been used in production by thousands of teams over the past decade or more. Of the languages I listed, only C++ is not really viable for most programmers.

As to caring about perf, you shouldn't care about it until you have to. Take that 2h vs 20min example, for instance. If you only need to run it a few times and there's plenty of time available, who cares how long it takes? If the 2h one is easier to write, that's by all means what you should do. OTOH if you're under severe time constraints and need to run it every hour, then obviously the 2h script won't do the job. Or alternatively, if the 20min script takes the same time to write as the 2h one, then of course you should go with it. All too often I see people optimizing things that don't matter one iota, simply because they like things to be fast. Something gets executed once a day and runs for 5 minutes? Let's spend two weeks making it complete in 30 seconds. As long as the employer is paying, why not.
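The "two weeks to save four and a half minutes a day" case can be put into numbers. This is a back-of-the-envelope sketch, assuming a 40-hour work week (so two weeks is 80 hours) and counting only the run time saved; both are assumptions, as is the helper name `break_even_runs`.

```python
def break_even_runs(dev_hours, old_minutes, new_minutes):
    """Number of runs before the time saved equals the engineering time spent."""
    saved_per_run = old_minutes - new_minutes
    if saved_per_run <= 0:
        raise ValueError("the optimized version must be faster")
    return dev_hours * 60 / saved_per_run


# Two assumed 40-hour weeks spent taking a daily job from 5 minutes to 30 seconds.
runs = break_even_runs(dev_hours=80, old_minutes=5, new_minutes=0.5)
print(f"break-even after {runs:.0f} daily runs, about {runs / 365:.1f} years")
```

Under those assumptions the job has to run daily for close to three years before the optimization pays for itself. For the backup script run every hour, the same arithmetic comes out very differently.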

Cython is the same speed as a carefully tuned C/C++ program; carefully tuned, Cython maps directly to C/C++. I've done many benchmarks of low level numerical code when implementing SageMath (which uses Cython for several 100K lines of code).

As someone who has spent almost 20 years (on and off) writing C and C++, I don't believe it. C/C++ lets you go as close to hardware as you would possibly want. Want SIMD? Easy. Want custom memory allocation (a big deal if you allocate/deallocate a ton)? Sure, why not. Want to profile and optimize cache locality and memory layout? Knock yourself out. Memory alignment? Yup. Branch hints? Of course. I could continue with this, but as someone who has written performance-sensitive code you already know most if not all of this. It's not a coincidence that e.g. high performance linear algebra libs are written in C.

And it baffles me that anyone would even consider writing a 100KLOC+ project in something as lax as Python. That's just asking for trouble.

>It used to be that Python's lack of performance didn't matter because disks and networks were so slow things were IO bound. In more and more cases that's just not true anymore.

That's generally a valid statement. But it's why PyPy and Nuitka exist. Pyston and Pyjion are up and coming in this area.

>And it baffles me that anyone would even consider writing a 100KLOC+ project in something as lax as Python. That's just asking for trouble.

That's why Python 3.5 has type annotations. They increase the number of sane use cases for Python going forward.
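For reference, a minimal sketch of what those annotations look like. The `mean` function is a made-up example. CPython stores the hints but does not enforce them at run time, so the benefit comes from running an external checker such as mypy over the code.

```python
from typing import List, get_type_hints


def mean(values: List[float]) -> float:
    """Arithmetic mean; the annotations document the intended types."""
    return sum(values) / len(values)


print(mean([1.0, 2.0, 3.0]))

# The hints are ordinary metadata attached to the function object.
print(get_type_hints(mean))

# Nothing is enforced at run time: ints are accepted although floats are declared.
print(mean([1, 2, 3]))
```

The annotations change nothing about how fast the code runs; they only make a large codebase easier to check.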

Let me add that I agree with your general point, though. The DB/IO bottleneck myth needs to die. We do need more performance, on the order of Elixir. CPU performance at that level or above removes a whole class of application issues. I'm not sure C's performance is needed, though, unless you're doing systems programming.

I just happen to like Python as a language, so happy to fly the flag on the various solutions to make it fit that truly needed aspect.

> As someone who has spent almost 20 years (on and off) writing C and C++, I don't believe it. C/C++ lets you go as close to hardware as you would possibly want. Want SIMD? Easy. Want custom memory allocation (a big deal if you allocate/deallocate a ton)? Sure, why not. Want to profile and optimize cache locality and memory layout? Knock yourself out. Memory alignment? Yup. Branch hints? Of course. I could continue with this, but as someone who has written performance-sensitive code you already know most if not all of this. It's not a coincidence that e.g. high performance linear algebra libs are written in C.

Which you can use in Python. I'm a C guy myself, but this bullshit "do it all in C" attitude needs to be calmed down a bit. Python is a great language for many reasons, and you can make up for its downsides by using Python libraries that are implemented in C.

Have you seen numba? It compiles numerical Python to approach Fortran speeds.

...unless you have to call into Numpy or the Python C-API frequently in a hot loop. That's a bit of a bummer. I would rather write array indexing in Numpy notation than do that error-prone indexing by hand (in Cythonic C). I think there was something in the works to deal with this problem, but I'm not up to date on it.

Is Shedskin still actively maintained? I haven't looked in a long time, but last I saw it hadn't gotten any updates in a couple of years (maybe I have corrupted memory though ;)

Anyhow, the relatively new Nuitka project seems to be aiming to tackle the Python-to-C++ compiler problem, and seems to have a lot of promise. Really good compatibility, apparently decent speedups, and cross-platform support. It works with Python 3 too. I have a lot of hope!

I don't think it is. But that doesn't mean it doesn't still work :) But yeah, Nuitka is probably a more interesting target for new code.