by melted 3795 days ago
Even Cython will be several times slower than a carefully tuned C/C++, Java, C# or Go program for most practical problems. And at least in the case of C/C++ it'll also likely use several times more RAM. Now for a company like Uber it may not matter if something is 3x slower and uses 3x the RAM, just throw more hardware at the problem, but if you're going to introduce typing into Python, you might as well go with a language where typing is not optional, and which, more importantly, has been used in production by thousands of teams over the past decade or more. Of the languages I listed, only C++ is not really viable for most programmers.

As to caring about perf, you shouldn't care about it until you have to. Take that 2h vs 20min example, for instance. If you only need to run it a few times and there's plenty of time available, who cares how long it takes? If the 2h one is easier to write, that's by all means what you should do. OTOH if you're under severe time constraints and need to run it every hour, then obviously the 2h script won't do the job. Or alternatively, if the 20 min script takes the same time to write as the 2h one, then of course you should go with it. All too often I see people optimizing things that don't matter one iota, simply because they like things to be fast. Something gets executed once a day and runs for 5 minutes? Let's spend two weeks making it complete in 30 seconds. As long as the employer is paying, why not.
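To put a number on that last example, here is a rough break-even sketch. The helper name `break_even_runs` is made up for illustration, and "two weeks" is assumed to mean 10 working days of 8 hours each; it only compares developer minutes against machine minutes saved, which is a simplification.

```python
def break_even_runs(dev_minutes, old_minutes, new_minutes):
    """Number of runs needed before the optimization effort is paid back."""
    saved_per_run = old_minutes - new_minutes
    if saved_per_run <= 0:
        # The "optimized" version is no faster, so it never pays off.
        return float("inf")
    return dev_minutes / saved_per_run


# Assumption: two weeks of work = 10 days * 8 hours * 60 minutes.
dev_minutes = 10 * 8 * 60

# The job runs once a day: 5 minutes before, 30 seconds after.
runs = break_even_runs(dev_minutes, old_minutes=5.0, new_minutes=0.5)
print("break-even after {:.0f} daily runs (~{:.1f} years)".format(runs, runs / 365))
```

Under those assumptions the two weeks are paid back only after roughly a thousand daily runs, i.e. close to three years.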


Cython is the same speed as a carefully tuned C/C++ program; carefully tuned, Cython maps directly to C/C++. I've done many benchmarks of low level numerical code when implementing SageMath (which uses Cython for several 100K lines of code).
As someone who has spent almost 20 years (on and off) writing C and C++, I don't believe it. C/C++ lets you go as close to hardware as you would possibly want. Want SIMD? Easy. Want custom memory allocation (a big deal if you allocate/deallocate a ton)? Sure, why not. Want to profile and optimize cache locality and memory layout? Knock yourself out. Memory alignment? Yup. Branch hints? Of course. I could continue with this, but as someone who has written performance-sensitive code you already know most if not all of this. It's not a coincidence that e.g. high performance linear algebra libs are written in C.

And it baffles me that anyone would even consider writing a 100KLOC+ project in something as lax as Python. That's just asking for trouble.

>It used to be that Python's lack of performance didn't matter because disks and networks were so slow things were IO bound. In more and more cases that's just not true anymore.

That's generally a valid statement. But it's why PyPy and Nuitka exist. Pyston and Pyjion are up and coming in this area.

>And it baffles me that anyone would even consider writing a 100KLOC+ project in something as lax as Python. That's just asking for trouble.

That's why Python 3.5 has type annotations. They increase the number of sane use cases for Python going forward.
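For anyone who hasn't seen them, a minimal sketch of what 3.5-style annotations look like, using the `typing` module. The function `mean_runtime` is a made-up example. Note that the interpreter does not enforce the annotations at runtime; they only pay off if you run an external checker such as mypy over the code.

```python
from typing import List


def mean_runtime(samples: List[float]) -> float:
    """Average of a list of runtimes, in seconds."""
    if not samples:
        raise ValueError("need at least one sample")
    return sum(samples) / len(samples)


# The annotations are stored on the function object; nothing checks them
# when the function is called.
print(mean_runtime.__annotations__)
print(mean_runtime([1.0, 2.0, 3.0]))
```

A static checker would flag a call like `mean_runtime("abc")` before the code runs; plain Python would only fail once `sum` hits the string.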

Let me add I'm with you though in your general thought. The DB/IO bottleneck myth needs to die. We do need more performance, on the order of Elixir. CPU performance at that level or above removes a whole class of application issues. Not sure C's performance is needed though unless you're doing systems programming.

I just happen to like Python as a language, so happy to fly the flag on the various solutions to make it fit that truly needed aspect.

> As someone who has spent almost 20 years (on and off) writing C and C++, I don't believe it. C/C++ lets you go as close to hardware as you would possibly want. Want SIMD? Easy. Want custom memory allocation (a big deal if you allocate/deallocate a ton)? Sure, why not. Want to profile and optimize cache locality and memory layout? Knock yourself out. Memory alignment? Yup. Branch hints? Of course. I could continue with this, but as someone who has written performance-sensitive code you already know most if not all of this. It's not a coincidence that e.g. high performance linear algebra libs are written in C.

Which you can use in Python. I'm a C guy myself, but this bullshit "do it all in C" attitude needs to be calmed down a bit. Python is a great language for many reasons, and you can make up for its downsides by using Python libraries that are implemented in C.
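A small stdlib-only sketch of the idea, assuming CPython, where the built-in `sum` is implemented in C. The helper `python_loop_sum` is hypothetical and exists only for comparison; the timings depend on your machine and interpreter version, so treat the printed numbers as indicative only.

```python
import timeit


def python_loop_sum(values):
    """Sum the values with an explicit loop executed by the interpreter."""
    total = 0
    for v in values:
        total += v
    return total


values = list(range(200000))

# Both approaches must agree on the result.
assert python_loop_sum(values) == sum(values)

# Time each approach over a handful of repetitions.
loop_time = timeit.timeit(lambda: python_loop_sum(values), number=5)
builtin_time = timeit.timeit(lambda: sum(values), number=5)
print("pure-Python loop: {:.4f}s, built-in sum: {:.4f}s".format(loop_time, builtin_time))
```

The same pattern applies to third-party libraries with C cores: keep the hot loop inside the compiled call and leave the glue in Python.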

Have you seen Numba? It compiles numerical Python to code that approaches Fortran speeds.
...unless you have to call into NumPy or the Python C-API frequently in a hot loop. That's a bit of a bummer. I would rather write array indexing in NumPy notation than do that error-prone indexing by hand (in Cythonic C). I think there was something in the works to deal with this problem, but I'm not up to date on it.