by timtadh 4865 days ago
This is completely true. It is indeed well known and common wisdom.

However, I think the point the parent was trying to make is: Python is much slower than C and many other languages, however most of the time speed is unimportant. When it becomes important, there are many technologies to mitigate the problem in your "hot loops."

If speed is your primary concern, don't use Python et al. If it isn't your main concern, go ahead: it probably won't become an issue, and if it does, you will probably be able to get around it.

4 comments

I think just saying "don't use Python where speed is important" is missing the whole point of this slideshow. The author is saying, hey, if we thought about this and put in a few different features, we could make it a lot faster without programmers having to do much other than use an alternate implementation of some data structures.

While I'd agree that 99% of us are not going to find Python/Ruby/PHP/JavaScript to be a bottleneck that can't be mitigated, that's no reason to say it's not worth trying to make them faster. If we can make changes to these languages that will make them more efficient, why not do it?

>When it becomes important, there are many technologies to mitigate the problem in your "hot loops."

There is an implicit assumption there that most of the time in your program that could be saved is spent in a small number of hot spots. This will often be true, but unfortunately it is not necessarily so.

This is a particular problem in languages like Python, which are useful (among other things) for their support for rapid prototyping and their easily readable code. All of that is lost if you can’t perform local optimizations to reach an acceptable level of performance, leaving a ground-up rewrite in a faster language like C as the next most likely strategy.

The kinds of techniques mentioned in the linked slides could help to create a middle ground that would be very useful for performance-sensitive projects that currently find themselves between a rock and a hard place.
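Whether a given program really has a few hot spots is something you can measure rather than assume. A minimal sketch using the standard library's cProfile and pstats; the functions `hot_loop`, `cheap_setup`, `workload` and `profile_report` are made up for illustration and are not from the slides:

```python
import cProfile
import io
import pstats


def hot_loop(n):
    # Deliberately expensive loop: the kind of hot spot a local fix can target.
    total = 0
    for i in range(n):
        total += i * i
    return total


def cheap_setup():
    # Cheap work that should barely register in the profile.
    return list(range(10))


def workload():
    cheap_setup()
    return hot_loop(200000)


def profile_report(func, limit=5):
    """Run func under cProfile; return the top entries by cumulative time as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats("cumulative").print_stats(limit)
    return buf.getvalue()


report = profile_report(workload)
print(report)
```

If the report is dominated by one or two functions, local optimization has a chance. If the time is spread thinly over many functions, you are in the situation described above, where there is no small piece to rewrite.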

>Python is much slower than C and many other languages, however most of the time speed is unimportant. When it becomes important, there are many technologies to mitigate the problem in your "hot loops."

I'm not convinced by this "speed is unimportant".

Well, if you're writing shell scripts in Python/Ruby etc, OK, it might be. It might not even be important in web programming.

But if you want to use a language as a general-purpose programming language, speed is very important.

The reason you cannot build full-blown desktop apps like a browser, or GUI libraries, in Python? Lack of speed and memory control. And yes, you could offload the work to some extension. And that's a barrier.

Suddenly knowing Python is not enough. You've also got to learn, e.g., C, and you have a segmented program structure, with some stuff here and some stuff there. Or you relegate Python to just the scripting layer for your program and do the real stuff in C/C++ (like Adobe Lightroom uses Lua).

I don't want to mitigate the problem in my "hot loops" with another language. I want to not have that problem in the first place. That would make me more productive.

One example: imagine NumPy in pure Python.

For one, it would be trivial to include in your project. Without building anything, it would work on all platforms.

Second, it would be far more accessible for people who don't know C, Fortran, et al. to hack on.

Third, it would have been available for Python 3 or PyPy in a few months, not after several years.

Alright. Now, another way of achieving better speed is parallelism. But due to the bad support for it (GIL, lack of first-class support) it's not easy to achieve this in CPython/MRI. Sure, you could use multiple processes, but then you get all the issues of handling them and synchronising them with your own ad-hoc solution, and without first-class support from the language. Which is a barrier.

Yet another way to get more work done (for some kinds of programs) is evented code. So you have something like Node or Twisted. But Node doesn't have language support, so you get the "callback spaghetti", and Twisted and co. are external dependencies rather than part of the language, so they add another overhead.

Again, barriers.

People say "Speed doesn't matter" because they are trained by their language to only work on problems where speed doesn't matter. So it's more like a self-fulfilling prophecy.

Of course, if you constrain yourself to the "convenient" domains that your language supports fast enough, speed doesn't matter. But with every step outside them you are in need of crutches, from C extensions, to Cython, to Psyco, to NumPy, etc.

The deck is about the performance of the language.
@chadcf and @tptacek

I was responding to @tptacek's criticism of the parent, not the deck. The deck is great and it mirrors the wisdom I have picked up from optimizing my own code over the years. I personally find it really frustrating not being able to easily pre-alloc lists in Python. I think that having better APIs would go a long way.
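For raw bytes, at least, the standard library already has something in this spirit. A small sketch, assuming you are working with byte buffers rather than lists of arbitrary objects: `bytearray` allocates a fixed-size buffer up front, and `memoryview` gives slices that share that memory instead of copying it. The buffer size and contents here are arbitrary.

```python
# Preallocate a fixed-size buffer once, then fill it in place.
buf = bytearray(16)            # 16 zero bytes, allocated up front
view = memoryview(buf)         # shares memory with buf; no copy is made

# Slicing a memoryview yields another view onto the same memory.
header = view[:4]
header[:] = b"\x01\x02\x03\x04"   # writes through to buf

# Slicing the bytearray itself copies the bytes.
copied = buf[:4]
copied[0] = 0xFF               # changes the copy only, not buf

print(bytes(buf[:4]))
```

This only covers flat byte data; it does not help with lists of Python objects, which is the case I was complaining about.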

As the deck says:

"Line for line these languages are fast!"

"We need better no-copy/preallocate APIs"

"Take care in data structures"

Forgive the naive question, but why not:

    l = [object()] * 100
Perhaps the difference is stack vs. heap?

That will create a list of 100 references to the same object.

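  # plain object() instances can't take attributes, so use a small class
  class Obj(object): pass
  l = [Obj()] * 100
  l[0].x = 1
  print(l[1].x)
  > 1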
Edit: On second read, it looks like you're asking something other than what I thought you were asking. Yes, you could create a list of 100 items and then replace its elements, but that's not idiomatic.

But it is idiomatic in C, which is the point of the slide. C was built around a performance-focused idiom: pre-allocate memory, then do in-place writes and swaps to mutate the buffer into the state you need. Python is built around an idiom of largely creating copies of objects and appending them to dynamically allocated lists. It's a much slower idiom.

Yes, the question was regarding this line in the original post: "I personally find it really frustrating not being able to easily pre-alloc lists in Python."

So my mind of course wandered in the direction of how to do that.
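For what it's worth, here is a rough sketch of where that leads. The `Point` class is a hypothetical stand-in for whatever objects you actually store. Multiplying a one-element list repeats the same reference, a comprehension builds distinct objects, and `[None] * n` gives you slots to fill by index:

```python
class Point(object):
    """Hypothetical small mutable object, used only for illustration."""
    def __init__(self):
        self.x = 0


n = 100

# Pitfall: the list holds n references to a single Point.
shared = [Point()] * n
shared[0].x = 1          # every slot sees this change

# Distinct objects: the comprehension calls Point() once per slot.
distinct = [Point() for _ in range(n)]
distinct[0].x = 1        # only slot 0 changes

# Preallocate the slots, then fill in place by index instead of appending.
# Repeating None is safe because None is immutable.
squares = [None] * n
for i in range(n):
    squares[i] = i * i
```

Whether the preallocated version actually beats `append` depends on the interpreter and the workload, so it is worth timing before relying on it.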

And this comment thread is about why some people don't care in some applications. HN meta!
This comment thread is isomorphic to a comment thread on Packrat parsing with comments about how "I don't parse anything I just use sexprs".