Hacker News
by blixt 3204 days ago
Cool, this is another valid way of doing it and I suppose it comes down to preference.

A few notes I'd like to mention though, that are not relevant to our toy examples, but may have implications in a production environment:

1. Unless you intend to store the list that a list comprehension produces, it's good practice to prefer generator expressions, because they don't grow in memory as the number of iterations increases. Obviously in our examples the sample size is so small that it doesn't matter, but the difference between ''.join(x for x in y) and ''.join([x for x in y]) can grow very large if y consists of many thousands of items.

Here's an example with just integers which requires over a million items before it starts becoming a concern – ultimately it depends on the memory footprint of each item:

    In [2]: %memit sum(x for x in xrange(1234567))
    peak memory: 46.87 MiB, increment: 0.00 MiB
    
    In [3]: %memit sum([x for x in xrange(1234567)])
    peak memory: 68.52 MiB, increment: 11.78 MiB
2. I still prefer the list of tuples in this situation because you control the word order (you may not want to go through the factors in ascending order). You also aren't limited to adding at the end: with lst = [x, z], lst.insert(1, y) changes the list to [x, y, z]. Finally, because it's a list, you can very easily sort it in place as a one-time cost:

    pairs = [(5, 'Buzz'), (3, 'Fizz')]
    pairs.sort(key=lambda t: t[0])  # sorts in place and returns None
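To make the ordering point concrete, here's a minimal sketch in Python 3. The names rules and label, and the third (7, 'Bazz') rule, are made up for illustration and aren't from the earlier examples:

```python
# Hypothetical FizzBuzz-style rules: (factor, word) pairs, listed in the
# order the words should appear in the output.
rules = [(3, 'Fizz'), (7, 'Bazz')]

# Place a new rule between the two existing ones instead of appending it.
rules.insert(1, (5, 'Buzz'))


def label(n, rules):
    """Join the words of every rule whose factor divides n, else return n as text."""
    return ''.join(word for factor, word in rules if n % factor == 0) or str(n)


# 105 == 3 * 5 * 7, so every rule matches and the list order is visible.
in_list_order = label(105, rules)

# list.sort() works in place and returns None, so the list needs a name.
rules.sort(key=lambda t: t[0], reverse=True)
descending = label(105, rules)
```

The list order alone decides the output order here, which is the control you'd give up by iterating over the factors in ascending order.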
Again, it's silly to talk about performance in our toy examples, but small details like these can actually have memory and execution time implications if you don't consider the differences.
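If you want to check the memory point from note 1 on Python 3 (where xrange is gone) without IPython's %memit, here's a rough sketch using the standard library's tracemalloc module. The helper name peak_bytes and the item count are my own choices, and the absolute numbers will vary by interpreter and platform:

```python
import tracemalloc


def peak_bytes(fn):
    """Run fn() and return the peak number of bytes tracemalloc saw allocated."""
    tracemalloc.start()
    try:
        fn()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


N = 200_000  # arbitrary size, large enough for the difference to show

# Generator expression: items are produced and consumed one at a time.
gen_peak = peak_bytes(lambda: sum(x for x in range(N)))

# List comprehension: the whole list exists before sum() starts.
list_peak = peak_bytes(lambda: sum([x for x in range(N)]))

print("generator peak:", gen_peak, "bytes")
print("list peak:     ", list_peak, "bytes")
```

This only counts allocations made through Python's allocator while tracing is on, so treat it as a relative comparison rather than a measurement of total process memory.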

Ultimately, the best advice is to always write code that is as readable and maintainable as possible first, and only then optimize. That goes especially for a language like Python, which was built for expressiveness rather than leading performance.

By the way, speaking of sum, I find it a bit strange that Python allows

    'hello' + 'world'
but not

    sum(['hello', 'world'])
Intuitively I would have expected the latter to be possible given the former, but I guess it comes down to how the + operator and the sum function are implemented in Python, such that, counter to my expectation, sum is not a function that "applies the + operator" to its arguments. The notion that this is how it should work stems from my impression that "sum" belongs to the same family as "map", "reduce" and "apply", and that these are somehow "functional" in nature in the sense observed in the Lisp family of languages.
I guess sum was only implemented to support numeric values. However you can easily roll your own:

    >>> def add(it):
    ...   return reduce(lambda x, y: x + y, it)
    ...
    >>> add(['hello ', 'world'])
    'hello world'
    >>> add(x for x in xrange(10))
    45
Edit: almost forgot a possibly even more Pythonic way:

    >>> import operator
    >>> def add(it):
    ...   return reduce(operator.add, it)

Ooh, I like this one. Thanks!
Looping back to why sum doesn't work with strings, it looks like they implemented it like this:

    >>> def sum(it):
    ...   return reduce(operator.add, it, 0)
Basically the first value is always added to 0. This has one important difference which can be a valid compromise if you expect to almost always sum numbers. If you try to call add([]) without the initial 0, you'll get an error because there's no way to know what the zero value is.
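Here's a small Python 3 sketch of that difference (reduce lives in functools there). add mirrors the version above, and add_from is a hypothetical variant I'm adding that takes the starting value explicitly:

```python
from functools import reduce
import operator


def add(it):
    """Fold with + and no starting value; fails on an empty iterable."""
    return reduce(operator.add, it)


def add_from(it, start):
    """Fold with +, beginning from an explicit starting value."""
    return reduce(operator.add, it, start)


try:
    add([])
    empty_without_start = "no error"
except TypeError:
    # reduce() has nothing to return when there is no item and no start.
    empty_without_start = "TypeError"

empty_with_start = add_from([], 0)                  # the start value comes back
joined = add_from(['hello ', 'world'], '')          # a str start makes + work
flattened = sum([[1, 2], [3]], [])                  # built-in sum takes a start too

try:
    sum(['hello', 'world'], '')
    builtin_on_strings = "no error"
except TypeError:
    # The built-in refuses string sums even when given a str start.
    builtin_on_strings = "TypeError"
```

So the built-in isn't limited to numbers as long as you pass a matching start value, but as far as I can tell it rejects strings on purpose and points you to ''.join instead.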

In a typed language you could use type inference and use the inferred type's zero value (if the language has such a concept for all types like for example Go does). In Python I guess you could fall back to None, but then you'd have code that doesn't behave consistently for all inputs.
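A rough Python 3 sketch of that idea: take the zero value from the type of the first item, and fall back to None when there's no item to inspect. It assumes that calling the type with no arguments gives its additive identity, which holds for int, float, str and list but is not guaranteed for arbitrary classes:

```python
from functools import reduce
import operator


def add(it):
    """Sum items with +, deriving the zero value from the first item's type."""
    it = iter(it)
    try:
        first = next(it)
    except StopIteration:
        # Nothing to inspect, so there is no type and no zero value to use.
        return None
    # Assumption: the no-argument constructor is the additive identity,
    # e.g. int() == 0, str() == '', list() == [].
    zero = type(first)()
    return reduce(operator.add, it, zero + first)
```

add([1, 2, 3]) and add(['hello ', 'world']) behave as you'd hope, but add([]) returns None instead of 0 or '', which is exactly the inconsistency: callers have to special-case the empty input.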

The notion that that is how it works probably stems from you not having flunked computer science.
Thanks for the response, I appreciated it :)