by MartinMond 1266 days ago
When I looked at Python vs Ruby many years ago, I found the opposite: Why does Python have (special) functions like len() and map(), instead of 'properly' supporting both OOP (len should just be a method on objects) and/or FP (support multi-line lambdas so I can actually use map/filter etc.)?

I never understood how this can be considered consistent at all, and those language design warts (IMHO) made me look into Ruby at the time.

Has this improved since? I know print was changed in Python3 to make it not-special.

3 comments

map() and filter() being functions means you can run them on any object that implements __iter__(). To make them methods they'd have to be added to each individual type that might want to use them. IIRC Ruby does that using mixins but that takes up a lot of room in the namespace of each type and makes it a little harder to figure out where the methods come from.
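A minimal sketch of that point, assuming a made-up `Countdown` class (not from any library): the class only defines `__iter__`, and `map()` and `filter()` work on it without the class knowing about either.

```python
class Countdown:
    """Hypothetical iterable: yields start, start-1, ..., 1."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Implementing __iter__ is all that map() and filter() require.
        return iter(range(self.start, 0, -1))


squares = list(map(lambda n: n * n, Countdown(4)))
evens = list(filter(lambda n: n % 2 == 0, Countdown(4)))
print(squares)  # [16, 9, 4, 1]
print(evens)    # [4, 2]
```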

The justification for len() is thinner: Guido van Rossum thinks it looks better, and it enforces a consistent name with a consistent meaning (you don't get length methods with different names, or with the right name but strange behavior). Under the hood it just calls obj.__len__(), so it's only the notation that's not OOP.
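To illustrate, here is a small sketch with two hypothetical classes (`Playlist` and `Broken` are invented for this example). It shows len() delegating to `__len__`, and also one way len() enforces a consistent meaning: it rejects a negative result.

```python
class Playlist:
    """Hypothetical container used only for illustration."""

    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __len__(self):
        # len(obj) delegates to this method.
        return len(self._tracks)


class Broken:
    """Hypothetical class whose __len__ misbehaves on purpose."""

    def __len__(self):
        return -1


p = Playlist(["intro", "verse", "outro"])
print(len(p))       # 3
print(p.__len__())  # 3, same result via the special method

try:
    len(Broken())
except ValueError as exc:
    # len() refuses a negative length instead of passing it through.
    print("rejected:", exc)
```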

There are no multi-line lambdas because nobody could come up with an indentation-based syntax for them that Guido van Rossum was happy with.

In short: none of them improved, all of them have reasons, some of those reasons are bad.

Those functions take any object that supports the iteration protocol. There’s no need for adding say, len, to every object that needs a len function. I’d argue that is consistent.

Sure, multi line lambdas might be nice, but equally you can define functions wherever you need them so it’s not really all that different.
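As a rough sketch of what "define functions wherever you need them" looks like in practice (the names `normalize_all` and `normalize` are made up for illustration), a nested def can stand in for a multi-line lambda:

```python
def normalize_all(words):
    # A named inner function stands in for a multi-line lambda.
    def normalize(word):
        word = word.strip()
        if not word:
            return "<empty>"
        return word.lower()

    return list(map(normalize, words))


result = normalize_all(["  Ruby ", "", "PYTHON"])
print(result)  # ['ruby', '<empty>', 'python']
```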

> There’s no need for adding say, len, to every object that needs a len function. I’d argue that is consistent.

Python's len works by calling the __len__ method, which must be added to every class that needs a len function.

Since you're already defining a method with a standard name on every class that needs it, Ruby just has you call that method directly as opposed to the absurdity of intentionally obfuscating the method name with underscores so that people use a global helper function instead.

Even C++, with all its warts and its 1980s design, got this right in the standard library with the size method.

Honestly, I find some of these defenses of Python quite humorous.

It's almost like they have become memes in themselves. "There is only one way to do it" -> except for those cases where there is a multitude, including the base choice of major release of interpreter, package manager, and so on. Python is very usable but it definitely isn't perfect and the only way you get to improve on stuff is to recognize its shortcomings.

I'd agree that __len__ is a fairly trivial example, but it is used for truthiness in Python too. It seems like a reasonable choice to mark these blessed methods that the language is using deeply as part of the runtime in a special way but I'll happily concede that it could have been X.len() instead.
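A quick sketch of the truthiness point, using a hypothetical `JobQueue` class that defines `__len__` but not `__bool__`: Python falls back to `__len__` when deciding whether the object is truthy.

```python
class JobQueue:
    """Hypothetical container: defines __len__ but no __bool__."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def __len__(self):
        return len(self._items)


q = JobQueue()
empty_truth = bool(q)   # no __bool__, so Python consults __len__ (0)
q.push("job")
filled_truth = bool(q)  # __len__ now returns 1

print(empty_truth)   # False
print(filled_truth)  # True
```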

`sorted`, `list`, and `set` are more interesting cases, since they all work with the underlying __iter__ protocol. You don't want to add X.sorted(), X.as_list(), X.as_set(), etc. as well. Again, you could have X.as_iterable() to implement these, or you could mix in sorted, which would call the __iter__ method. But honestly, it's really neither here nor there.
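For example, assuming a made-up `Tags` class that implements nothing but `__iter__`, the three builtins all accept it as-is:

```python
class Tags:
    """Hypothetical iterable; it only implements __iter__."""

    def __init__(self, *names):
        self._names = names

    def __iter__(self):
        yield from self._names


tags = Tags("ruby", "python", "ruby", "haskell")
as_sorted = sorted(tags)  # new sorted list
as_list = list(tags)      # original order
as_set = set(tags)        # duplicates removed

print(as_sorted)    # ['haskell', 'python', 'ruby', 'ruby']
print(as_list)      # ['ruby', 'python', 'ruby', 'haskell']
print(len(as_set))  # 3
```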

For the full avoidance of doubt - I'm not arguing for these other "memes" and, in particular, "There is only one way to do it" has never made that much sense to me.

I am arguing that Python's `sorted` API is reasonable, consistent and not worth worrying about.

Language design is a hard problem. You have to make so many compromises to get it to the point where it works for a large variety of use cases that the degree of ugliness is almost directly proportional to the breadth of application and adoption. The only languages that manage to stay clean are the ones that nobody uses.

I don't think that's avoidable. Mistakes made early on have a habit of compounding over time, and calcification makes it harder and harder to deal with them decisively and in a non-breaking way. Python made a couple of bad decisions, but on the whole the language came out relatively unscathed; most of the original design constraints are still satisfied. Compare that with, say, PHP or Java, which ended up very far removed from where they started out.

Case in point: Python's GIL must have seemed like a good idea at the time, a quick fix for an urgent problem. And now that quick fix is the albatross that we can't seem to get rid of.

On that we very much agree.

I think it's definitely worth considering what could / can be done better, but people often argue deeply about things that aren't really a massive deal. You've pointed out examples that are much more interesting than "should __len__() have been called len()".

Certainly makes one appreciate Guido's stewardship in keeping the language fairly clean but applicable to a large number of use cases for so long!

> There’s no need for adding say, len, to every object that needs a len function.

fwiw, Ruby does the same thing, but using the Enumerable mixin rather than a free floating function.

> I’d argue that is consistent.

I don’t understand this argument. It’s convenient and more efficient to write, yes. But how is it more consistent to have two different calling conventions?

The claim that there’s one way to do something irks me as well. There’s one way to find the length of something, but that introduces two ways of querying an object. It’s a good guiding principle, but when people use it to make stronger claims, it ends up superficial.

By which I mean that you can consistently use `len` etc on any iterable and they will always behave in the same way. You learn them on day one and they work consistently in every place you see them forever.

Map/filter are considered inferior in Python to list comprehensions.

  res = [x**2 for x in range(10) if x != 5]

Before Ruby introduced filter_map:

  res = (1..10).select { |x| x != 5 }.map { |x| x ** 2 }

With filter_map:

  res = (1..10).filter_map { |x| x ** 2 if x != 5 }

In both cases, I think the Ruby solution is more readable.

Python list comprehensions invert the subject (data) and the verb (action). You see what will be done before you see what the subject is. I would argue that showing the subject first allows easier code review as you know immediately what you are working with.

But beyond that, the first Ruby example tells you in English what is happening. "take this range", "select a subset", then "map some actions to the elements".

And the filter_map abbreviation does the same, telling you "take this range, filter it and perform an operation on the remaining elements".

Python tells you nothing... and what it does say is in awkward order.

As functional and data-oriented programming is gaining in popularity (for good reason), adopting some functional practices in Ruby is a pleasant experience. Doing the same in Python exposes more of these... irregularities.

Edit - I always forget how to format symbols in these comments!

The Python one looks fine to me, although I am a Pythonish person.

It uses what people already know: the for something in somethings syntax of the for loop, and the if syntax. Also it's nice that this works in dictionaries, generators and lists.
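To make that concrete, here is a sketch reusing the `x != 5` example from upthread (the variable names are just for illustration). The same `for ... in ... if ...` clause appears in list, set, dict and generator forms:

```python
nums = range(10)

squares_list = [x**2 for x in nums if x != 5]     # list comprehension
squares_set = {x**2 for x in nums if x != 5}      # set comprehension
squares_dict = {x: x**2 for x in nums if x != 5}  # dict comprehension
squares_gen = (x**2 for x in nums if x != 5)      # generator expression (lazy)

total = sum(squares_gen)  # consumes the generator one item at a time

print(squares_list)       # [0, 1, 4, 9, 16, 36, 49, 64, 81]
print(5 in squares_dict)  # False
print(total)              # 260
```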

It also has the same narrative flow of Haskell's list comprehensions, which I think come from set theory:

  [x^2 | x <- [0..10], x `mod` 5 /= 0]

As for your Ruby examples: I think you could argue that the filter_map version is very readable, but not necessarily more so, but the select one looks pretty painful.

> As for your Ruby examples: I think you could argue that the filter_map version is very readable, but not necessarily more so, but the select one looks pretty painful.

The select does two passes, which makes it quite inefficient. One does not even need filter_map, since the example is essentially a reduce operation.

   res = (1..10).reduce([]) { |a, x| x != 5 ? a.push(x**2) : a }

This works in Ruby 2.5.1. Probably works in 1.9 and mruby as well.

True, although that seems less readable than the comprehension versions. I might be just biased, though.

You can define pretty much everything as a reduce operation, though.
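For comparison, a rough Python counterpart of the Ruby reduce above, using `functools.reduce` (the helper name `step` is made up here). It filters and maps in one pass, and gives the same result as the comprehension over the same range:

```python
from functools import reduce


def step(acc, x):
    # Filter (skip 5) and map (square) in a single pass over the input.
    if x != 5:
        acc.append(x**2)
    return acc


res = reduce(step, range(1, 11), [])
same = [x**2 for x in range(1, 11) if x != 5]

print(res)          # [1, 4, 9, 16, 36, 49, 64, 81, 100]
print(res == same)  # True
```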
> I think the Ruby solution is more readable.

I disagree and I’ve used both professionally for about the same amount of code.

I think this is purely a personal preference but I also think there is a bias towards list comprehensions being more difficult to mentally parse.

I do a lot of contract work and have chatted with a ton of folks ranging from beginners to veterans. A lot of them (well more than half) avoid list comprehensions, especially when working with teams, because it's such a mixed bag: either people understand them instantly or they take more effort. Personally I don't use them in my code (for both reasons).

Both Ruby solutions are much more clear to me even though I have no functional programming background. I have no preference towards functional styles either, I would say it's the opposite. I struggled with Elixir long enough that I stopped using it.

I disagree about list comprehensions, as do more than half the people I’ve asked over the years.

I don’t use either, and also have an admittedly irrational dislike for Python. That said, the Python variant is more readable imo as well.

And yet, in the production Python codebases I've worked with, list comprehensions are rarely seen. Usually it's typical loop iterations. I wonder why that is?...

This says much more about you and the developers and codebases you work with than anything to do with Python list comprehensions.

Coming from a point of zero knowledge of the codebase, I picked Flask. I picked the cli.py module in Flask. And what do I find?

https://github.com/pallets/flask/blob/main/src/flask/cli.py#...

https://github.com/pallets/flask/blob/main/src/flask/cli.py#...

https://github.com/pallets/flask/blob/main/src/flask/cli.py#...

https://github.com/pallets/flask/blob/main/src/flask/cli.py#...

https://github.com/pallets/flask/blob/main/src/flask/cli.py#...

https://github.com/pallets/flask/blob/main/src/flask/cli.py#...

https://github.com/pallets/flask/blob/main/src/flask/cli.py#...

https://github.com/pallets/flask/blob/main/src/flask/cli.py#...

The keyword "for" occurs twice as often in comprehensions and generator expressions as it does in "typical loop iterations!"

Thinking further about this, how does this work with multiple loops? E.g.

  [x + y for x in range(10) for y in range(5)]
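On the Python side, the `for` clauses of a comprehension read left to right in the same order as the equivalent nested loops. A small sketch (the names `pairs` and `expanded` are just for illustration):

```python
pairs = [x + y for x in range(10) for y in range(5)]

# The same thing written as explicit nested loops:
expanded = []
for x in range(10):      # first `for` clause is the outer loop
    for y in range(5):   # second `for` clause is the inner loop
        expanded.append(x + y)

print(pairs == expanded)  # True
print(len(pairs))         # 50
```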
This may be a nitpick but Ruby’s naming seems inconsistent. If `filter_map` combines `select` and `map`, why is it not called `select_map`?
> In both cases, I think the Ruby solution is more readable.

Nope.

By who?

I always have to stare at list comprehensions very closely to understand the operation being done. The source of the data is in the middle, where it should logically come first. The filter is at the end, where logically it should come after the source. The mapping is at the start, where logically it should come at the end.

I find the monadic, additive style of Ruby much easier to understand:

    10.times.select { |x| x != 5 }.map { |x| x ** 2 }

IMO it's more composable. What if you want to exclude even numbers from the result? Just add another filter:

    10.times.select { |x| x != 5 }.map { |x| x ** 2 }.select { |x| x % 2 != 0 }

Incrementally building up a streaming computation this way is much more useful to me than a list comprehension.

For example, you can add a lazy to the stream to avoid performing all the operations eagerly, and now you have a way to process sequences without blowing out your memory.

> By who?

By the Python developers and its wider community. As Python doesn't have anonymous function blocks in the same way as Ruby (only lambda expressions), tutorials, lessons and the Python docs steer users toward list comprehensions instead.

I'm not saying the Ruby syntax is not elegant (it is), I'm saying in Python list comprehensions are recommended over filter/map functions.

On the composable front, personally I prefer to break these down into smaller chunks with descriptive variable names rather than chaining.

Python also has the sister "generator" syntax (() rather than []), which keeps things efficient because it pipelines the whole sequence of generators (lazily rather than eagerly, as you say).
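A minimal sketch of that laziness, assuming an infinite source from `itertools.count` (the variable names are illustrative). Nothing is computed until `islice` asks for items, so the unbounded input is not a problem:

```python
from itertools import count, islice

naturals = count(1)                          # infinite: 1, 2, 3, ...
not_five = (x for x in naturals if x != 5)   # lazy filter
squares = (x**2 for x in not_five)           # lazy map

# Only six items are ever pulled through the pipeline.
first_six = list(islice(squares, 6))
print(first_six)  # [1, 4, 9, 16, 36, 49]
```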

You're effectively saying something like: Canadians prefer Canada. What about the rest of the world?

Once you start adding more "and" to the if-statement in the list comprehension, it becomes a mess. Breaking them down to smaller chunks is required because comprehensions are messy. You are doing smaller chunks due to a shortcoming of comprehensions. Chaining is a nice option to have, especially when the chained functions are straightforward.

> Map/filter are considered inferior in Python to list comprehensions.

> By who?

> By the Python developers and its wider community.

> You're effectively saying something like: Canadians prefer Canada. What about the rest of the world?

They're actually saying that Python developers prefer one particular way of doing something rather than a different particular way of doing the same thing. You're suggesting that they're saying Python developers (Canadians) prefer Python (Canada).

I don't mean to speak against your broader points, just that this specific call out is mistaken.

That's an utter mess of control flow right here. Read it left to right? Wrong. Right to left? Wrong.

Looking at it makes me miss Perl oneliners

Map, filter and similar can be chained together and composed much better than comprehensions, I think. I realize that it's probably not your opinion that comprehensions are superior (although it might be), but rather the general Python style.

I also think that map and filter style computations can be much more powerful, there are quite a few things other than just map and filter, like count, take, skip, find, flatten, fold, map-while and quite a few more!

In Python I guess you are supposed to use a standard for loop to do these things instead.
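For reference, several of the operations listed above have rough standard-library counterparts in `itertools` and `functools`. This is only a sketch of the closest equivalents I'm aware of, with made-up sample data, not a claim that they compose as fluently as method chains:

```python
from functools import reduce
from itertools import chain, islice, takewhile

data = [3, 1, 4, 1, 5, 9, 2, 6]

take = list(islice(data, 3))                         # first three items
skip = list(islice(data, 3, None))                   # everything after the first three
find = next((x for x in data if x > 4), None)        # first match, or None
flatten = list(chain.from_iterable([[1, 2], [3]]))   # one level of flattening
fold = reduce(lambda acc, x: acc + x, data, 0)       # left fold (here: a sum)
take_while = list(takewhile(lambda x: x < 5, data))  # stop at the first failure
count = sum(1 for x in data if x == 1)               # count matching items

print(take, skip, find)   # [3, 1, 4] [1, 5, 9, 2, 6] 5
print(flatten, fold)      # [1, 2, 3] 31
print(take_while, count)  # [3, 1, 4, 1] 2
```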

No. You are supposed to use intermediate variables that contain generators, and write short named functions to call within generator expressions, instead of writing big run-on chains and compositions with little bits of anonymous logic floating around in it.

The = binding is the chaining/composition tool of choice. This is why generator expressions are so important, relative to list and dict comprehensions. They both defer the allocation of space for intermediate values and allow the space to be bounded no matter what the input length is.

Often the “reduce” step, or even more commonly, realizing the side effects of such a generator composed of generators is a simple for loop—because that’s the most readable way to walk through it.
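A sketch of the style described above, mirroring the Ruby chain from earlier in the thread (skip 5, square, drop even results). The helper names `is_wanted` and `square` are invented for illustration:

```python
def is_wanted(x):
    return x != 5


def square(x):
    return x**2


numbers = range(10)

# Each intermediate name is a lazy generator; no intermediate lists are built.
wanted = (x for x in numbers if is_wanted(x))
squared = (square(x) for x in wanted)
odd_squares = (s for s in squared if s % 2 != 0)

# The "reduce" step written as a plain for loop.
total = 0
for s in odd_squares:
    total += s

print(total)  # 140
```

Because each stage is a generator, only one value is in flight at a time regardless of how long the input is.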