Hacker News
by cookiecaper 3752 days ago
For me, readability is about how intuitive something is, how much can be understood with basic knowledge of programming in general and perhaps a quick primer on the specific language. I think this is the best measurement because it emphasizes a reliance on the most commonly used, general concepts, and encourages people to use those unless there's a good reason not to. A side benefit is that the more basic language primitives and flow controls tend to be more performant, better tested, and have fewer weird edge cases where they behave in an unexpected manner.

I dislike your first example because

a) IMO, the lack of spacing makes it harder to see what's going on. It's more obvious that something is being iterated in a for loop with a new indentation level than in the map function.

b) it depends on the Python-specific implementation of lambda. The behavior of a lambda varies substantially from language to language and lambdas are used rarely enough that it's pretty likely someone who doesn't spend all day every day in Python is going to have to go back and look up the specific behavior. The syntax is much less clear than a full function definition.

In my opinion, it's much easier to read code like:

    def capitalize(str):
        return "{}{}".format(str[0].upper(), str[1:])

    for word in long_string.split(' '):
        captizalize(word)
    ...
This makes it much more obvious what's going on, and it should be readable to anyone with a passing knowledge of Python, and possibly anyone with a knowledge of programming languages in general. Invoking map and lambda in this case only makes the intent of the program more obscure.

I'm not saying that map or lambda are never appropriate to use; sometimes they are. But I don't think it's wise to use them when more basic, universal language constructs do an equally adequate job, especially if the only benefit is "fewer lines of code".

The second example is just a list comprehension form of my original example, which IMO is less readable for much the same reasons. If you're not super familiar, you'll need to go back and look up list comprehensions. There is no spacing to make it obvious that something particular is being iterated or branched.
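The two examples being replied to aren't quoted in this comment, so the following is only a hypothetical sketch of the three styles under discussion (map with lambda, a list comprehension, and a plain for loop), not the original code. The value of `long_string` and the names `via_map`, `via_comprehension`, and `via_loop` are made up for illustration.

```python
long_string = "the quick brown fox"  # hypothetical sample input


def capitalize(word):
    # Upper-case the first character and leave the rest untouched.
    return "{}{}".format(word[0].upper(), word[1:])


# Style 1: map with an inline lambda.
via_map = list(map(lambda w: w[0].upper() + w[1:], long_string.split(' ')))

# Style 2: list comprehension calling the named function.
via_comprehension = [capitalize(w) for w in long_string.split(' ')]

# Style 3: plain for loop that collects each result.
via_loop = []
for word in long_string.split(' '):
    via_loop.append(capitalize(word))
```

All three build the same list; the disagreement here is only about which form reads best.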

I understand that ultimately, ease of reading comes down to what style one is most familiar with, which makes it subjective, as you said. But I think there is a stronger rational basis for always preferring the simplest construct that adequately performs the function, which is that in the general case, there is less need to refer back to docs, less possibility of unexpected behavior, and less possibility of strange performance issues.

Not having touched Python for many years, four things jump out at me:

1) .format, which isn't immediately familiar and doesn't resemble similar string formatters in other languages

2) [1:], which I believe is a string slicing syntax that doesn't resemble similar syntax in other languages

3) Bug: nothing is done with the capitalized words, since the return value is thrown away

4) Bug: the name of the function was misspelled when used.

The last one may seem like a nitpick, but it is true that when you add a name to your code for the sake of clarity, you also take on the additional burden of ensuring the name is used consistently and accurately everywhere. This can be a particular pain in cases where you are generating a lot of uninteresting temporary values -- which is precisely why people end up writing chained function or method calls.
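As a rough illustration of that tradeoff (not code from the thread; `text` and every variable name below are made up), here is the same small transformation written once with named temporaries and once as a chain. Note that the built-in `str.capitalize()` also lower-cases the rest of each word, which is harmless for this all-lowercase input.

```python
text = "  hello big world  "  # hypothetical sample input

# Named temporaries: each step is labeled, but each name must be kept consistent.
stripped = text.strip()
words = stripped.split(' ')
capitalized = [w.capitalize() for w in words]
result_named = ' '.join(capitalized)

# Chained calls: no temporary names to misspell, but denser to read.
result_chained = ' '.join(w.capitalize() for w in text.strip().split(' '))
```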

Point 1 is conceded in my earlier response. You're right that someone who is unfamiliar with Python is going to be stopped by the format syntax. However, the basics are simple to learn, the syntax is pervasive in Python code, and it's one of the safest ways to interpolate string data, so its use is justified.

Point 2, this type of string slicing syntax is pretty common in languages similar to Python. A very similar syntax exists in Ruby and Perl. See https://en.wikipedia.org/wiki/Array_slicing for more examples of slice shorthand in the wild.

Point 3. I didn't intend to rewrite the whole thing again, just enough to demonstrate my problems with the reply's lambda-based approach. This is indicated by the ellipsis. If I were writing a full version rather than a quick demonstration of the clearer full function definition, then yes, I would've assigned the output of the function to something.

Point 4. Conceded.
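For reference, a full version of the loop-based example with both bugs addressed might look like the sketch below. This is an assumption about what a complete version would do, not code posted in the thread: the wrapper `capitalize_words`, the parameter name `word` (used to avoid shadowing the built-in `str`), and the empty-string guard are all additions.

```python
def capitalize(word):
    # Upper-case the first character and leave the rest untouched.
    return "{}{}".format(word[0].upper(), word[1:])


def capitalize_words(long_string):
    # Hypothetical wrapper: collect the results instead of discarding them.
    capitalized = []
    for word in long_string.split(' '):
        if word:  # consecutive spaces yield empty strings; word[0] would fail on them
            capitalized.append(capitalize(word))
    return capitalized
```

Calling `capitalize_words("hello big world")` returns `['Hello', 'Big', 'World']`.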

OK, but that doesn't seem much different from "code written in Python should be written in a style familiar to Python developers", which is a good principle regardless of the language. In Ruby, the use of Enumerable methods is generally preferred; it would be as "surprising" to use a for loop in this case as it would be to use map in Python.

I also don't see a good argument that string slicing, `for ... in` and Python-style string formatters are more "universal" than lambdas and map/filter/reduce. All of them exist in large subsets of commonly used programming languages; all of them are used heavily in some languages and rarely in others; only one of them (in the form of sprintf) exists in C.

> OK, but that doesn't seem much different from "code written in Python should be written in a style familiar to Python developers", which is a good principle regardless of the language. In Ruby, the use of Enumerable methods is generally preferred; it would be as "surprising" to use a for loop in this case as it would be to use map in Python.

Yes, ultimately, it's a judgment call about whether language-specific caveats and conventions are beneficial enough to justify their introduction.

I will, however, state that I hate that Ruby uses .each instead of for. I do use .each when I write Ruby because, as you stated, most Ruby devs will look at you sideways if you use a real for loop, but I really dislike that it's become that way. That's an example of something that's different just for difference's sake; any benefit derived from it is marginal and corner-case (something like wanting to override the standard Enumerable behavior), and it makes the whole thing more insular (meaning the behavior is difficult to generalize or extrapolate beyond Ruby, you have to look up the specific behavior), less friendly (meaning it draws attention to itself and takes away productive time; when your code style does this, it needs good justification), and harder to read (because of the two preceding points). IMO, that's a tradeoff that wasn't worthwhile.

> I also don't see a good argument that string slicing, `for ... in` and Python-style string formatters are more "universal" than lambdas and map/filter/reduce. All of them exist in large subsets of commonly used programming languages; all of them are used heavily in some languages and rarely in others; only one of them (in the form of sprintf) exists in C.

Again, it's about being as minimally disruptive as possible to the typical programmer who would be reading the code. While map/filter/reduce may exist in some form in most languages, they're not very commonly used by programmers with an imperative background. Languages like Python have formatting rules that make map/filter/reduce calls harder to read than language-construct counterparts like for. Most other imperative languages don't enforce such rules, so a programmer could format the code so that map's iteration is as visually obvious as a for loop's typical indentation and/or bracing, but it'd be difficult and unusual to maintain that formatting.

String slicing is a very common need and most languages provide a simple way to perform it, whether it's the slicing shorthand or substr() calls. You're right that C doesn't provide tools for this, but C doesn't even recognize the string as a thing; you just work with groups of chars. The programming community has clearly repudiated that philosophy and demonstrated that it expects its languages to do most of the string dirty work directly. Same goes for string formatters, even though, as I've stated 3 times now, Python's is unfortunately a bit anomalous for not much benefit. This is improved with PEP 498.
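To make the comparison concrete, here is a small sketch of the slicing shorthand next to three Python formatting styles: printf-style `%` (the closest to C's sprintf), `str.format`, and the f-strings introduced by PEP 498 (Python 3.6+). The sample value of `word` and the variable names are made up for illustration.

```python
word = "hello"  # hypothetical sample input

# Slicing shorthand: the first character, then everything after it.
first, rest = word[0], word[1:]

# printf-style formatting, closest to C's sprintf.
percent_style = "%s%s" % (first.upper(), rest)

# str.format, as used in the example upthread.
format_style = "{}{}".format(first.upper(), rest)

# f-string (PEP 498, Python 3.6+).
fstring_style = f"{first.upper()}{rest}"
```

All three produce the same string, `"Hello"`.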

"for in" is nearly self-evident and easy to remember. While this may or may not differ slightly from the specific syntax used in other Python-style languages, it's fairly obvious to someone who is familiar with that language class, and easy enough for someone who has only been exposed to C to pick up and remember after reading about it once. There also isn't really a more or equally obvious way to express this in Python.

I accept that there are some classes of languages where this is not the case, primarily functional languages. In those cases, the most simple, universally-applicable approach for that language family should be used.