I'd suggest an alternate approach: add enough commentary to your one-liners so that a reader can parse that instead of the statement. Bugs can hide in the expanded versions of code just as easily. By offering the reader of the code a clear summary of the next one-liner, they can quickly scan it if they are trying to find a bug nearby and decide if it is potentially the culprit.
On top of that, consider making every bit of clever code a reusable library snippet instead. If it becomes part of the vocabulary of your codebase it'll be easier for others to understand it by looking at call sites.
eg:
# Capitalize the first letter of every word
@sentence = @sentence.split(' ').map!{|x| x = x[0..0].upcase << x[1..-1]}.join(' ')
I agree. Someone's "darling" code might be a truly innovative or unique way of solving a problem, and providing enough commentary should help avoid the problems the author warns about.
Code that's clear enough not to need a comment is better than code that has a comment. Comments can be outdated or mistaken and they consume screen real estate.
I'd wrap it in a function called capitalizeWords and be done for the day. I also had 0 problem understanding the meaning of this line. If your code base is consistently made of small functions that look like this I actually don't see a major readability issue. (The article remains very valid)
I haven't written a single line of Ruby in my life and I guessed that it would do that. While I 100% agree to make code readable, maybe this darling was a bad example. And yes, a comment would help with the problem of not understanding it.
`map!`, `<<`, and `0..0` are all unnecessary and distracting here. The availability of `capitalize` aside, `map`, `+`, and `0` would be better choices.
Maybe I'm still just showing off, but I don't even use Ruby that often and it's pretty clear what that code does. Split the sentence into words, capitalize the first letter of each word, and re-join the words into a single string.
I don't know Ruby and reached the same conclusion in about 15 seconds. Could be nice to make it a function [or name the results] "capitalizeWords(text)". Possibly adding indentation could help:
Same. Don't know Ruby at all, but in this case the language is pretty straightforward. Perhaps Ruby was the wrong example. Let's get more magical with Perl:
map s {(\w+)} {\u$1}g, @sentence;
I'll give a dollar to anyone who already knows Perl and can tell me what the fuck is going on there.
The number of people here defending long 'darlings' (or over-generalizing 'long one-liners'), is both shocking and dimsaying to me.
Beyond basic functionality one of any (serious) programmer's _top_ priorities is to maximize the readability of their code for the next (unkown) programmer that touches the source.
We all have large monitors -often rotated 90degrees to portrait. Extra lines are not a bad thing (especially if it helps visually break things into multiple smaller steps), and more work on a single line does not == elegance.
I consider the priority to be to help maximize the productivity of the next (competent) programmer that touches the source.
Optimizing my source for incompetent programmers isn't something I care about.
Additionally, optimizing the source to aid understanding for competent programmers is only something I care about to the extent that it doesn't otherwise impair their productivity.
Code spread over five lines instead of one might be easier to read, but not necessarily easier to edit.
Code should be written to be easy to edit first, and comments used to patch up where that conflicts with making it easy to understand.
We also have nice wide monitors. Tons of short lines with unnecessary names tying them together is usually less clear though, and less robust. (Not that there are only two solutions either.)
Target semi-skilled developers and up, format the code for domain readability and ignore rules that get in the way.
The threshold for 'darling' depends upon the facility of the developer. Some of these are obvious to those with extensive experience.
So where does it end? Dumbing down code until its so wordy, so bland, that it takes minutes to digest every phrase? To rewrite it in your minds eye until its back to that 'darling' concise expression. I recall taking a contractors 2 pages of code, and reducing it to a line. Not even a very complex line. In my view that was all improvement.
It's about clarity, if you reduced all those lines to a single understandable line of code, that's great, if you obscured logic by using a one-liner, I'll say it was not an improvement, but it is indeed a thin line
In this thread are literally 300 comments saying "Huh, I don't even Ruby, and I understood it", thereby (in my opinion) completely proving the point.
It's exactly because people pull this sort of thing "Hey, it was really easy to understand for me, how about you?" that I have seen developers feel compelled to put clever oneliners in codebases. Clever oneliners that later end up causing problems for whatever unluckly newbie has to troubleshoot their edge cases.
Just stop it, and unless you're on a word budget or this drastically improves performance, do something sensible that looks like the pseudocode:
for word in sentence.split(' '):
word.capitalize_first_letter
(If you want to be all functional, by all means use a map instead.)
So I don't have to deal with the case where your clever function barfs on (hypothetical example, don't think this happens with the current code) the special case where there is a space between the last word and a period that ends the sentence; and you're trying to capitalize the period. [Edit: As expected, looks like this code fails with even common graphemes. Which is fine, but is an argument towards at least trying to make it more comprehensible.]
Oooooor, its not a 'clever one-liner'. Its just a piece of code. Which, in its native environment e.g. Ruby is what you're expected to understand to be a journeyman of the trade.
The code is not written for newbies. It never will be. That's why they're 'newbies' and not 'professionals'.
As I said, until it hits some kind of edge case. Again, I'm not trying very hard here, but this oneliner doesn't give the expected output if your sentence has a string like "æsir".
irb> @sentence = "\u00e6sir are gods"
=> "æsir are gods"
irb> @sentence = @sentence.split(' ').map!{|x| x = x[0..0].upcase << x[1..-1]}.join(' ')
=> "æsir Are Gods"
(Expected output being "Æsir Are Gods")
If you want to understand why this is failing, the code I gave above will make it way simpler. Of course, viewpoints differ – you might claim that troubleshooting it is only a job for "professionals".
Its all a matter of what gets the algorithm into your head most clearly. If extra names and steps do it for you, then cool. But at some point extra names get in the way - they can suggest the wrong thing (what was intended but not what is actually happening). That's how troubleshooting goes astray.
Code should be written in the easiest to understand form that both works and provides the necessary performance characteristics. Often that is the form that "newbies" also find easy to understand, for obvious reasons.
I've used Ruby somewhat actively for almost 10 years now and I still hate it when people barf out a bunch of Ruby-specific shorthand and think they're clever for doing so. Write it in the simplest form that works. I don't care if it takes 10 more keystrokes.
I've seen people write fizzbuzz with lambdas and such thinking that it would signal they're super-serious Ruby all-stars, but all it really does is let me know I should keep my distance, because they're going to be writing needlessly convoluted code.
I am sympathetic to the author's argument. Programmers should strive to write understandable code. However, I think you should also assume that people reading your code are fluent in the language and its idioms. You wouldn't expect a novelist to write books at a first grade reading level just to appease people that haven't put forth the effort to learn the language beyond just the basics.
Long long ago when I encountered the string copy idiom in C 'while (s++ = t++)' for the first time, it left me scratching my head for a bit. Then it becomes a part of your vocabulary.
Yes, that is almost exactly how I would rewrite it. The original doesn't communicate its intent at all. It makes one itching to rewrite it but, it takes some time to take all the "smartness" into account.
The original line raises surprisingly many questions:
- The default for `split` is to split on whitespace. Is it the intent of the author to only split on spaces? (I guess so) What about tabs?
- Why is the author using #map! (with exclamation mark)? I have to admit this put me on the wrong foot for a while. My best guess right now is that the reason is speed; Mutating the array that split produced is (slightly) faster than creating a new one.
- Why is the capitalized string assigned to x again? I can think of no good reason at all. Am I missing something?
- Why is the author using x[0..0] instead of x[0] or x.first? I know why: The difference is that the latter two return nil if x is the empty string; but it's far from obvious and one could easily break the code by trying to improve this.
- Why are the parts of x concatenated with << instead of with +? For speed? Is the author not aware of String#capitalize ?
One-liners are fine if you know you'll never need to extend the logic later on. Problem is, how often does that stay true?
For instance your 'titleize' method, in order to properly title case any string, ought to support a second option of words to ignore, such as "a, an, the, ...". If you wrote it as a one-liner at first, then you've got to go into your mapper and add conditionals if an ignore list is passed, and you've got more work than if you left it as a more expanded piece of logic.
Again, this is somewhat of a trivial example, but building on the original post's point of "how easily can I understand this later on" can also include "how easily can I extend this later on". Avoiding one-liner cleverness can help as a general principle.
pretty experienced rubyist here, would use your version instead of what's in the blog unless benchmarking indicated a penalty that was going to hurt in some egregious way.
I've found I often prefer reading and writing the actual guts of the inlined function, over naming it something semantically significant. It's often difficult to craft a good English name for the function. It can inhibit understanding to have to change context to learn what a function does by looking at its code (OtherModule.capitalizeEveryWordInThe). There can be small semantic variations (capitalizeEveryWordInThe, capitalizeEveryWordButFirstInThe) requiring an exploding util package.
Edit. I'm saying I find the one-liner form frequently more understandable than one embodied in a function.
Yeah, I guess it depends on how many instructions are in that one liner and how intelligible it is three years from now.
The expressiveness of the language also helps a lot.
a.map { x in x * 3 } is probably intelligible enough (in Swift) so no need for util.multiplyEveryElementBy3(a)..
There are examples of code which is unnecessarily complex, or difficult to read, but this is not a very good example. I'll join the chorus of those who say they don't even know Ruby and yet they could easily figure out what the example was doing.
The most you could could say about this example, IMHO, is that it could be made more readable by breaking it up into more than one line.
The last thing we need is more people writing C code in dynamic languages that have much better and more concise programming constructs. If a programmer does not wish to learn how to use these constructs, that's their business. But I intend to use all the power available to me in whatever language I am programming in -- when it is appropriate, of course.
Just because some programmers don't understand how a particular language construct works, that does not make its use the product of a self-absorbed hacker who thinks he has "superpowers".
Yes, we should all guard against writing code that is difficult to read, but I think such code is rarely caused by the language features we use (although some languages make that easier than others -- I'm looking at you, C++). In fact, you could argue that it is more commonly caused by the features we don't use. Unnecessary verbosity is often the product of someone who is new to a particular programming language, and I think most of us are happy to discover shorter, simpler, clearer ways of expressing things.
In fairness to the author of the article, I think there is a point when you can chain a few too many things together without any intertwined temporary variables, separate function calls, or at least comments, and the reader of your code can get lost. However, in my mind, the code example given hasn't quite reached that threshold yet.
I'm all for clever code when playing around but there are limits to its usefulness. When its over the top, you're not doing yourself or anyone any favors. If something took you a while to come up with, and on top of that its not commented out, its going to annoy the next person, and maybe even yourself when you get back to it 6 months down the road. Well, either annoy or stroke your ego.. Neither contribute much to the quality of the end product.
Here's a quote from Brian Kernighan: "Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?"
I've come to appreciate simplicity especially when spending a lot of time dealing with other people's code. If you spend most of your time in other peoples code, simplicity will save you endless amounts of time.
Having said that, it's much more likely that a "clever one-liner" would be
@sentence.gsub!(/\b\w/){|x|x.upcase}
Code like the article's would generally not be the result of cleverness, but of code that has organically mutated into something silly you'd never have written from scratch that way.
In fact, I actually coined a code metric for this - Jones Complexity, which is the mean operations-per-line (My thesis is that "good" code has a Jones Complexity of less than 8.) For some reason, it looks like code density of this type wasn't ever really considered before. I value readability and maintainability of code even more than I value efficiency and speed (in most cases.)
I do it because it reduces physical code size and helps me "chunk" code in my brain. Since I wrote it, I know what I intended it to do and how it does it, possibly with the aid of a small comment.
This helps me have a more compact complete representation of the surrounding code, both in my editor and in my brain.
Is it bad for verifying correctness and allowing others to understand the code? Sure. But there is absolutely a well-considered reason why I do it, and it's not to show off.
Any code whose readability depends on information stored only in your brain is bad code, sorry. Perhaps there are times when bad code is excusable, but there is no argument that such code is high-quality. The best code expresses an idea in the simplest way possible, which is not synonymous with the fewest number of lines possible.
Adding to this, I've found from profiling a lot of Python 2.7 code that a lot of those little cute one liner list comprehension expressions are slower than the expanded for loop forms. And also the expanded loops are a lot easier to change after the fact, for example if you want to do two operations on an item in your loop, you then have to expand it out anyway.
As no less authority than Brian Kernighan wrote, in 1974, "Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?"
These days I seem to like relating programming to music, so here I go again...
I think virtuosic displays like this are important for the ego of the creators at times, and should be commended as it represents a deep understanding of the craft and leads to even deeper and more interesting and perhaps profitable explorations later on.
But as in music, virtuoso's often, as they mature, realize that they work in ensembles, and that requires a tempering of their prodigious abilities in order for the "whole" to be its best.
In other words, its good to show off, but in the end the true master knows how to temper it for the greater good.
There are a lot of people asserting that, while you should be writing (in part) for readability, you should be able to assume some level of competence on the part of those who follow. I would say that is probably the case but really consideration should be given to who your audience is actually likely to be given the circumstances in which you write.
It is also the case that we often overestimate how understandable things are to others - I write above discussing goals before any compensation for this is applied.
There was an HN discussion quite a while ago on a one-line Game of Life in APL. Much more difficult to understand than using any currently popular language.
If it takes more cleverness to debug code than to write it, and one writes code that is at the limits of one's own cleverness, then clearly one will not be able to debug it.
Yesterday, I was complaining that Rust encourages this sort of thing. In Rust, it comes from the error handling, though. You have to write lots of
x().and_then(|foo| exp).and_then(|foo| exp)
to handle errors. The imperative form requires a match statement after every function call that can return an error. Or "try!()", which bails with a return. It looks like we just have to get used to this Haskell-like style.
At the very least I'll agree that terseness is not always a virtue, and the time saved in keystrokes or formatting now may be easily lost later when someone tries to figure things out. That said, not all statements of this form are unreadable or incomprehensible, and in some cases there may actually be performance implications as well. As always it is a matter of judgement.
I actually love such code, as long as the vars or funcs are named capitalizeWords.
This type of code is the difference between long term investment/cure vs short term gain/steroid.
I cannot imagine a boiler plate 'for loop' with assigning and re-assigning and doing all sorts of convoluted branching.
Great post. This is one of my biggest pet peeves with Ruby and languages like it; they encourage developers to "show off" by using the most esoteric features they can find (even better if these features use weird symbols).
Compared to a language like Python that has few neuroses, the same developer writes much less readable code.
The post may have been better if the author included a Python sample that does the same thing:
long_string = 'ajix mxozl xoap'
new_string = []
for word in long_string.split(' '):
new_string.append("{}{}".format(word[0].upper(), word[1:]))
new_string = ''.join(new_string)
Admittedly there are a few annoying symbols in here, mostly due to the non-intuitive operation of the join function and the well-meaning but less-than-ideal string formatting syntax (which could've been avoided if we used conventional strong concatenation, and which PEP 0498 attempts to improve). Also admittedly, coders who want to show how smart they are would try to use a list comprehension to do this, which is slightly less readable and imo shouldn't be used over this format without a good reason. But it's still much easier to parse than the Ruby version because everything is explicitly spelled out in the loop, and you generally shouldn't have to consult the language docs to look up 4 different rarely-seen operators.
Remember, debugging is twice as hard as authoring, so if you write the most clever code possible, you are by definition not intelligent enough to debug it. ;)
I understand that people can't be expected to stick to that on their own, so they need languages that promote function, uniformity, and ease of use over showmanship.
The simplicity of the code is one of the main things I look for in interviews. If you could've done something with the conventional, simple language construct that doesn't require someone to go back and refer to the docs, even if it takes more lines, but instead you used the super-arcane construct to prove how well you know the language or to "get it all on one line", I'm going to look on that pretty dubiously. The last thing I want to deal with on my projects is the residue of someone's ego impacting our ability to read their code and get things done quickly and easily.
P.S., in Python, there are a couple of shortcut functions for capitalizing each word in a sentence: str.title() and str.capwords().
new_string = ' '.join(["{}{}".format(word[0].upper(), word[1:]) for word in long_string.split(' ')])
is much clearer and simpler to me, because it explains the intention of the code: split the string into words, map each word to it's capitalized version, then join it again. With the loop, I have to mentally "run" the code to realize that the for loop and append are to go through the split version and add the converted words to a new list.
Obviously, this is a pretty subjective topic, and which version you prefer will depend what you're familiar with/the languages you use.
For me, readability is about how intuitive something is, how much can be understood with basic knowledge of programming in general and perhaps a quick primer on the specific language. I think this is the best measurement because it emphasizes a reliance on the most commonly used, general concepts, and encourages people to use those unless there's a good reason not to. A side benefit is that the more basic language primitives and flow controls tend to be more performant, better tested, and have fewer weird edge cases where they behave in an unexpected manner.
I dislike your first example because
a) imo, the lack of spacing makes it harder to see what's going on. it's more obvious that something is being iterated in a for loop with a new indentation level than in the map function.
b) it depends on the Python-specific implementation of lambda. The behavior of a lambda varies substantially from language to language and lambdas are used rarely enough that it's pretty likely someone who doesn't spend all day every day in Python is going to have to go back and look up the specific behavior. The syntax is much less clear than a full function definition.
In my opinion, it's much easier to read code like:
def capitalize(str):
return "{}{}".format(str[0].upper(), str[1:])
for word in long_string.split(' '):
captizalize(word)
...
This makes it much more obvious what's going on, and it should be readable to anyone with a passing knowledge of Python, and possibly anyone with a knowledge of programming languages in general. Invocation of map and lambda in this case only make the intent of the program more obscure.
I'm not saying that map or lambda are never appropriate to use; sometimes they are. But I don't think it's wise to use them when more basic, universal language constructs do an equally adequate job, especially if the only benefit is "fewer lines of code".
The second example is just a list comprehension form of my original example, which IMO is less readable for much the same reasons. If you're not super familiar, you'll need to go back and look up list comprehensions. There is no spacing to make it obvious that something particular is being iterated or branched.
I understand that ultimately, ease of reading comes down to what style one is most familiar with, which makes it subjective, as you said. But I think there is a stronger rational basis for always preferring the simplest construct that adequately performs the function, which is that in the general case, there is less need to refer back to docs, less possibility of unexpected behavior, and less possibility of strange performance issues.
Not having touched Python for many years, four things jump out at me:
1) .format, which isn't immediately familiar and doesn't resemble similar string formatters in other languages
2) [1:], which I believe is a string slicing syntax that doesn't resemble similar syntax in other languages
3) Bug: nothing is done with the capitalized words, since the return value is thrown away
4) Bug: the name of the function was misspelled when used.
The last one may seem like a nitpick, but it is true that when you add a name to your code for the sake of clarity, you also take on the additional burden of ensuring the name is used consistently and accurately everywhere. This can be a particular pain in cases where you are generating a lot of uninteresting temporary values -- which is precisely why people end up writing chained function or method calls.
Point 1 is conceded in my earlier response. You're right that someone who is unfamiliar with Python is going to be stopped by the format syntax. However, it's simple to acclimate and learn the basics of it and is something that is very pervasive in Python code and is one of the safest ways to interpolate string data, so its use is justified.
Point 2, this type of string slicing syntax is pretty common in languages similar to Python. A very similar syntax exists in Ruby and Perl. See https://en.wikipedia.org/wiki/Array_slicing for more examples of slice shorthand in the wild.
Point 3. I didn't intend to rewrite the whole thing again, just enough to demonstrate my problems with the reply's lambda-based approach. This is indicated by the ellipsis. If I were to create a full version instead of the quick demonstration of the more-clear full function definition here, then yes, I would've assigned the output of the function to something.
OK, but that doesn't seem much different from "code written in Python should be written in a style familiar to Python developers", which is a good principle regardless of the language. In Ruby, the use of Enumerable methods is generally preferred; it would be as "surprising" to use a for loop in this case as it would be to use map in Python.
I also don't see a good argument that string slicing, `for ... in` and Python-style string formatters are more "universal" than lambdas and map/filter/reduce. All of them exist in large subsets of commonly used programming languages; all of them are used heavily in some languages and rarely in others; only one of them (in the form of sprintf) exists in C.
I don't even know Ruby (well I did play with it briefly years ago for a week) and I didn't have any trouble parsing what @sentence was doing. There is undoubtedly much tightly packed cleverness in code but I don't think this is it.
I don't mind the implementation of this code, but 99% likely, it's in a method called `capitalize_sentence` or similar. If that is strewn through the middle of your code, it's probably breaking single-responsibility-principle.
Every discussion of code style eventually devolves to a discussion about readability, which eventually devolves to a knife fight in a conference room because one man's readable expression is another man's opaque one-liner.
I think there's an objective way to resolve a good chunk of this, which is language exclusivity. If the code expresses an idea in a way that a programmer familiar with a similar language can understand without having to refer to the docs, it's great. If the code uses language-specific operators or functions, there needs to be a reasonable justification for it.
This basically means that these keywords and operations are always OK:
* function definitions
* if conditionals
* full for loops
* full while loops
* return statements
and these things should only be used when it would be unreasonably intrusive or problematic to perform the same operation using only the above flow control mechanisms:
* classes
* lambdas/closures
* language-specific shorthands like list comprehensions
* language-specific operator aliases, like << for .append in Ruby
* built-in functions that are likely to behave substantially differently from the intuitive expectation
The goal is to keep the program as simple and straightforward as possible. Of course you're going to need some of the things in the second category to do that sometimes. But my opinion is that one should start with the basic building blocks, and if one finds that it will save a lot of effort and time in both reading and writing the code to use one of the secondary things, only then should those things be deployed.
For example, if you're using a lambda, you should have a good explanation for why a function definition wouldn't have worked. "Lambdas take one line and defs take two" isn't a good reason.
What about languages like Haskell that don't have for loops, while loops, or return statements? And anonymous functions exist in most languages and cannot be avoided in many of them.
On top of that, consider making every bit of clever code a reusable library snippet instead. If it becomes part of the vocabulary of your codebase it'll be easier for others to understand it by looking at call sites.
eg: