Hacker News new | ask | show | jobs
Beware the Siren Song of Comments (leastastonished.com)
23 points by wwarnerandrew 4315 days ago
17 comments

"Comments decay. They aren’t compiled, and they’ll never get executed at runtime. If they become out of date or incorrect, no test is going to fail and no user is going to complain. Programmers work around them out of fear that “somebody might need this comment or it might provide some value in the future”, pushing them along far after they’re useful"

The developers who say this are ticking time bombs of laziness on your team.

Comments only "decay" if you don't read them, or if you don't take the 30 seconds to update them when you make a change. And if you're not doing that, then you're not doing your job, because the previous developer left the comment for a reason, and your defeatist philosophy of "comments are useless" is implicitly overriding that developer's judgment. Respect your colleagues enough to at least read what they've written and update it when it's incorrect!

It's pretty disheartening to hear this from a "director of engineering", actually. I expect it from a new grad who has never worked on a team, or with code that has to live for more than a quarterly project, but I want a team leader to be doing everything possible to improve communication amongst team members. That includes compelling them to write and maintain good comments.

> Comments only "decay" if you don't read them, or if you don't take the 30 seconds to update them when you make a change

Comment decay is a rehash of "code that isn't written is never incorrect". Aka every change you make introduces an opportunity to make a mistake, including both writing and updating comments.

>"director of engineering",

...at Genius.

I've found quite the opposite. Most developers hate comments because they represent extra work. That work manifests itself in the creation and upkeep of the code. We've found that addressing comments as part of our code review process made a big difference in keeping things up to date.

Regarding the issue of expressiveness: I never find myself commenting about what is obvious in the code itself. Often I'm writing about some thinking behind why something was done a certain way OR identifying expectations of downstream systems that can't be expressed using code. Code enough in environments where you're interacting with multiple disparate systems and you'll find that some well placed comment providing the broader context for the code is extremely helpful to folks encountering the code the 1st time or to yourself a year later when you're incorporating a new feature and have forgotten about all the concerns wrapped up in the code.

With the todo issue- we require folks to include a JIRA issue number along with the todo. If something was left undone then it needs to be recognized where we manage our backlog and flagged as a technical debt to be addressed.

+1. Commenting code or config files with an issue number and a date turns out to be a lifesaver. Your future self will thank his past self (instead of cursing him as usual).

(I am a sysadmin, not a dev, but pretty much everything Operations writes counts as code for these purposes.)

> The right place to document [performance] hacks is a commit message.

This advice strikes me as patently insane.

Serious performance hacks often involve subtle interactions among different parts of the code. When that happens, properly documenting them requires explaining what's going on in a line of code, for every line of code, right next to that line of code. Just sticking a wall of text somewhere else entirely and leaving it up to your successor to go find it and figure out how the explanation cross-references with the code is not an adequate alternative.

Likely to be using a workflow that uses `git blame` or equivalent to understand the context of code. The problem that is intended to fix is that comments don't stay in sync with the code they're commenting about. The problem it introduces is discoverability. You have to search to see where this code came from in order to understand it. That isn't going to happen unless you're disciplined. And we're all undisciplined at some point or the other.
I only find blame-type tools useful for discovering either recent context, or context of rarely-touched code. They typically only easily show you the most recent change to a line of code, though you can sometimes dig through older changes in a kind of per-line commit history. So older changes get more and more buried under an accretion of more recent ones. To find the commit message discussing a performance hack from 8 years ago, you'd have to wade through all the more recent commits that also touched those lines of code (refactoring, fixing unrelated bugs, fixing library bitrot, etc.). I'd rather that important context stay right there in the code, not get buried among hundreds or thousands of other commits.
I've worked on teams that emphasize commit logs as a place to keep significant information about the code.

This inevitably coincides with a "minimize change deltas" mentality to development because else the commit history gets confused. Continuous refactoring goes out the window and so does the soundness of your code base.

The code should speak for itself, period. It should be clear and sound before and after each transformation it undergoes. And there should be no limit to what kinds of transformations it undergoes.

Comments which reveal the intent of a block of code are very useful. They provide useful clues precisely when the code does not meet the expressed intent:

    /* Dear Maintainer,
     * I write the following in the sincere hope that
     * it sorts the variables a, b and c in ascending order.
     */

    if (a > b) swap(a, b);
    if (b > c) swap(b, c);
They are useful when the code is complicated and the intent is not obvious. Or when there can be multiple plausible intents based on what the requirements are. Sometimes code contains implementations of requirements that are not spelled out in any requirements document. A high level requirement can break down into lower-level ones in umpteen ways. Whatever isn't in the detailed design document is only in the comments and the code. A programmer's statement of intent can in fact be a requirement specification: the only such document for that piece of code.
This approach works well in only one situation: when your code isn't doing anything substantial in terms of data processing.

For me, comments are best used at explaining _why this is happening_, not what is happening.

By far one of the best examples of code commenting I've seen is the explanation of the L2ARC (the level 2 adaptive cache) in ZFS: http://fxr.watson.org/fxr/source/cddl/contrib/opensolaris/ut... . Most of the ZFS codebase is extremely well documented due to its comments, and it helps a lot having the documentation _in the code_, to provide context (something that commit logs, by default, don't).

The hard part IME is keeping your audience in mind.

Your future self? Your coworker who might be hired next month? Your past self who would have wished to read said comment?

Who is the audience for the comment?

Am I the only one that doesn't find looking up commit messages as a followup to `git blame` as convenient as a code comment?

When I see something like,

  42   def wobble(self):
  43       # Make sure the vent core has been frog-blasted, 
  44       # so that we know {rare bad things} won't happen
  45       frog_blast(self.vent_core)
  46       self.do_important_things()
I find that a much faster answer to "Do I really need to foo the bar?" than if I have to decode it:

  42   def wobble(self):
  43       frog_blast(self.vent_core)  # see commit msg
  44       self.do_important_things()

  git blame wobble.py
  git log abcd4242

  Author: Bob
  
    Ensure that vent core is properly blasted

    We rarely have frogs, but the vent core must be cleared.
Is this a matter of insufficient knowledge of my IDE? (I'd love to be able to easily show the last commit messages for specific lines in a tooltip or other popup.) Having to context-switch into commit-sleuth mode seems (to me) to be more disruptive than having the code comment.

  Is this a matter of insufficient knowledge of my IDE?
After writing this, I realized that it a symptom of exactly that: I'm not fully savvy with PyCharm/IntelliJ, my Python IDE of choice. I'll run down the ways I know of to access blame history (in case others don't know it exists).

1) Git -> Annotate: This is almost exactly what I wanted!

This gives a `git blame`-like column next to each code line, and clicking an item will show a summary of the commit, including both the commit message and fields modified. I can right click one of the annotations and select "show diff", and can even see the modifications to any of the files in the commit if necessary.

Unfortunately, this depends on a commit message explaining why that change mattered, AND on it being the most recent changes to those lines. Moreover, it doesn't show the full history of changes for those lines, such as if Alice added the real change, and Bob just did some whitespace changes.

I can find such information (with effort) using `gitk` or other git log browsing tools. If the most recent change (shown with git-blame) doesn't have what I need, I basically need to look at all of the changes for this file, going back from that commit, and see if I can find the one with a message that explains why ${hackity-hack} is necessary.

2) Right click a file tab or in the editor, and select Git -> Show History

This is a faster way of browsing the commit messages, but doesn't give enough information for me to tell whether that change was the one that affected the lines I care about. If it showed the diff of what the commit changed in my file, it would be more informative. (This is what gitk shows, for example.)

As far as I can tell, neither of these options is as comprehensively informative as using `gitk myfile` and backtracking from the commit that `git-blame` (or PyCharm's Git->Annotate feature) lists as the most recent change to those lines.

If comments don't describe what Is Happening or What Should Happen, they should be changed. This is much harder than maintaining unit test coverage, but I think the information is still very valuable to have in code comments rather than solely in commit messages. (It is certainly valuable to have them in commit messages as well, of course, as that documents how you felt at the time.)

The only place I have seen this kind of code history integration is Visual Studio 2013 Premium which requires Windows, TFS, and that you bought the Premium version!
> Let’s say you want to resurrect this reslug_all_tag_pages method. Just use: $ git log -Sreslug_all_tag_pages

You "only" need to remember that the method is named reslug_all_tag_pages.

With respect to TODOs and FIXMEs in the comments, I use them to mark places I know I will need to go back. Before calling a piece of code done, I grep for TODO and FIXME to make sure I haven't left anything unfinished.
I agree. I use TODO indicators as a code smell for things I forgot to add before making my pull requests, but I also see them as valuable reminders for future needs.

Sometimes there's a quick-but-less-generalized way of solving something, and you know there's a more elegant way of doing it, but for now you need to get something working/fixed and don't want to make sweeping changes.

Your immediate change might be to add a way for users to see that X is finished. A useful TODO message would be something which explains what you're doing

  # Tell users when their Solve-My-Foo is done
  # (Currently done in an ad-hoc way)
  # TODO: Add a notification infrastructure so that 
  #    arbitrary components of the apps can present
  #    notifications of Important Things
  #    e.g.:
  #    - celery job is done
  #    - a thing you've subscribed to has changed
  #    - a report/download/etc is ready to be retrieved
Building such a system might be overkill right now, if this part of your system is the only one that uses it, but a TODO message helps me see (the next time I make changes to it, or try to extend it, or use it as a template for reimplementation elsewhere) that there's more than one place we need this feature, and so it might make sense to plan/do those more comprehensive changes.
Eclipse / Pydev even pulls these out and lists them in a "tasks" panel for you.
It would have been helpful if the author explained how this has benefited him in concrete ways, rather than asserting that his way is best and leaving it at that.
Indeed. "I hate reading articles that make abstract arguments, so enough bloviating" Author then proceeded to do exactly that.

The single item I agreed with was the blocked out old code that will never be used again - source control handles that.

Otherwise, what a load of nonsense.

I agree that patently obvious comments are not necessary... and descriptive naming helps avoid patently obvious documentation... But I disagree with adding more code to replace what would be a single line of documentation. To me that's insanity.

My teams find more and more value from documentation as they are forced to return to code months/years later.

And commit logs as the place to document performance issues? Seems like an easy way to make sure that a "weird" bit of code gets removed, and breaking your app.

As with all things, ideologies and blanket statements suck. I think what was talked about in this article could've been much better communicated if it wasn't so polarizing.

I've found comments are only not added or updated when people feel rushed. Or a culture which doesn't put value in them... Which is a problem with one's organization not the act of writing comments.

P.s. This is an edit: isn't it rather peculiar someone tasked with annotating all text doesn't annotate their code? Irony?

I really dislike his notion of using the commit logs as the correct place to explain your code. First, it means that any time I need to find out why something is being done in code I need to leave the code and switch from my IDE to my source control. Second, most tools like blame will only show the most recent change. Minor fixes and changes happen all the time. What are the odds that the explanation I want will come up instead of something like "Fixed a typo". Third, what about code from other branches and external code drops?I may be using code from somewhere else and not have access to the full history of changes, just stable versions. Fourth, what happens when you change source control? If all of the comments are in the Git commit logs that'll be a big problem if the company switches to Perforce.

Instead of just making meaningful commit logs part of the code review process why not make "reading the comments" part of the code review process?

As someone who is currently debugging a codebase that was written 5+ years ago by persons long gone, please, comment wherever you think it would be appropriate, and then some more. There are only a few things more hellish than being able to read the code and know what it does, while not being able to understand why something was coded a certain way.
1st Point: use self documenting code instead of commenting | Ok fine

2nd Point: Explain performance hacks in commit description | might was well not explain them at all. This is a great place to hide the reason for the performance hack

3rd point: Don't use TODO / FIXME | I'm on the fence.

4th Point: Don't comment out old code use version control for old stuff | Ok

TODOs are presented in task lists, etc, by many IDEs. For me, they're included in my workflow, not forgotten.
Thanks Andrew.

I think you missed a key point, or maybe I overlooked it while scanning.

Because modern programming languages allow very long variable and method names, you should be able to construct a "grammar" in your code. That is, the code itself should be able to be read -- and when you read it, it tells you what it does.

So "if CustomerHasALargeAccount then WriteThemASalesLetter()" not only is a piece of code, it's also directly explaining what's going on. As long as you make sure your names are long and explicit enough, your comments and code become one. (Except for some oddball cases)

I like comments while I'm constructing my thoughts. Once I've coded it, I delete the comments. I can also see using comments as sort of a mnemonic for future work.

A lot of good points here. Keep up the blogging!

Nice article, but one passage gave me the willies:

> If you’re writing a public API for a library that’s getting exported to the world, or to a bunch of developers at your company, it may be easiest to maintain the documentation for that API in code comments.

An important idea that I always try to keep in mind when coding is that the me of a few months from now might as well be a different person. All those things I understand so well today, will be no more obvious to me next year than they will for some other person.

So I think that if I follow a methodology that makes a distinction between code for me and code for others, then I'm Doing It Wrong.

I can't see why he thinks his second example is any more readable than his first. The commented version explains what will happen. The second version requires at least some knowledge of the context.

There are plenty of times when I have been given bits of other peoples code and told to make them work. All I have is an error message. Usually very little context about what the program is doing or why. In these cases, comments make a world of difference.

Even in my own code, looking back on it after 6 months. Sure, I can read the code and understand it, but it will take me 5 times longer than if there is a helpful comment near each step.

I do agree about the clutter of auto-generated comments.

And to some extent some comments do decay. However, with experience, we learn to write comments with a minimal decay profile. As an example, avoid including constants in comments. Also, if you find yourself writing the same text over and over again, it's a sign you're being too redundant and can elevate that comment to a high level.