by Yen 1624 days ago
On the topic of "who wrote this shit", I'd really like to plug the idea that some of the most high-impact documentation you can write is a good commit message.

Say you track down a bug, find a line of code that makes no sense, and `git blame` it, to discover that you wrote it yourself, 2 years ago. If the commit message is "bugfix flaky builds", good luck figuring it out.

If, instead, the commit subject is "bugfix flaky builds", followed by a message that explains what the flakiness was, why you think the change will fix it, what other bugs or limitations you were working around, and what upstream changes you might be waiting on that prevented further work, you're in a much better position. Suddenly you have a lot more context on what you were doing, why you were doing it, and why you didn't do it better at the time, and in some cases it can even keep you from making an obvious but subtly wrong misstep.
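As a rough illustration of the subject-plus-body shape described above, here is a minimal Python sketch that checks whether a commit message is more than a bare subject line. The function name `has_explanatory_body`, the rule it applies (subject, blank line, at least one non-empty body line), and the sample message text are all assumptions made up for illustration, not something from this thread.

```python
def has_explanatory_body(message: str) -> bool:
    """Return True if the message has a subject line, a blank separator
    line, and at least one non-empty body line after it."""
    lines = message.strip().splitlines()
    if len(lines) < 3:
        return False
    # By common git convention the second line is left blank to
    # separate the subject from the body.
    if lines[1].strip():
        return False
    return any(line.strip() for line in lines[2:])


# Hypothetical messages, invented for this example.
terse = "bugfix flaky builds"
detailed = """bugfix flaky builds

The build raced on a shared temp directory; each job now gets its own.
Working around an upstream limitation until a fix lands there.
"""

print(has_explanatory_body(terse))
print(has_explanatory_body(detailed))
```

A check like this says nothing about whether the body is any good; it only catches the "subject and nothing else" case.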

Similarly, if someone's confused by your code during code review, that's a great opportunity for either in-line comments, or commit messages, as appropriate.

Unlike PR discussions, tickets, emails, slack threads, wiki pages, or photos of whiteboards, commit messages + git blame have an uncanny ability to be exactly the documentation you need exactly when you need it. Good git history practice can be one of the highest-return investments.


Eh, I'm not sure I agree.

What has gotten me the most value is having either the branch or the commit message tie back to a ticket somewhere. -That- has the original bug, the comment thread that led to the decision around why this particular fix, any additional comments around tradeoffs we were aware of, and what other options we dispensed with, etc.

A well written commit message might explain what the issue was, but it won't have anywhere near the context the ticket and resulting comment thread should have.

> What has gotten me the most value is having either the branch or the commit message tie back to a ticket somewhere. -That- has the original bug, the comment thread that led to the decision around why this particular fix, any additional comments around tradeoffs we were aware of, and what other options we dispensed with, etc.

That works until the bug tracker goes down or the company decides to use a different bug tracker and the import doesn't preserve information, or the link in the commit message doesn't resolve to the corresponding ticket in the new bug tracker. This is far less likely to happen to the git history given that it's distributed.

That being said, adding information to the merge commit message linking to the discussion or actually summarizing it in the commit message itself would definitely be an improvement. The merge commit has references to the commit the branch is based off of and the head commit of the branch, so you can limit git log output to just commits in the branch long after it has been merged.
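A minimal sketch of that last point, assuming a true two-parent merge commit (not a squash or fast-forward merge). The helper names `branch_log_args` and `branch_log` are made up for illustration; the `^1` and `^2` parent suffixes and the `A..B` range are standard git revision syntax.

```python
import subprocess


def branch_log_args(merge_commit: str) -> list[str]:
    """Build a `git log` command listing only the commits a merge brought in."""
    # <merge>^1 is the first parent (the commit that was merged into),
    # <merge>^2 is the second parent (the head of the merged branch).
    # The range A..B selects commits reachable from B but not from A.
    return ["git", "log", "--oneline", f"{merge_commit}^1..{merge_commit}^2"]


def branch_log(merge_commit: str) -> str:
    """Run the command in the current repository and return its output."""
    result = subprocess.run(
        branch_log_args(merge_commit),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `branch_log("<merge sha>")` inside a repository should then print just the branch's commits, however long ago the merge happened.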

These two aren't mutually exclusive. Tickets, however, have lower long-term survivability (in my experience). Outsourcing, migrations, there are many scenarios in which the original tickets become inaccessible over time - and some codebases do last for years and years. Meanwhile the repository content (and thus the complete version history) usually survives as-is.

I didn't mean to imply they were mutually exclusive; just that in terms of "most high-impact documentation you can write", I rank making sure I link the ticket higher than making sure I have a thoughtful commit message, for the reasons listed.

Fair that it can disappear eventually if you change ticket trackers or whatever; that's a risk of changing ticket trackers. Hopefully you maintain both for a bit, and once you're six months out or whatever and retire the old, you don't need as much context since things have moved on (and there's a generational effect in tickets akin to that in garbage collection; you tend to need recent things more often than old things, and the older, the less likely you are to need it).

But just in terms of "what would I rather have", a link to the ticket every time. And in terms of "what am I more likely to provide", a link to the ticket every time as well (since all the communication on the ticket came about out of need; writing a thorough commit message is out of preparation, and I, and everyone else, am WAY better at consistently doing things that I need to do than preparing for possible future things)

> But just in terms of "what would I rather have", a link to the ticket every time

In practice over the past 20+ years, I've had to rely on commit messages far more than tickets, but a well-written ticket is definitely awesome to have. When I ran Engineering for a startup, one of the things we invested a lot of time in was making sure commits had good messages, tickets had good writeups, and the two were linked. We required a pull request to close a ticket, and our CI system would automatically append a link to the ticket to the PR when it was merged. It was such a level of awesomesauce.
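For anyone wondering what such a CI step might look like, here is a hedged Python sketch. It is not the setup described above: the tracker URL, the ticket-key pattern, the "key in the branch name" convention, and the function name `append_ticket_link` are all assumptions for illustration.

```python
import re

# Hypothetical tracker URL and ticket-key format, for illustration only.
TRACKER_URL = "https://tracker.example.com/browse/"
TICKET_PATTERN = re.compile(r"[A-Z][A-Z0-9]+-\d+")


def append_ticket_link(pr_description: str, branch_name: str) -> str:
    """Append a ticket link to a PR description, based on the branch name.

    Returns the description unchanged if the branch name contains no
    ticket key or the link is already present.
    """
    match = TICKET_PATTERN.search(branch_name)
    if match is None:
        return pr_description
    link = TRACKER_URL + match.group(0)
    if link in pr_description:
        return pr_description
    return pr_description.rstrip() + "\n\nTicket: " + link


print(append_ticket_link("Fix flaky builds", "PROJ-123-fix-flaky-builds"))
```

Making the step idempotent (the `link in pr_description` check) means it can run on every CI pass without piling up duplicate links.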

Just out of curiosity - how many different VCSs did you use in the past 20+ years?

I went through 5 different VCSs and the history was gone forever in each migration - but actually JIRA is still the same after 16 years )))

Obviously my experience is my own, but in many cases it was the ticket system that changed vs. the version control system, which is why history wasn't always there. A lot of my early experience was at startups and I think I saw a version control migration only happen once (VSS to Git). I've even seen a couple places that didn't even have a ticket system. Unsurprisingly, those no longer exist.

In any case, I think the "correct" answer is proper commit messages AND solid issue tracking. My preference for commit messages when looking into the past was more about trying to understand particular changes to specific files or lines of code, which are more easily navigated in source control. A good commit message helps narrow things down when there is a long history, but a link in that message to the actual ticket would be a dream since that would likely have the larger context.

All that said, I have spent some time at a FAANG and neither commit messages nor tickets were useful at all there. Commit messages were usually along the lines of "fix a bug" or "add a feature" and the tickets rarely had more detail than "fix X" or "add Y". That was more of a symptom of the "go forward" culture there. Little time was spent making it easier for the next person since that wasn't really rewarded in the performance process.

The commit message idea always felt a little strange/off to me. It's a string that you can't (generally) fix/extend later for those who may seek this information. Also nobody except the committer can write them. (Imagine an explicit @docsguy role for documenting commits along with writing ticket-based documentation.)

What if VCSs used a single file or a folder, like .gitcommits, where anyone could append any sort of info in the same commit, so it could be a part of it? Then, when you commit a feature, you add to this file (or files):

  @@ @@
  +---
  +added websocket support to the server
  +  /ws - main socket
  +  /ws-events - events socket
And a few commits later you decide to extend it, editing the same record:

  @@ @@
  +---
  +added json-rpc over websockets
  +  /ws-json-rpc
And the VCS would then extract these records when you run `git log`:

  ...
  4509812 added websocket support to the server
  0732691 <no .gitcommits message>
  8712389 added json-rpc over websockets
A few commits later, @docsguy expands on json-rpc:

  @@ @@
   ---
  -added json-rpc over websockets
  +added lifetime-related json-rpc over websockets
  +task: ./tasks/1873.md
  +supports 'start' and 'stop' methods: ./doc/ws-lifetime.md
   /ws-json-rpc
  @@ @@
  +---
  +enhanced commit descriptions

  A  tasks/1873.md
  A  doc/ws-lifetime.md

  4509812 added websocket support to the server
  0732691 <no .gitcommits message>
  8712389 added lifetime-related json-rpc over websockets
  6034007 enhanced commit descriptions
Full commit messages would then be just diffs. Also, one could write a commit message gradually, with the sources they are modifying. Or write two commit messages at once (because we all do commit two+ changes sometimes):

  @@ @@
  +---
  +refactored foo bar heavily, @docsguy please expand
  +---
  +fixed a bug in baz, didn't care to backport

  ...
  0923423 refactored foo bar heavily, @docsguy please expand
          fixed a bug in baz, didn't care to backport
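
To make the extraction step of the proposal above concrete, here is a small Python sketch. It assumes the hypothetical `.gitcommits` format from the examples (records introduced by a `---` line) and reads a commit's own message(s) from the lines its diff adds to that file. No VCS works this way today; the function names `added_records` and `subjects` are invented for illustration. Lines added to an already existing record are ignored here, since in the proposal those would update an earlier commit's description instead.

```python
def added_records(diff_text: str) -> list[list[str]]:
    """Collect the new records a commit adds to the hypothetical .gitcommits file."""
    records: list[list[str]] = []
    for line in diff_text.splitlines():
        # Only lines added by this commit count; skip the "+++ b/file" header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        content = line[1:].strip()
        if content == "---":
            records.append([])  # an added separator starts a new record
        elif records:
            records[-1].append(content)
    return [record for record in records if record]


def subjects(diff_text: str) -> list[str]:
    """First line of each added record, i.e. what the log would display."""
    return [record[0] for record in added_records(diff_text)]


# The two-messages-at-once diff from the example above.
diff = """@@ @@
+---
+refactored foo bar heavily, @docsguy please expand
+---
+fixed a bug in baz, didn't care to backport
"""

for subject in subjects(diff):
    print(subject)
```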

I've never used git-notes, but it sounds like this is what you are describing: https://git-scm.com/docs/git-notes

While they look similar, git notes are something different: they are not versioned alongside the content in the same way, and adding one is not the same as editing a commit message.

A well-written commit message should explain "why"; this may partially consist of linking to external things (bug tickets, whatever). That said, it is probably important to have enough of a "why" that the reviewer can make an informed decision about whether they need to go check the external reference or can continue with the review as-is.

I ended up writing https://github.com/vatine/sressays/blob/main/change-requests... to try to clarify to myself what I thought a good change request ("PR", "CL", "CR", whatever you want to call them) needs.

I believe commit messages should also summarize the context.

I fixed a bug the other day that I was so embarrassed about, I intentionally left the commit message cryptic.

(It was a personal project.)

Sometimes the best documentation is seared into your soul as a mark of shame. I think I’ll wake up a few times wincing about it.

Spare me your shame.

Should someone inherit your project and end up fixing another bug in that part of the code, they may benefit from any information you share.

Shame is temporary, public repos are not.

To err is human. Don’t feel bad about it. You fixed it :)

(I say this as someone who has a track record of being too hard on myself!)

A team I worked with had a fun little habit I have since borrowed: you add a "BOGUS" comment next to the offending line. Sort of like:

// BOGUS: assuming 'x' will never be greater than 1024.

Sort of tells future engineers, yeah, I know it's shit.

// I'm so sorry...

A comment I left that I am sometimes reminded of by ex-coworkers still at that company.

Put the reason for the weird code in a comment right next to it. Don't make somebody run git blame.

Code review discussions are precious context. Unfortunately, git does not keep them. This is a major shortcoming of git. We need a new source code management tool that stores code review comments. It can lower the cost of software maintenance.

> If the commit subject rather, is "bugfix flaky builds", followed by a message that explains what the flakiness was, why you think the change will fix it, what other bugs or limitations you were working around, and what upstream changes you might be waiting on that prevented further work, you're in a much better position.

Those are very different needs.

"Why?" belongs in code as a comment. "How?" only sometimes belongs in a comment--generally if the code is "clever".

"What?" generally belongs in the commit message as it can touch multiple files and subsystems.

"Who?" and "When?" generally belong in your ticketing system.