Hacker News
by swanson 2206 days ago
I've worked for 10 years across 20 different commercial codebases. None of this stuff about writing good commit messages really matters. No one reads old commits. The "truth" is what the current code is doing and it doesn't really matter how it got there.

I'm sure someone will say "but I use the history ALL THE TIME to source dive and paragraphs of context are super helpful". This is not the case for 95% of developers or projects so I can't really endorse spending time learning this "best practice".

It's fine to be aspirational, but it's such a shame if people see posts like this and think they are failures or "bad" developers or that this is a widespread practice.

If it helps you personally or you have an open source project and you want to help with a changelog, knock yourself out. But there are so many more impactful skills to be learning or spending your time on if you're a working developer in a typical environment.

12 comments

In my experience the engineers who write good commit messages are also the engineers who read them and also the engineers who understand their value and also the best engineers on a team.

Your argument that "most engineers don't care about commit messages, so they must not be important" is akin to "most people are overweight, so health is unimportant".

> most people are overweight, so health is unimportant

I would say, yes, health is unimportant to the average person because of their actions. I don't think that's good or fair, but it is the current state of the world.

If you want to continue the analogy: my argument is akin to telling someone trying to lose weight to research the most bioavailable supplements -- despite the fact they still eat a Snickers bar every afternoon. It's a micro-optimization that has been elevated to "table-stakes".

I can't disagree with this more. Good commit messages are just so, so, so important. Especially when I'm triaging an issue and have no other context than the code itself. I want to know not just the _what_, which the code gives me, but also the _why_, which is what the commit message provides.

That's fine. I think your opinion is the majority opinion (at least online). It just does not match my own experiences of reality.

You can and should value practices based on your context. But I will be the asshole and ask if writing good commit messages is "so, so, so important" -- what things are less important? Is it more important than a good test suite? Well factored code? System documentation? Capacity for senior staff to answer questions? These things cannot all be so important and, in my experience, worrying about crafting amazing commit messages is way down the hierarchy.

I'd certainly rather jump into a codebase with a great test suite and crappy commit messages than a crappy test suite and great commit messages. But these are apples and oranges. Keeping a great test suite is a constant fight against tech debt, and testing things properly can be harder than the actual implementation.

In contrast leveling up from the terrible "WIP WIP do the thing" to something slightly less awful takes maybe an extra 1-2 minutes per commit. And every time you do it you're doing your future self and future co-workers a huge favor.

There's a trend towards automation built around 'commits' that encourages this sort of thing.

"WRITE BETTER MESSAGES".

but put WIP in the title so we know not to review it.

but ... do commit often, so we can get builds out. put WIP in the title so we can decide to build with tests or not. or something.

Tying automation steps to 'commit messages' is begging for "WIP JUST COMMITTING TO GET A BUILD OUT" messages, which people then complain about.

You're starting to contradict yourself a bit, I think.

Will I disagree that good test suites or well factored code or systems docs are more important than commit messages? No.

However... most projects do not have those things and most developers on those projects are in a "I don't know what I'm missing" sorta state where they don't bother adding them. Exactly like you point out for good commit messages.

So if 95% of projects have inconsistent and incomplete test suites, never-refactored spaghetti code, and almost no system docs, that doesn't mean we should tell people not to try to do those things. In the same way that the existence of poor commit messages doesn't mean we shouldn't try harder.

The nice thing about commit messages, being a small and simple thing, is that it would be much easier for someone to learn how to write good commit messages overnight than to learn how to write a good test suite, or to refactor their code.

It's commendable to strive to improve. I agree that 95% of projects are a mess. I don't think there is value in writing good commit messages in those environments, even if it is easy. If you're going to spend 30 minutes writing a super awesome commit message, I'd rather you spend those 30 minutes on improving the code or the test suite or even writing the awesome commit message in the JIRA ticket.

If you've reached the point when the next "optimization" you can do is to work on your commit messages, that is awesome.

I'm picturing 1 to 2 minutes on the message, which is why I don't see it as competing with refactoring or test time.

I also think spending paragraphs on "commit message standards" is overkill. I don't care about full stops or capitalization or anything, I just want to know some basics: "What is this fixing or adding. Why is it done this way. Are there any special considerations that kept it from being done a different way?"

(You bring up a good point about location, too; I don't particularly care if the info is in a ticket or commit message or a pull request, as long as I can get to it.)

I think they can in fact all be "so, so, important". Maintaining a large, long-living, complicated codebase is a genuinely difficult problem involving a plethora of actually critical components and processes.

I find old commits more useful than old commit messages; seeing what changed in tandem can be hugely informative - doubly so when tests were a part of it. But good commit messages can still be a big help, for sure.

The why should not be in your source control, the why should be in comments in the code itself. There are many reasons why a line of code may have changed, and it's unreasonable to hide important contextual information in a commit and ask engineers to waste time digging through the commit history to find the initial change (which may have been in another file!).

Comments almost always become lies with time, because code changes and nothing enforces that they are correct.

Having commentary built into the artifact that introduced the change ensures your commentary applies to the specific change introduced. It's then possible to tell why someone did a specific thing, and also to tell how things are today, and to decide if that justification still applies, or if things no longer match the description.

If you dig around a little bit, using e.g. the git pickaxe (-S), or github's excellent blame browsing tools, it's not that time consuming and you might find out some pretty surprising things.
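
For anyone who hasn't used the pickaxe, here is a rough sketch of wrapping it from Python, assuming git is on your PATH. `git log -S<string>` lists the commits that changed how many times that string occurs. The helper names `pickaxe_command` and `pickaxe` are made up for illustration.

```python
import subprocess
from typing import List, Optional


def pickaxe_command(needle: str, path: Optional[str] = None) -> List[str]:
    """Build the argv for a pickaxe search (hypothetical helper name)."""
    # -S<string> selects commits that change how many times <string> occurs.
    cmd = ["git", "log", f"-S{needle}", "--oneline"]
    if path is not None:
        # "--" separates options and revisions from paths.
        cmd += ["--", path]
    return cmd


def pickaxe(needle: str, path: Optional[str] = None, repo: str = ".") -> List[str]:
    """Run the search inside `repo` and return one line per matching commit."""
    result = subprocess.run(
        pickaxe_command(needle, path),
        cwd=repo, capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()
```

Something like `pickaxe("retry_limit", "src/client.py")` (both arguments invented here) would then give you the short list of commits worth reading, which is usually far smaller than the full history of the file.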

> Comments almost always become lies with time, because code changes and nothing enforces that they are correct.

Maintaining comments is part of the job of the software engineer, to keep code understandable and maintainable by others on your team. If your comments are constantly going out of date, then either:

1. The team is not putting effort into keeping comments relevant for each other.

2. The comments are too far away from the code that they are referring to.

There's definitely a balance to strike between too many and too few comments, but comments are extremely important.

Commit messages describe why the code _changed_, not why the code is currently implemented in the way that it is. Those are two different pieces of information, and the latter should not be hidden away in a commit message that you have to go searching for. And if that code came from another file, it's extremely difficult to trace it back to the first time it was written.

Not only that, but how would this even work with the "atomic commits" requirement? Most of the time you can't change a single line and create a commit message explaining why that line changed, because it's not atomic. That line change happens in coordination with a bunch of other changes. How do you explain why one of those lines was implemented a certain way in your commit? It just doesn't work.

I agree that maintaining comments is difficult and that they tend to drift. A good way to mitigate this problem is to explicitly include checking comments as part of the code review process.

I'm not saying you did this, but I think that most people who point out little issues with the mundane processes within software development haven't yet grokked the dev process in its totality. Commit messages, code reviews, comments, documentation, unit tests, design patterns and idioms: each of these practices' strengths and use-cases compensate for the others' weaknesses. I.e. to basically any gripe (again, not saying you griped) like "process X isn't worth it because of some maintenance issue" there's an answer along the lines of "well that's what process Y is for". Together all these processes produce high-quality codebases, but when you lack one or more of them the whole thing falls apart quickly.

The why is typically better provided by descriptive comments and tests. I'd rather people put more effort into those.

The exception is ticket numbers - it's sometimes useful to be able to link commits to tickets to see the context of a feature, especially since it's cheap to do.

I hear you, almost no one goes back to look at it. But it's a chicken-and-egg situation: no one uses history precisely because it's useless, and no one makes useful history because no one uses it.

So, status quo rules, right?

But as your predicted person who nearly always writes this style of messages and who frequently dives into old commit messages, PRs, and tickets: this shit is not hard. It does not consume much time. It is not a deep skill, it's a basic skill, almost at the level of correct indentation in terms of effort required.

It just takes a second or two to think of what your commit does and maybe why you didn't do it the other way, then drop in a ticket reference. A tool can even do that last part! As a bonus, there are times when I tried to explain why I did something and then thought of a better way or a bug I'd missed.
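
To make the "a tool can do that" part concrete, here is a minimal sketch. It assumes your branch names embed a ticket key like `PROJ-123`, which is a convention I'm inventing for the example; the function names and the `Refs:` trailer are likewise hypothetical, not taken from any particular tool.

```python
import re
from typing import Optional

# Assumed convention: branch names embed a ticket key such as "PROJ-123".
TICKET_RE = re.compile(r"[A-Z][A-Z0-9]+-\d+")


def ticket_from_branch(branch: str) -> Optional[str]:
    """Return the first ticket key found in the branch name, if any."""
    match = TICKET_RE.search(branch)
    return match.group(0) if match else None


def add_ticket_reference(message: str, branch: str) -> str:
    """Append a ticket reference to a commit message when one can be inferred."""
    ticket = ticket_from_branch(branch)
    # Leave the message alone if there is no ticket or it is already mentioned.
    if ticket is None or ticket in message:
        return message
    return f"{message.rstrip()}\n\nRefs: {ticket}\n"
```

One place to wire something like this in would be git's `prepare-commit-msg` hook, which is handed the path of the message file, so the author only has to write the what and the why.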

If everybody did this basic exercise that takes just a minute, the commit history might actually be useful. Then most people might use it, who knows!

Counter point: I have also worked for 10 years, and I keep running into people wishing developers spent more time writing better commit messages.

I think what matters is the type of project. A lot of commercial software is all about moving fast and not caring too much about the past. In such a setup, writing good commit messages may be less useful or even seen as a waste.

For software built to last however, I think good commit messages are invaluable. Issue trackers are replaced, documentation is rewritten or lost, comments are modified; but commit messages stay the way they are. This helps you get a better understanding of why decisions were made, something rarely documented properly.

> and I keep running into people wishing developers spent more time writing better commit messages.

that's not at odds with what the GP posted.

I've worked in places where people spent a lot of time and effort 'improving' commit messages - reviews on those, rewriting, meetings, etc. after several months, the 'team', such as it was, was writing 'better' messages per the few people who made the determination as to what 'better' was. they were the people who made it a focus and said this was necessary.

bug reports didn't go down. time to turn around functionality and fixes didn't go down. code review time didn't really go down. we didn't get more code coverage. code didn't execute faster. no one we delivered business value to was happier, or got more value. not in the immediate moment these changes took place, nor in the months that followed.

I worked at a bank and I am reading super old commit messages all the time. In fact half of the job is reading ancient code that could use some love (#python3). I can tell you from the dates and the authors alone how hard a project is gonna be to take over and fix or rewrite.

The last sensitive code I touched is a trading platform that must be responsible for a few billion dollars a day (could be a trillion, not sure, didn't finish adding logging). It was initially created in 2006, followed by a few commits over the next couple of years, then basically untouched for the last decade or so.

Just from the first 10 commit messages and the author's name, I can tell you this was hacked in grossly over a few weeks, because everything that guy did was hacked in quickly and filled with security vulnerabilities (he worked under very tight time constraints). Yet the software does the few things it was meant to do very well, so much so that it lasted this long mostly untouched and was built upon. The history shows the critical core code has less than one patch a year, mostly fixing trivial matters like a new path or syntax change from the language. I bet the project will be moderately easy to patch and upgrade because work from that guy at that time is usually decently organized and limited in scope to the essentials.

The lesson is: if you work in a long-term industry (aerospace, defense, finance, healthcare) on long-term projects, it's very likely that the product will still be running a decade from now, and future developers will be expected to read the commit history.

It's invaluable when reverting others' changes during an on-call emergency. Atomic commits that encapsulate one feature vs. trying to piece together five commits that are named "trying out singleton approach", "adding tests", "figured out corner case" is a world of difference during a rollback. I don't even have to really understand the commit message, so long as you squash them.

Also, pointing to a JIRA issue, etc. if your company uses that is a nice way to complete the loop, the one link that explicitly ties code to planning.

I think commit messages that point to an issue tracker without carrying the context from that issue tracker are a bad idea.

At some point things are likely to change and the issue tracker that was being used will be changed. And now all those commit messages are useless because the underlying issue tracker is gone.

I'm not saying don't put a link to the issue tracker, but please make the commit message self-sufficient.

> I've worked for 10 years across 20 different commercial codebases. None of this stuff about writing good commit messages really matters. No one reads old commits. The "truth" is what the current code is doing and it doesn't really matter how it got there.

> I'm sure someone will say "but I use the history ALL THE TIME to source dive and paragraphs of context are super helpful". This is not the case for 95% of developers or projects so I can't really endorse spending time learning this "best practice".

> It's fine to be aspirational, but it's such a shame if people see posts like this and think they are failures or "bad" developers or that this is a widespread practice.

I'm that person who's often digging into the history.

It's often to understand other things that the people writing poor messages were sloppy about.

So sure, focus more on writing good code than writing good messages but the truth is that the context of the code and the feature will change over time so preserving the original context you had in your head when you wrote it is super valuable when the next poor sap has to come along a year later and change it.

It's simply cost-benefit.

I mostly work on relatively short term projects these days rather than long lived products and so my commit messages tend to be single sentences, with an extra paragraph if it was particularly involved. It takes little time and once in a while will be useful.

I probably lean more towards them not mattering that much relative to what anyone who talks about "best practice" would like to see. That said, I picked up a codebase from another team the other day where one commit message was simply a ";)" and the rest not much more in depth. That did make me wince a bit.

I agree with you 100% unless your commit message is "Fixes" or "WIP commit" or "Need to change branches" or "Checkin" or something equally bad. I don't need a paragraph but when I'm looking at a line of 15 commits in your PR I need to know a general sense of what you did. Good single-line commit messages help with this.

You're right that 99% of the time they will never get looked at again. And that's fine, but when deciding how much time to spend on your PR they're very helpful.
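
If you wanted to catch the worst offenders automatically, a rough sketch might look like the following. The list comes from the examples above plus a bare "wip", which is my addition; the function names are made up, and a real check would need tuning for your team.

```python
# Subjects called out above as unhelpful, lowercased; "wip" alone is my addition.
LOW_INFO_SUBJECTS = {
    "fixes",
    "wip",
    "wip commit",
    "need to change branches",
    "checkin",
}


def is_low_info(message: str) -> bool:
    """Return True if the first line of a commit message says nothing useful."""
    lines = message.strip().splitlines()
    if not lines:
        return True  # an empty message is as uninformative as it gets
    # Compare case-insensitively and ignore trailing punctuation.
    subject = lines[0].strip().rstrip(".!").lower()
    return subject in LOW_INFO_SUBJECTS


def flag_commits(subjects):
    """Return the subjects in a PR that a reviewer would likely push back on."""
    return [s for s in subjects if is_low_info(s)]
```

This only rejects the known-bad cases; it does nothing to reward a good one-liner, which still takes a human a few seconds of thought.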

I can say that I fix quite a few bugs that people with your sentiment give up on.

There are a lot of things in software that need to be possible, but you don't need (or want) the entire team doing it all the time. A team made entirely of 'people like me' would be a disaster. But a team without anyone like me will die a slow and lingering death and not even know why.

It's important when thinking of group behavior - be it work or personal - to differentiate between "I don't need this." and "nobody should have this."

Yup. It's also an unreasonable burden to be asked to architect your commits so perfectly that each commit is an atomic change, especially since you're often iterating and trying different approaches.

The only method that consistently results in atomic commits is to squash commits when merging, in which case the commit message is your PR title/contents. The nice side effect is that these messages should absolutely be readable since your audience was your code reviewers.

> I'm sure someone will say "but I use the history ALL THE TIME to source dive and paragraphs of context are super helpful". This is not the case for 95% of developers or projects so I can't really endorse spending time learning this "best practice".

Perhaps the solution is to train developers to look at commit messages then? Source control is amazingly powerful and nearly universally used at this point. Why not make the most of this?