by acemarke 1945 days ago
Exactly. I want to "tell a story" with my commits, and that story is really more of an idealized retelling of what I actually did.

Five years from now, no one needs to know that I forgot to add that one line to a prior commit and had to add it separately, or that my first attempt didn't quite pan out as expected.

What that future person _will_ care about is:

- What final changes actually got made?

- What task was I working on?

- What was the reason for any of these changes in the first place?

- Why did I make some of these changes specifically to implement that task?

- What additional side info is important context for understanding the diffs?

4 comments

Exactly. It's also great to compartmentalize different aspects of your change.

Often my changes are:

1. Refactor the existing code to support the new feature

2. Add the new feature

It's great to keep these separate, because someone can look at number 1 and see that the two versions of the code ought to be functionally the same (same tests pass, app looks the same, refactor is easy-to-understand), and look at number 2 and see the new feature.

There are countless other times where you want to tell the "story" in a logical fashion.

(Honestly, I expect that there is a significant correlation between being a good git committer and being a clear story-teller.)

I understand that you want to tell a story. But as someone examining your code, I also want to know how you got there. While you're throwing out your junk, you're also throwing away valuable information. If I'm taking the actual time to review the code history, then let me play it out in real-time, mistakes and all. I know how to step back and summarize; I don't need you to do that for me.

This is especially true if your code is clever. I'm much more likely to understand your polished gem if I can see all the things that you bumped into while you were discovering it.

That is what comments and commit messages are for. I trawl history all the time. Running into an unbisectable mess of a branch (because a bug that was introduced in commit X~15 is fixed in X on the same branch) is a complete nightmare. I have to dissect the branch history and understand what is there because of the branch itself and what is debugging/review/CI cycle cleanup. Commit messages for fixups also tend to be 100% terrible and utter trash. "Fix review comments". Thanks. If we're doing that, let's copy in what the comment was too, and why this fixes it.

The problem with your request is that 90+% of the time (with the way I develop), the dead ends are on MRs that got closed or code that never got pushed in the first place. So again, comments as to why this approach is used are way better than hiding it in the history, because someone coming to "clean up" code sees the thought process instead of having to remember to search for it.

I don't do much work like that -- I suspect you're part of a much larger developer team -- but I think I understand the problem you're describing.

Couldn't you simply review/bisect at the fork/join points? i.e., take the commits at which forks began or ended, ignore any intermediary commits, and run the bisect (or, read diffs) across that subset? That way you're only comparing at the chapter-markers of the story, so to speak, and not getting mired in the gory details.

Yes, `git bisect --first-parent` was a feature I wanted for a long time. It finally exists now, so yes, that helps, but it is not a complete solution.

Even with `bisect --first-parent`, I still want useful commit messages, which "fixup" commits, again, are on the whole uniquely terrible at providing.
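
To make the "chapter markers" idea concrete, here is a toy Python model of bisecting along first parents only. It is a sketch of the idea rather than Git's implementation: the commit names, the pass/fail flags, and the helper functions are all made up for illustration. In real Git the option is passed when starting the session, as in `git bisect start --first-parent`.

```python
from typing import Dict, List, Tuple

# Hypothetical history: name -> (parents, test passes?). parents[0] is the first parent.
# a1 is a temporarily broken commit on topic A that a2 fixes before the merge M1.
# b2 introduces the real bug, which reaches the mainline through merge M2.
HISTORY: Dict[str, Tuple[Tuple[str, ...], bool]] = {
    "m0": ((), True),
    "a1": (("m0",), False),
    "a2": (("a1",), True),
    "M1": (("m0", "a2"), True),
    "b1": (("M1",), True),
    "b2": (("b1",), False),
    "M2": (("M1", "b2"), False),
    "c1": (("M2",), False),
    "M3": (("M2", "c1"), False),
}


def first_parent_chain(tip: str) -> List[str]:
    """Walk first parents from tip back to the root; return oldest first."""
    chain = []
    cur = tip
    while True:
        chain.append(cur)
        parents = HISTORY[cur][0]
        if not parents:
            break
        cur = parents[0]
    chain.reverse()
    return chain


def bisect_first_bad(chain: List[str]) -> Tuple[str, List[str]]:
    """Binary search, assuming chain[0] is good and chain[-1] is bad.

    Returns the first bad commit and the list of commits that were tested.
    """
    lo, hi = 0, len(chain) - 1
    tested = []
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tested.append(chain[mid])
        if HISTORY[chain[mid]][1]:
            lo = mid  # still good: the bug is later
        else:
            hi = mid  # already bad: the bug is here or earlier
    return chain[hi], tested


mainline = first_parent_chain("M3")
first_bad, tested = bisect_first_bad(mainline)
print(mainline)   # ['m0', 'M1', 'M2', 'M3']
print(first_bad)  # M2
print(tested)     # ['M1', 'M2']
```

In this made-up history the broken intermediate commit `a1` is never tested, because it is not on the mainline; only the merge results are.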

I do software process and other things, so some of my branches tend to be gigantic (e.g., revamping the build system) and can be 200+ commits because one cannot meaningfully land a build system rewrite incrementally. That one in particular was meant to be bisectable because when rebasing on top of new development, I wanted fixes to be in the "port this library over" commit instead of after some random merge commit based on when I decided to sync up that week (it took a year to do it). So once I get it down to a particular MR, being able to inspect that topic is still a useful property.

Note that this only works with a `merge --no-ff` workflow too. The `rebase && merge --ff-only` pattern and `merge --squash` are both terrible, IME, at making useful history. The force-rebase workflow is just as confounding to me as the no-rebase workflow (the former de-parallelizes your MR merge process and the latter tends to make a terrible commit history).
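
One way to see why the merge style matters is a toy Python model of the three history shapes. This is an illustrative sketch, not Git itself: the graphs, commit names, and helper functions are hypothetical, and the squash case assumes the squashed result was committed as a single commit.

```python
from typing import Dict, List, Set, Tuple

Graph = Dict[str, Tuple[str, ...]]  # commit name -> parents, first parent first


def mainline(graph: Graph, tip: str) -> List[str]:
    """Commits reached by following first parents only, oldest first."""
    chain = [tip]
    while graph[chain[-1]]:
        chain.append(graph[chain[-1]][0])
    return chain[::-1]


def reachable(graph: Graph, tip: str) -> Set[str]:
    """Every commit reachable from tip through any parent."""
    seen: Set[str] = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(graph[commit])
    return seen


# A topic branch with two commits (t1, t2) on top of m0, landed three ways.

# merge --no-ff: merge commit M keeps the topic as its second parent.
no_ff: Graph = {"m0": (), "t1": ("m0",), "t2": ("t1",), "M": ("m0", "t2")}

# merge --squash: one new commit S on main; t1 and t2 are not its ancestors.
squash: Graph = {"m0": (), "t1": ("m0",), "t2": ("t1",), "S": ("m0",)}

# rebase && merge --ff-only: main simply advances to t2, with no merge commit.
ff_only: Graph = {"m0": (), "t1": ("m0",), "t2": ("t1",)}

print(mainline(no_ff, "M"))            # ['m0', 'M']
print(sorted(reachable(no_ff, "M")))   # ['M', 'm0', 't1', 't2']
print(sorted(reachable(squash, "S")))  # ['S', 'm0']
print(mainline(ff_only, "t2"))         # ['m0', 't1', 't2']
```

In this model only the `--no-ff` shape gives both a short mainline to bisect and the topic's individual commits to drill into afterwards.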

Note that even for single-developer projects I run, I tend to make PRs even for my own changes (once it's gotten off the ground).

While I understand and somewhat empathize with this desire (I'd use it all the time for personal repos, for example)... current version control systems are terrible at supporting it.

What you probably want in this case is something like "automatically commit on every change (possibly recording every keystroke)" + "automatically tag based on tests/builds passing or failing" + "allow manual comments at any time, whether based on files changing or not". All of that is technically possible with git/hg/fossil/etc, but it's so much work for both the recorder and the viewer that it's infeasible.

This is great, except that we’re often bad at recounting this idealized history without lying in ways that make later maintenance more difficult.

> or that my first attempt didn't quite pan out as expected.

Actually, that's still important; it's just important from an architecture perspective.