Hacker News new | ask | show | jobs
by imiric 637 days ago
> The PR is the thing that gets signed-off on, and the thing that goes through the CI build/tests, so why wouldn't that be the thing kept as an atomic unit?

Because it often isn't. I don't know about your experience, but in all the teams I've worked in throughout my career the discipline to keep PRs atomic is almost never maintained, and sometimes just doesn't make sense. Sometimes you start working on a change, but spot an issue that is either too trivial to go through the PR/review process, or closely related to the work you started but worthy of a separate commit. Other times large PRs are unavoidable, especially for refactorings, where you want to propose a larger change but the history of the progress is valuable.

I find conventional commits helpful when deciding what makes an atomic change. By forcing a commit to be of a single type (feature, fix, refactor, etc.) it's easier to determine what belongs together and what not. But a PR can contain different commit types with related changes, and squashing them all when merging doesn't make the PR itself atomic.

> I don't think I've ever cared about the context for a specific commit within a PR once the PR has been merged. What kind of information do you expect to get out of it?

Oh, plenty. For one, when looking at `git blame` to determine why a change was made, I hope to find this information in the commit message. This is what commit messages are for anyway. If all commits have this information, following the history of a set of changes becomes much easier. This is helpful not just during code reviews, but after the merge as well, for any new members of the team trying to understand the codebase, or even the author themself in the future.

2 comments

> I find conventional commits helpful when deciding what makes an atomic change.

I already know if I’m doing a fix, a refactor, a “chore” etc. Conventional commits just happen to be the ugliest way you can express those “types” in what looks like English.

Yeah, well, that's just, like... your opinion, man.

But I've worked with many, many developers who don't strictly separate commits by type this way. I myself am tempted to do a fix in the same commit as a refactor many times. Conventional commits simply suggest, well, a convention for how to make this separation cleaner and more explicit, so that the intent can be communicated better within a team. I've found this helpful as a guide for making atomic changes. Whether or not you write your commit messages in a certain way is beside the point. But let me know if you come up with a prettier way to communicate this in a team.

The prettier way is English.
> Because it often isn't. I don't know about your experience, but in all the teams I've worked in throughout my career the discipline to keep PRs atomic is almost never maintained, and sometimes just doesn't make sense. Sometimes you start working on a change, but spot an issue that is either too trivial to go through the PR/review process, or closely related to the work you started but worthy of a separate commit. Other times large PRs are unavoidable, especially for refactorings, where you want to propose a larger change but the history of the progress is valuable.

In my experience at least, PRs are atomic in that they always leave main in a "good state" (where good is pretty loosely defined as 'the tests had to pass once').

Sometimes you might make a few small changes in a PR, but they still go through a review. If they're too big, you might ask someone to split it out into two PRs.

Obviously special cases exist for things like large refactoring, but those should be rare and can be handled on a case by case basis.

But regardless, even if a PR has multiple small changes, I wouldn't revert or cherry-pick just part of it. Just do the whole thing or not at all.

> Oh, plenty. For one, when looking at `git blame` to determine why a change was made, I hope to find this information in the commit message. This is what commit messages are for anyway. If all commits have this information, following the history of a set of changes becomes much easier. This is helpful not just during code reviews, but after the merge as well, for any new members of the team trying to understand the codebase, or even the author themself in the future.

Yeah but the context for `git blame` is still there when doing a squash merge, and the commit message should still be relevant and useful.

My point isn't that history isn't useful, it's that the specific individual commits that make up a PR don't provide more useful context than the combined PR commit itself does.

I don't need to know that a typo was fixed in iteration 5 of feedback in the PR that was introduced in iteration 3. It's not relevant once the PR is merged.

I don't have time to reply to all your points, but regarding this:

> I don't need to know that a typo was fixed in iteration 5 of feedback in the PR that was introduced in iteration 3. It's not relevant once the PR is merged.

I agree, these commits should never exist beyond the PR. But I go back to my point about atomic commits being misunderstood. It's not about keeping the history of _all_ changes made, but about keeping history of the most relevant ones in order to make future workflows easier. A few months from now you likely won't care about a typo fix, but you will care about _why_ some change was made, which is what good commits should answer in their commit message. Deciding what "atomic" truly means is often arbitrary, but I've found that making more granular commits is much more helpful than having fewer and larger commits. The same argument applies to the scope and size of PRs as well.