Hacker News new | ask | show | jobs
by natrius 3732 days ago
I used to strongly believe what you do until my company started using Phabricator, which forces the squash workflow on you. It makes your history more useful, not less. The pull request is the appropriate unit of change for software. Make small commits as you develop, then squash them down into a single meaningful change to the behavior of your software.
4 comments

As a git novice I wonder, doesn't a proper workflow do the same thing? When I submit a feature branch it might have a lot of ugly commits. However, once I merge it to an integration branch there is one nice commit explaining what I did.

When coworkers create Pull requests I don't go through all of their commits and changes along the way. I just look at the diff so, I don't see the need for them to squash it first.

Once merged, the history of your feature branch becomes part of the history of the integration branch.

Sounds like you're using GitHub (Enterprise) or something similar where the pull request view shows you all of the changes in a "squashed" fashion.

Yes Bitbucket so, maybe that explains it.
Sometimes yes, sometimes no. Merge commits are nasty imo and I'm glad we can now forbid them outright, but a full squash isn't always the solution either.

Take a look at this PR for example:

https://github.com/HearthSim/python-unitypack/pull/4

Lots of back & forth. All the commits are related, and the PR is there to land all of those commits at once. I could land some of them right now (as they're safe to land), but keeping them in the PR keeps everything related in the same place (and none of them are required until that last commit lands).

A PR mirrors a "patchset" on mailing lists. You don't always want to squash all of it.

What you do want to avoid is a situation like this:

https://github.com/jleclanche/django-push-notifications/pull...

Where the original author creates their original commit and doesn't know how to --amend + push --force to the PR, and you end up with a ton of commits which you don't want to land all at once.

To clarify, pull requests shouldn't be squashed before review. They should be squashed when they get merged in, like this GitHub feature does. The granular commit history is useful during code review. It is not useful as a future developer trying to evaluate how the behavior of the program changed over time.
Right, sometimes. Pull requests are not necessarily a "unit of change" like you mentioned, though. For example, the first link I gave should not be squashed. But I don't want it to create a merge commit either.

I'm a little underwhelmed with the feature, it looks like it's either "squash everything" or "make a merge commit". There's no option to rebase & merge and/or selectively squash.

Exactly. In order for always-squash to work, then every PR must be the smallest possible atomic change. But often times certain features don't degrade to nice small atomic changes. It's sometimes useful to see that as a development process smell and consider using feature flags and other things to incrementalize the development, but sometimes the cleanest thing to do is just have a series of commits. I don't see a good reason to take this option off the table since source code history management is truly a craft in its own right.

FWIW, I prefer a carefully curated rebase and then merge --no-ff so that you can still see the branch and get at the branch diff trivially, but the history is still linear for practical purposes so bisecting is clean, etc.

Interesting. It's the first time I hear a fairly solid argument in favour of --no-ff. My main argument against it is keeping a clean commit log on github itself (and the default git log), as it's an entry point for new contributors.

Example:

https://github.com/lxde/pcmanfm-qt/commits/master

vs

https://github.com/facebook/react/commits/master

I think your first link should be squashed once the code has been reviewed. When looking back on history, commits are most useful when they're a list of behavior changes in the software. That pull request "exports meshes as OBJ." That's what's useful for future developers, not "add Transform fields." Leaving those in your history makes it harder to work with, not easier. If someone cares about the back and forth that happened within that behavior change, the pull request is always available.
"Add transform fields" may not be a very descriptive commit message, but those fields are fully independent from the OBJ exporter. They are the implementation of unity's Transform class. This implementation was stubbed before, it happens to be needed for an OBJ exporter to go through, but has nothing to do with the exporter itself.
It seems like it'd be nice to have two levels of granularity exposed in views of a source control system's history, basically corresponding to pull requests and commits. So you could drill-down to individual commits as needed, but would normally be able to work at the PR level.
Does "git log --merges" get us there?
I suppose it does, though at least the tool my org uses doesn't write good descriptions on the merge commits that are created upon merging in a PR — they're just like "Merge pull request 190 from …" when I'd prefer it to be named for the changes that are actually in that PR. That's good to know that exists, it could be useful. Thanks.
That's the `diff` tab on the PR
But the PR isn't retained (afaik) in the repo's history as a distinct entity (I'm talking about git proper here, not extra tools like GitHub, etc). In the end, git just has commits. [Edit: as prodigal_erik points out, perhaps merge commits are really what I need to be looking at.]
Nope, it does this by default, but doesn't force it.

$ cat ~/git/ATLAS/.arcconfig { "project_id" : "ATLAS", "repository.callsign" : "ATLAS", "conduit_uri" : "https://phabricator.$MYCOMPANY/", "arc.land.onto.default" : "develop", "immutable_history" : true }