Hacker News new | ask | show | jobs
by DavidWoof 639 days ago
One of my biggest pet peeves about modern development is just how many people don't know jack about git even though they use it every day. It really annoys me when I see a pull request with 20 random commits (with messages like "temp" or "checkpoint") or when people merge main into their personal feature branch instead of just rebasing their branch (yes, sometimes that's not right, but I'm not talking about the corner cases).

I always think about using "clean up a pull request" as a fizzbuzz-ish screen in interviews. It just seems like a decent proxy for "do you care at all?".

11 comments

Just to be clear, and you probably won't disagree with this, there's nothing wrong with commits like "temp" or "checkpoint" or "WIP" (Work In Progress). I often make these sorts of commits as I'm working on stuff.

The issue is in submitting an MR/PR with those commits. There's an expectation among professionals that you make your work presentable before submitting it for review, although those who are new to the profession don't realize that this cleanup step is necessary (how could they? the intro courses don't teach this and they're usually struggling enough with the code).

I just wanted to throw this comment in here in case some newbie sees this and comes away thinking "oh, I can't have 'temp' or 'checkpoint' in my commit messages"

I encourage developers to clean up their commits but so often they're either unduly nervous about breaking things or have a fascination with "preserving the history" (of their branch, which is not yet merged in anywhere).

I partly blame the excessive fear mongering around rebasing, where the strict Never Rebase a Pushed Branch rule is drilled into them and they never learn why or when they can break the rule safely.

So it's an uphill fight but I just try to teach by demonstrating, frequently, exactly how they can tidy up for the merge request.

> I just try to teach by demonstrating, frequently, exactly how they can tidy up for the merge request.

Some recommended resources for this?

I couldn't care less about the commit messages on a PR, I care about the diff. as long as the messages are cleaned up in the squash and merge with main
If it's large I want to be able to look at the commits in isolation and understand the work in the logical chunks that made sense to the author.

If what made sense to them is temp, checkpoint, temp, temp, undo the temp, fix test, try this, that didn't work try other thing, maybe?, temp, fix test - then I don't stand a chance. Recently I reviewed one that had multiple 'rebase' commits, I have no idea.

> If it's large

Then make the PR small.

This is a conversation that keep repeating

- Many temp commits

- Doesn’t matter: just squash

- But sometimes I want to have a few distinct (isolated commits for my PR)

- Then make the PR small

- But I have several changes

- Then make several small PRs

We keep coming back to “just make more PRs”. Which is curious, given that GitHub doesn’t even support dependent PRs. The thing you need immediately when your PRs start depending on each other (like refactor X before implementing Y, where both touch the same code).

But I can’t come to any other conclusion than that this is because of the over-focus on PR as the one and only reviewable unit. Thanks to GitHub.

I can’t really get on the small PR train. A PR has overhead and I often want to do small changes that I only happen to notice are necessary when I am in the middle of doing that one PR.

Don’t make dependent PRs. Make small PRs, review, merge and deploy them before starting the next.
1. I'm reading not writing.

2. Is our 'large' the same?

3. I'd rather the PR were whatever size it needs to be to entirely do the thing it's supposed to do (and nothing else) than conform to some arbitrary size requirement.

You're over-thinking it. Large PRs have a known effect of reviewers' eyes just glaze over and the PR just gets a cursory glance and LGTM :+1: on it.

That's when you know a PR is "large."

What's stopping you breaking such a PR into smaller chunks? Some arbitrary "it does what it's supposed to do" definition?

But that's the thing, in most cases there's no review step anymore after the squash/merge has been made; while it's in a branch / MR, you can still edit the commit messages and content, but in most cases the squash and merge is an atomic, unreviewed step. Of course, maybe the merge commit should be generated, or should be filled in and reviewed as part of the main code review.
Why is squash and merge a trend now? This destroys all advantages of git in my view.
Yeah, I’ve noticed that many around the Internet just comment with “the squash and merge” as if it’s a given that that’s the strategy that is used (even mandated).
Git rebase aka People rewriting history to make it LOOK like the commit was a compilable, working work result at a point in time even though the real history was something completely different, then f*cking up the rewriting of said history, then whining "please mighty Senior SWE heeeelp I don't know what happened please help my work is gone".

It all has trade-offs.I prefer seeing the dirty laundry.

I don't; once the feature has been merged, how you got to that point is no longer relevant; it's noise. Git is a log of code changes, if you use it as a work log you're using it wrong.
I don't think you're wrong. I think we have different priorities.

And we obviously have different expectations of what should happen if the code of a certain commit is run.

That the code is “compilable, runnable” is never a given even if you never use rebase.

We can sling around stereotypes of people failing and doing a bad job with their strategy. Then going to cry to someone (presumably you?). But I don’t think that advances the conversation.

True. And I don't think I ever claimed that never using rebase is a guarantee for anything.

I rather wanted to point out the following: Using a commit strategy based on git rebase is neither right nor wrong. It is not even best practice IMO. It has its own footguns.

Since the parent comment was very opinionated and cast judgement, I responded in kind.

I have been the person crying as well as the one who solved the mess, let's not kid ourselves.

> Since the parent comment was very opinionated and cast judgement, I responded in kind.

Fair. :)

It's a mindset problem; people use git commits as the equivalent of hitting save in their editor, and they feel like they should use it as a work log, justifying their time spent or feeling like they have to demonstrate how they got to a certain solution. It's not relevant. Ego is not important. You don't need to justify your time.
I agree with you in spirit (people need to learn their tools more) but your examples...not so much. Merging main into the feature branch was the original intent on how to do it. PRs are sent the way you describe because GitHub literally offers the maintainer a squash option and it's a lot easier to review a pull request when history is unedited.
PR should be sent to review in a state that author considers ready to merge. Assuming it will be squashed and taking that as a reason for submitting dirty history is just sloppy and unrespectful.

Merging is for keeping track of a group of commits that has been taken from a feature branch and included in the mainline. Why would you clutter the feature branch with periodic main merges when you can cleanly rebase it and keep it tidy?

> Merging is for keeping track of a group of commits that has been taken from a feature branch and included in the mainline.

Merging is for bringing a group of commits from branch A into branch B. It is, quite literally, the original way to perform this operation. It's not "clutter", it's a correct picture of how the code was developed.

It's clutter because it adds no information value on a short lived branch. If it's a branch that periodically syncs with another it's ok, if you're just basing off current master for a feature branch rebase is the way to go.
I've developed branches against a moving target all the time where the moving target introduced a problem in my code that wouldn't have been found by a simple merge conflict resolution. It's much much easier to find the source of the problem when you have the real history (a merge) instead of a rewritten history (a rebase).
You are right, makes sense.
> it's a lot easier to review a pull request when history is unedited.

Ist it really? If you would See my uncleaned history, it would take you days to understand what I was even trying to do.

Your statement seems based on the assumption that someone knows how to achieve a particular thing right from the start. But that isn't always the case and there might be a lot of different approaches until the correct or best one is found.

Do you really want to review dead code that's somewhere in the commits of a feature branch?

Yes, I'd much rather review a PR with the full unedited history, not the history that the submitter thinks is good or pretty. You often don't know what information you need until you need it. I'll take the entire, dirty history, and once I'm done with it it can be squashed.
> PRs are sent the way you describe because GitHub...

Oh, and here I though this discussion was about git. But you're talking GitHub.

They used the phrase pull request, which means GitHub, because git does not have a concept of "pull request".
To request a pull has been a term in git(1) since 2005.

https://opensource.stackexchange.com/a/380/30121

(Is this pedantic? I guess in a lot of contexts. But you talked about the original, intended way to do it. So it seems on-topic here.)

Yes, pull is old. Pull request, which the original poster was explicitly discussing, is a GitHub thing.
I was talking about “pull request”. So is the link I used.
> Merging main into the feature branch was the original intent on how to do it.

Based on?

The merge script existed as far back as git 0.5. Rebase came later.
Linus Torvalds needs the merge operation as an integrator.

Considering the email workflow of the kernel I can’t really make sense of “intended way to do it”. For individual commits people send out patches. I’ve never seen an email thread where some merge topology is recorded: it’s just a list of patches. A straight line.

I’m pretty sure that people used patch queues before Git (and even now with quilt). Restacking a bunch of commits on top of the mainline is the same operation as a rebase.

I’ve certainly seen Linus get mad at another maintainer for allowing a back-merge into his history (merge main back into feature branch).

It seems there are no best practices here. I remember senior devs working with a gui for git that didn’t have rebase in it. I worked on the cli, whenever I mentioned that I rebased people looked at me with raised eyebrows since I was not a senior and new.
> It just seems like a decent proxy for "do you care at all?"

Because the reality is, when it comes to Git history, no, I don't care in the slightest. I get all the information I need by:

- Reading previous PRs (the final diff)

- Checking the name on a git diff of a line

- The ticket reference

Git commits are a tool to help me write code and reverting to a "known-good" state. Once it's merged into master/main, I don't care how messy it is because 99.999999% of the time, I'll just go back to the merge commit.

One of the nice things about Git is that it is a fast (not just local but fast—these people care about speed) lookup program for all things relating to the code. I want all immediately useful code information inside Git. Because then I look it up quickly. Unlike having to go to at least two different web applications (PRs and issue tracker) and find the info there, often in an inferior and more convoluted format.

But it takes surprisingly little to sell back centralization and lock-in to developers, even when working on top of a decentralized tool.

> But it takes surprisingly little to sell back centralization and lock-in to developers, even when working on top of a decentralized tool.

I don't care about centralisation / decentralisation in my work. What I care about is that I have the information I need to do my job.

> Unlike having to go to at least two different web applications (PRs and issue tracker)

PR descriptions can be part of your merge commit message so I don't know why you need to go to a web application if you don't want to. You can also read the full diff in git diff so I don't really see what you're upset about.

> You can also read the full diff in git diff so I don't really see what you're upset about.

Since you don’t care about what I care about I have nothing to be upset about.

I care about getting things done. You seem to care about making git commits look pretty. I know what my customers prefer.
I thought I was ending this on an amicable note.
> I don't care about centralisation / decentralisation in my work.

Yeah, that's why it's easy to sell centralisation to lots of people.

> It really annoys me

Have you set those expectations with your team or the people you are working with? Your ‘style’ shouldn’t just show up in reviews

From my experience, nobody teaches you Git properly in school. And once you get your first job, it is too late.

We had various lectures on languages models, math, algorithms, networking... absolutely nothing on git (I did my classes between 2008 and 2013, things might have changed now)

same case here. almost new guy on my team we need to train them because basically just know to use what vs code exposes as git interface and some of them can replicate the same flow on terminal. rebase? that's a forbidden command for them, while our entire company use this as daily basis.

that just kinda sad. hopeless.

It's so frustrating. We had a reasonably okay process going on, but then another team started working in our repository on their feature completely ignoring the fairly lightweight commit message format we ask them to uphold, and they were like "doesn't matter, we'll squash merge it". But it's not even about the commit messages; it's about the workflow they have behind it, where they commit and push as often as they hit ctrl+s, which to me communicates they're just hacking around until it works.

A commit should be atomic; it should be a complete change with code and tests all adjusted, and ideally a message explaining what and why it changes. (In practice / in my line of work this doesn't happen because it's all front-end code implementing some poorly documented user story in jira, but hey).

If I'm ever involved in hiring again I'll add git usage to my list of criteria. Along with whether they can actually touch type. I can't believe that the standards have dropped so far that basic computer skills are no longer necessary apparently.

I share your frustration, personally. But unfortunately Git is more user-unfriendly than it needs to be, and doing things the proper way tends to get you into a rabbithole.

Like merging main back into your feature branch: just rebase. But then you often need to re-do conflict resolution. You have git-rerere but, eh, it’s not discoverable at all. But let’s say you get over that hurdle. Now the next obstacle is the “never rewrite shared history”. And if you’re in a “corporate” environment chances are that you publicize your branch when you make a PR. And it can take a few days to get approval.

Now I care. But sometimes I have doubts about whether the caring is well-founded. Exactly because sometimes people around me seem to care not one bit.