Hacker News
by cranium 252 days ago
Commit even as a WIP before cleaning up! I don't really like polluting the commit history like that but with some interactive rebase it can be as if the WIP version never existed.
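
A minimal sketch of that flow, assuming a throwaway repository: checkpoint a half-finished edit as "WIP", then fold it into the commit before it with an interactive rebase. The file names and commit messages are made up for illustration, and the `sed` call only stands in for changing `pick` to `fixup` by hand in the todo list.

```shell
set -e
repo=$(mktemp -d)   # throwaway repository for the demo
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo "hello" > greet.txt
git add greet.txt
git commit -q -m "Add greeting"

echo "bye" > farewell.txt
git add farewell.txt
git commit -q -m "Add farewell"

# Checkpoint a half-finished edit so it cannot be lost
echo "goodbye" > farewell.txt
git commit -q -am "WIP"

# Scripted stand-in for editing the todo list by hand:
# turn the second "pick" (the WIP commit) into "fixup"
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2s/^pick/fixup/'" git rebase -q -i HEAD~2

# The WIP subject is gone; its change now lives in "Add farewell"
git log --format=%s
```

Because `fixup` keeps the earlier commit's message and drops the later one, the log should list only the two real commits afterwards.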

(Side ask to people using Jujutsu: isn't it a use case where jujutsu shines?)


Assuming you squash when you merge the PR (and if you don't, why not?), why even care? Do people actually look through the commit history to review a PR? When I review I'm just looking at the finished product.
I don't typically review commit by commit, but I do really appreciate having good commit messages when I look at the blame later on while reading code.
Which is just the PR message if you squash. To be clear, I'm not advocating for bad messages, but I am saying I don't worry about each commit and focus instead on the quality and presentation of the PR.
Definitely, but if you've made your PR with several different commits the message will contain all of the information for the whole PR, instead of just the message that pertains to the changes in that commit. It's not a HUGE deal, but it can make it harder to understand what the commit message is saying.
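
For anyone who hasn't seen what squash-on-merge does underneath, here is a rough local equivalent, assuming a throwaway repository. The branch name `feature` and the commit messages are placeholders; hosted PR tools may build the final message differently.

```shell
set -e
repo=$(mktemp -d)   # throwaway repository for the demo
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"
# The default branch may be "master" or "main" depending on configuration
main_branch=$(git rev-parse --abbrev-ref HEAD)

git checkout -q -b feature
for step in "make it work" "add tests" "fix typo"; do
  echo "$step" >> app.txt
  git commit -q -am "$step"
done

git checkout -q "$main_branch"
# --squash stages the combined diff of the branch but does not commit
git merge --squash feature >/dev/null
git commit -q -m "Implement feature 1234"

# The shared branch gets one commit; the three branch commits stay on "feature"
git log --format=%s
```

The single message given to `git commit` here is the only one that later shows up in blame for every line the branch touched, which is the trade-off being discussed.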
Indiscriminate squashing sucks. Atomic commits are great if you want the git history to actually represent a logical changelog for a project, as opposed to a pointless literal keylog of what changes each developer made and when. It will help you if you need to bisect a regression later. It sucks if you bisect and find the change happened in some enormous incohesive commit. Squashing should be done carefully, to reshape WIP and fixup-type commits into proper commits that are ready for sharing.
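
To make the bisect point concrete, a small sketch assuming a throwaway repository with six tiny commits, one of which introduces a marker line that stands in for a regression. The file name, the `BUG` marker and the `grep` check are all invented for the demo; a real project would run its test suite instead.

```shell
set -e
repo=$(mktemp -d)   # throwaway repository for the demo
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

# Six small commits; the fourth one introduces the "regression"
for i in 1 2 3 4 5 6; do
  if [ "$i" -eq 4 ]; then
    echo "step $i BUG" >> app.txt
  else
    echo "step $i" >> app.txt
  fi
  git add app.txt
  git commit -q -m "change $i"
done

good=$(git rev-list --max-parents=0 HEAD)   # the root commit is known good
git bisect start HEAD "$good" >/dev/null
# Exit 0 means "good", exit 1 means "bad" for the commit being tested
git bisect run sh -c '! grep -q BUG app.txt' >/dev/null
culprit=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1

echo "first bad commit: $culprit"
```

With small commits the search ends on a change that is easy to read. With one big squashed commit it can only point at the whole lump.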
> It sucks if you bisect and find the change happened in some enormous incohesive commit.

But why are any PRs like this? Each PR should represent an atomic action against the codebase - implementing feature 1234, fixing bug 4567. The project's changelog should only be updated at the end of each PR. The fact that I went down the wrong path three times doesn't need to be documented.

> Each PR should represent an atomic action against the codebase

We can bikeshed about this for days. Not every feature can be made in an atomic way.

That's true, some are big and messy, or the change has to be created across a couple of PRs, but I don't think that the answer to "some PRs are messy" is "let's include all the mess". I don't think the job is made easier by having to dig through a half dozen messy commits to find where the bug is as opposed to one or two large ones.
> I don't think that the answer to "some PRs are messy" is "let's include all the mess"

Hey look at us, two like-minded people! I never said "let's include all the mess".

Looking at the other extreme, someone in this thread said they didn't want other people to see the 3 attempts it took to get it right. Sure, if it's just a mess (or, since this is 2025, AI slop) squash it away. But in some situations you want to keep a history of the failed attempts. Maybe one of them was actually the better solution but you were just short of making it work, or maybe someone in the future will be able to see that method X didn't work and won't have to find out themselves.

This simply isn’t true unless you have to put everything in one commit?

To be honest, I usually get this with people who have never realized that you can merge dead code (code that is never called). You can basically merge an entire feature this way, with the last PR “turning it on” or adding a feature flag — optionally removing the old code at this point as well.

So maintaining old and new code for X amount of time? That sounds acceptable in some limited cases, and terrible in many others. If the code is being changed for another reason, or the new feature needs to update code used in many places, etc., it can be much more practical to just have a long-lived branch, merge changes from upstream yourself, and merge when it's ready.

My industry is also fairly strictly regulated and we plainly cannot do that even if we wanted to, but that's admittedly a niche case.

> Each [X] should represent an atomic action against the codebase

That's called a commit. Not sure why some insist on replacing commits with vendor lock-in with less tooling and calling it progress.

Yes, that would be ideal. Especially in a world with infrastructure tied so closely to the application, this standard cannot always be met for many teams.
Yeah "should" is often not reality, BUT I'm arguing that not squashing doesn't make things better.
I so miss bazaar's UI around merges/commits/branches. I feel like most of the push for squashing is a result of people trying to work around git's poor UI here.
The alternative to squashing is not beautiful atomic commits. It is a series of commits where commit #5 fixes commit #2 and introduces a bug to be fixed in commit #7, and where commit #3 introduces a new class that is going to be removed in commits #6 and #7.
Yeah, I don't see the value in looking through that. At best I'll solve the problem, commit because the code works now, create unit tests, commit them, and then refactor one or both in another commit. That first commit is just ugly and that second holds no additional information that the end product won't have.
It is often easier to review commit-by-commit, provided of course that the developer made atomic commits that make sense on their own.
I feel like that requires a lot of coordination that I, in the midst of development, don't necessarily have. Taking my WIP and trying to build a story around it at each step requires a lot of additional effort, but I can see how that would be useful for the reviewer.

We can agree that we don't need those additional steps once the PR is merged, though, right?

I have literally never met a developer who does this (including myself). 99% of all PRs I have ever created or reviewed consist of a single commit that "does the thing" and N commits that fix issues with/debug failure modes of the initial commit.
Yeah, make it work. Commit. Build unit test. Commit. Fix bugs. Commit. Make pretty. Commit and raise a PR.
You never design a solution which needs multiple architectural components which _support_ the feature? I do, and it would make little sense to merge them as separate PRs most of the time, as that would sometimes mean tests written at the wrong level, a lot more coordination, and a far more elaborate description than just explaining how the set of components work in tandem to provide the user value.
Git is a distributed version control system. You can do whatever you like locally and it won't "pollute" anything. Just don't merge that stuff into shared branches.

I automatically commit every time my editor (emacs) saves a file and I've been doing this for years (magit-wip). Nobody should be afraid of doing this!

Honest question - What DO you merge into shared branches? And, when your local needs to "catch up", don't you have to pull in those shared commits which conflict with your magit-wip commits because they touch the same code, but are different commit hashes?
The magit-wip commits go on a separate branch and ideally I'm never even aware of them. They just disappear eventually. They exist purely in case of a disaster à la the article.

I make "real" commits as I go and use a combination of `git commit --amend` and fixup commits (via git-autofixup) and `rebase --autosquash`. I periodically (daily, at least) fetch upstream and rebase on to my target branch. I find if you keep on top of things you won't end up with some enormous conflict that you can't remember how to resolve.
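
A rough sketch of the fixup part of that workflow, assuming a throwaway repository. It uses plain `git commit --fixup` rather than git-autofixup (which picks the target commit for you), and `GIT_SEQUENCE_EDITOR=true` just accepts the generated todo list without opening an editor. File names and messages are placeholders.

```shell
set -e
repo=$(mktemp -d)   # throwaway repository for the demo
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo "base" > README
git add README
git commit -q -m "Initial commit"

echo "parse v1" > parser.txt
git add parser.txt
git commit -q -m "Add parser"
parser_commit=$(git rev-parse HEAD)

echo "render v1" > render.txt
git add render.txt
git commit -q -m "Add renderer"

# Later: a correction that logically belongs to "Add parser"
echo "parse v2" > parser.txt
git add parser.txt
git commit -q --fixup="$parser_commit"   # subject becomes "fixup! Add parser"

# --autosquash moves the fixup next to its target and melds the two
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~3

git log --format=%s
```

After the rebase the "fixup!" commit is gone and "Add parser" already contains the corrected file, so the history reads as if the mistake was never made.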

Feature branches that have been cleaned up and peer-reviewed/CI-tested, at least in the last few places I worked.

Every so often this still means that devs working on a feature will need to rebase back on the latest version of the shared branch, but if your code is reasonably modular and your project management doesn't have people overlapping too much this shouldn't be terribly painful.

Exactly this. I can make a hundred commits that are one file per commit and I can later go back and

    git reset --soft HEAD~100
and that will cleanly leave it as if the hundred commits never happened, with all of their changes still staged and ready to be committed again.
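
Scaled down to three commits, the same idea looks roughly like this, assuming a throwaway repository. The file names and messages are invented for the demo.

```shell
set -e
repo=$(mktemp -d)   # throwaway repository for the demo
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo "base" > README
git add README
git commit -q -m "Initial commit"

# Three tiny checkpoint commits, one file each
for f in a b c; do
  echo "$f" > "$f.txt"
  git add "$f.txt"
  git commit -q -m "wip: add $f.txt"
done

# Move the branch back three commits; index and working tree are untouched,
# so everything from those commits is still staged
git reset -q --soft HEAD~3
git commit -q -m "Add a, b and c"

git log --format=%s
```

The three "wip" commits are no longer on the branch, but nothing they contained is lost: it all ends up in the one new commit.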
I assume Jujutsu only commits the file when you use one of the jj commands. I don't think it keeps a daemon running and checking for changes in the files.
It does the former by default, and the latter if you configure it.
I have heard of jj. I have tried jj, I love jj but I couldn't get myself towards using it.

This itself seems to me the thing which will make me push towards jj.

So if I am correct, you are telling me that I can have jj where I can then write anything in the project and it can sort of automatically record it to jj and afterwards by just learning some more about jj, I can then use that history to create a sane method for me to create git commits and do other thing without having to worry too much.

Like I like git but it scares me a little bit. Having too many git commits would scare me even further, but I would love to use jj if it can make things less scary.

Like what would be the command / exact workflow which I am asking in jj and just any details since I am so curious about it. I have also suffered so much from accidentally deleting files, or from looking through chat logs when I was copy-pasting from ChatGPT for some one-off scripts, wishing for a history of my file but not wanting git every time since it would be more friction than not of sorts...

> I can then use that history to create a sane method for me to create git commits and do other thing without having to worry too much.

It's easier than that. Your jj commits are the commits that will be pushed - not all the individual git commits.

Conceptually, think of two types of commits: jj and git. When you do `jj new`, you are creating a jj commit.[1] While working on this, every time you run a command like `jj status`, it will make a git commit, without changing the jj commit. When you're done with the feature and type `jj new` again, you now have two jj commits, and many, many git commits.[2] When you do a `jj git push`, it will send the jj commits, without all the messiness of the git commits.

Technically, the above is inaccurate. It's all git commits anyway. However, jj lets you distinguish between the two types of commits: I call them coarse and fine grained commits. Or you can think hierarchically: Each jj commit has its own git repository to track the changes while you worked on the feature.[2]

So no, you don't need to intentionally use that history to create git commits. jj should handle it all for you.

I think you should go back to it and play some more :-)

[1] changeset, whatever you want to call it.

[2] Again - inaccurate, but useful.

I can’t see myself going back to git. I actually went back once and was very confused for a second that I needed to stash before a rebase.
Happy to talk about it, for sure :)

> you are telling me that I can have jj where I can then write anything in the project and it can sort of automatically record it to jj

By default, yes, jj will automatically record things into commits. There's no staging area, so no git add, stuff like that. If you like that workflow, you can do it in jj too, but it's not a special feature like it is in git.

> and afterwards by just learning some more about jj, I can then use that history to create a sane method for me to create git commits and do other thing without having to worry too much.

Yep. jj makes it really easy to chop up history into whatever you'd like.

> I would love to use jj if it can make things less scary

One thing that jj has that makes it less scary is jj undo: this is an easy to use form of the stuff I'm talking about, where it just undoes the last change you made. This makes it really easy to try out jj commands: if one does something you don't like, you can just jj undo and things will go back to the way they were before. It's really nice for learning.

> Like what would be the command / exact workflow which I am asking in jj

jj gives you a ton of tools to do this, so you can do a lot of different things. However, if what you want is "I want to just add a ton of stuff and then break it up into smaller commits later," then you can just edit your files until you're good to go, and then run 'jj split' to break your current diff into two. You'd break off whatever you want to be in the first commit, and then run it again to break off whatever you'd want into the second commit, until you're done.

If you are worried about recovering deleted files, the best way to be sure would be to use the watchman integration: https://jj-vcs.github.io/jj/latest/config/#watchman. This would ensure that when you delete the file, jj notices. Otherwise, if you added a file, and then deleted it, and never ran a jj command in between, jj isn't going to notice.

Then, you'd run `jj evolog`, and find the id of the change right before you deleted the file. Let's pretend that's abc123. You can then use `jj restore` to bring it back:

  jj restore --from abc123 -- path/to/file
This says "I want to bring back the version of path/to/file from abc123," and since that's the one from before it was deleted, you'd get it back as you had it.

I tend to find myself not doing this a ton, because I prefer to make a ton of little changes up front, which just means running 'jj new' at any point I want to checkpoint things, and then later squashing them together in a way that makes sense. This makes recovery a bit easier, because you don't need to read through the whole evolog, you can just look at a parent change. But since this is about restoring something you didn't realize you deleted, this is the ultimate thing you'd have to do in the worst case.

I can second that `jj undo` is awesome!
> (Side ask to people using Jujutsu: isn't it a use case where jujutsu shines?)

Yes! For the case discussed in the article, I actually just wrote a comment yesterday on lobsters about the 'evolog': https://lobste.rs/s/xmlpu8/saving_my_commit_with_jj_evolog#c...

Basically, jj will give you a checkpoint every time you run a jj command, or if you set up file watching, every time a file changes. This means you could recover this sort of thing, assuming you'd either run a command in the meantime or had turned that on.

Beyond that, it is true in my experience that jj makes it super easy to commit early, commit often, and clean things up afterwards, so even though I was a fan of doing that in git, I do it even more with jj.

I always commit when wrapping up the day. I add [WIP] in the subject, and add "NOTE: This commit doesn't build" if it's in a very half-baked state.
I do a bunch of context switching, and I commit every time I switch as stashing would be miserable. I never expect those WIP commits to be reviewed and it'd be madness to try.
same - my eod commits are always titled 'checkpoint commit: <whatever>' and push to remote. Then before the MR is made (or turned from draft to final) I squash the checkpoint commits - gives me a psychological feeling of safety more than anything else imo