by spacechild1 30 days ago
> Most people on small projects just want "checkpoint the current code in my directory and put a comment on it".

Interesting, that's definitely not how I use git. My current code is rarely in a shape that can be fully committed. It often contains additional stuff I did on the way (small bug fixes, TODO comments, debug printf statements, etc.) that I don't want in the commit. Very rarely do I type `git add .` Am I the exception?

11 comments

My use of `git add` - and the explicit staging area more generally - is mostly a workaround for the fact that the repos I work with have checked-in dev setup scripts, IntelliJ/Visual Studio/Xcode/VS Code configurations, and so on.

My own setup differs in slight ways from what those scripts expect, and even where they match I like to do my own customizations. I don't want to commit those changes, and staging makes it easy to not do that MOST of the time. The rest of the time, it's a `git stash` dance, which I sometimes screw up and lose the customizations.

I've tried to manage the configurations a different way, such as by having a private branch with my own settings checked in, but that doesn't usually work out. I'm aware that the REAL problem is that my coworkers have checked in those settings to begin with, but I would counter-argue that the REAL REAL problem is that those tools don't have a good way to combine "settings that I override or that only I care about" and "settings that have project-wide defaults but are safe for me to override." (Visual Studio gets it close to right with its .xyzproj and .xyzproj.user files, but VS Code's single .vscode/ folder breaks down in shared repos.)

You can ignore them once and then edit them to your liking; git will not notice any changes to them and will assume they are untouched.

https://git-scm.com/docs/git-update-index#Documentation/git-...
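The link above is truncated, so which `git update-index` flag was meant is an assumption. A minimal sketch using `--assume-unchanged` in a throwaway repository (the file name `settings.json` and its contents are hypothetical):

```shell
set -eu

# Throwaway repository so the sketch does not touch real work.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

# A shared config file that a teammate checked in (hypothetical).
printf '{"tabSize": 4}\n' > settings.json
git add settings.json
git commit -q -m "shared editor settings"

# Ask git to stop checking this file for local modifications.
git update-index --assume-unchanged settings.json

# Local customization; git status should no longer report the file.
printf '{"tabSize": 2}\n' > settings.json
git status --porcelain

# Files carrying the bit are listed with a lowercase tag.
git ls-files -v settings.json
```

The same page also documents `--skip-worktree`, which may be the better fit for files you intend to keep modified locally. Either bit can be cleared again with the matching `--no-assume-unchanged` or `--no-skip-worktree` flag.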

If you feel like fucking around with new source control tools, jj (jujutsu)'s megamerge workflow is really good at this.

(If you're not interested, feel free to skip the rest of this).

I have each in-progress workstream in a commit that is merged at the top level, then I have a new wip commit off of that where the stuff I'm typing right now sits.

It's easy to split/squash/absorb parts of that commit into the right destination, but also to introduce parents of the megamerge that will never get merged.

(This is a better/longer writeup of this concept)

https://isaaccorbrey.com/notes/jujutsu-megamerges-for-fun-an...

People make this claim but it never made sense to me. How do you know the version that you are committing is buildable if you never tried building with it? And if you tried building with it, you can just do `git add .` or `git add -u` at that point.

So yes, your use case does not make sense to me.

There was another comment that said a similar thing...

https://news.ycombinator.com/item?id=48175289

>So you're just constantly committing untested versions of your work?

But it is "dead" for some reason...

1. CI

2. Why should comments or printf statements affect the build? When it compiles with them, why shouldn't it compile without them?

3. the commits might be temporary and get squashed anyway

1. Not a very good reason. Some projects might have slow CIs. Some projects might not have a CI at all. Some projects' CI might not be checking everything (front end, for example).

2. Because people make mistakes. You might think you are only excluding a comment, but might be excluding something that is required by mistake.

If I really want to make sure, I do:

1. git stash

2. build + test

3. commit

4. git stash pop
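One caveat with the steps above: a plain `git stash` shelves staged changes too, so the sketch below assumes `git stash push --keep-index` is what makes the working tree match the index. The file names are hypothetical and the "build + test" step is a stand-in check, not a real build.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

printf 'feature v1\n' > feature.txt
printf 'notes v1\n' > scratch.txt
git add .
git commit -q -m "initial"

printf 'feature v2\n' > feature.txt      # the change to commit
printf 'debug printf\n' >> scratch.txt   # local noise to keep out
git add feature.txt

# 1. Shelve everything that is not staged; the index is left alone,
#    so the working tree now matches what will be committed.
git stash push -q --keep-index

# 2. Build + test would run here. Stand-in: confirm the noise is gone.
test "$(cat scratch.txt)" = "notes v1"

# 3. Commit exactly what was just checked.
git commit -q -m "feature v2"

# 4. Bring the unstaged leftovers back.
git stash pop -q
```

Keeping the staged and unstaged edits in separate files avoids merge conflicts on `git stash pop` in this sketch; when both touch neighbouring lines of the same file, the pop may need manual resolution.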

> Some projects might not have a CI at all.

Well, then you have bigger problems. Without CI, how would you even know if your project compiles on other platforms?

When I really want to do your workflow, here is what I do. Add all the debugging print statements and commit them separately in another branch. When I want to include the debug statements, I just cherry-pick the commit with those things.

This way I remove the overhead of doing a staging before every damn commit and still retain the ability to pull in debugging changes whenever I want them.
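A rough sketch of that debug-branch idea, assuming a branch named `debug-prints` (hypothetical) that holds a single commit with the debug statements. Here `git cherry-pick --no-commit` followed by `git reset` leaves the debug lines in the working tree without recording them on the main branch; the commenter may commit them instead.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

printf 'int main(void) { return 0; }\n' > main.c
git add main.c
git commit -q -m "initial"
main_branch=$(git symbolic-ref --short HEAD)

# Park the debug statements in their own commit on a side branch.
git checkout -q -b debug-prints
printf '/* DEBUG: trace output goes here */\n' >> main.c
git commit -q -am "debug prints"
git checkout -q "$main_branch"

# Pull the debug changes into the working tree whenever needed,
# without creating a commit on the main branch.
git cherry-pick --no-commit debug-prints
git reset -q   # unstage, so the next commit does not pick them up

git status --short
```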

>Without CI, how would you even know if your project compiles on other platforms?

Not everything needs to be cross-platform! And not everything needs CI.

This sounds more complicated overall. Also, I would still need the staging area to only commit the debug statements.

As I said, I like to work on several things in parallel and I don't want to switch branches back and forth. That's just my workflow for my own projects and apparently I'm not alone.

I'm with you. My current code is a superset of the task I'm trying to accomplish, test code, leftovers from experiments, etc. I often have to break it up into logical chunks that get merged separately. I tried the jj flow and it's just not my thing. Git matches my mental model exactly, but I used it second (after subversion) and in my most formative years as a developer. Maybe there's a universe out there where things worked out differently.
To be clear, Mercurial does not have a staging area, but it does allow selective commits (and selective uncommits) via prompt-based or interactive UI selection of hunks. Disagreeing with the need for a staging area is not the same as saying selective commits are unnecessary (I use Mercurial more than git and I rarely commit everything in my working directory in a single commit - I like small commits).
When there's an expectation or requirement that each commit builds (and even passes tests), how can you do partial commits? Do you work exclusively on projects without such requirements? Do you rely solely on CI to ensure that your commit compiles? Do you not use CI and not care if a commit is broken... you'll squash a fix in later, or not even squash it and leave a broken commit in the repo?
Each commit should build and pass tests, yes. When I say "partial commits", I don't mean that the commits are arbitrary - each commit should be as small as possible to implement a specific fix/feature. I've also heard it described as the smallest unit that you may want to revert.

For example, if you are working on something, but it requires adding an API to some module, then commit 1 adds the new API (+ tests), and commit 2 is the new code that uses that API.

Unfortunately many developers I have worked with would just combine these (and more) into a single commit (because they are part of the same work task). However this makes review, bisect, blame and revert harder (if you need to revert commit 2, you don't want to also revert the API you added if that was tested and bug-free).
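To make the revert argument concrete, here is a small sketch with hypothetical files `api.c` and `caller.c`. Both halves of the task sit in the working tree at once, are committed separately, and reverting commit 2 leaves the API from commit 1 in place.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

printf 'demo\n' > README
git add README
git commit -q -m "initial"

# The working tree contains the whole task: new API plus its first user.
printf 'int api_add(int a, int b) { return a + b; }\n' > api.c
printf 'int api_add(int a, int b);\nint use_api(void) { return api_add(1, 2); }\n' > caller.c

# Commit 1: the API on its own.
git add api.c
git commit -q -m "Add api_add"

# Commit 2: the code that uses it.
git add caller.c
git commit -q -m "Use api_add in caller"

# Reverting commit 2 removes the caller but keeps the API.
git revert --no-edit HEAD
git ls-files
```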

Why would partial commits necessarily break anything?

In fact, often partial commits are necessary for builds.

As an example (and to be fair, this was a transitional project), I once worked on a project where the local dev directly acquired packages from different parts of the application, but the actual CI was broken up into different pipelines which required some parts to be built first, its outputs packaged and added to the registry, and downstream parts to be built after.

Committing everything at once would literally break the CI.

If the dev's working tree isn't exactly what they checked in, how do they build or test the commit? Do they YOLO a partial commit and wait for it to be accepted or rejected by the CI? Isn't that a problem to be solved by improving the CI pipeline?
Is there an equivalent to `git stash` in Mercurial?
Yes. Shelve.

But there are also other extensions that can achieve similar behavior.

That makes sense, thanks for clarifying!
Same. I absolutely don't use git for snapshotting what I'm currently doing. That goes both for work and for my numerous hobby projects. I always cultivate commits so that they're focused on a single type of change or feature, and then the next commit is typically something which uses that feature, etc. I don't mix in whatever else I'm doing - be that whitespace changes, updated comments elsewhere, or other features I'm working on. This helps tremendously when (as I do) I leave my hobby project for a while and then come back months (or sometimes years) later. And, both for work and for hobby stuff, if I want to add something, e.g. support for a new function, and I had done something similar in the past, it's easy to look at the particular commits about that from the past, and I can see that I need to update this, this, and this file so-and-so, and with these kinds of changes. I don't have to wonder about what belongs to this feature and what doesn't.

Oh, and I use `git add --patch` almost exclusively. It's rare that I just do a plain `git add`. I'm building up my stage, I'm checking it, I'm fixing it (if I accidentally stage something which doesn't belong), then I commit.

Having done it like this for a great many years I'm benefitting from it all the time. I can look at all my hobby projects (looking at the commits), and I'm back in where I left off, and I see exactly what I was doing back then (which, obviously, I wouldn't be able to remember otherwise).

CVS though.. that was harder to do right. So a lot of stuff became just snapshots. You had to plan much more carefully. And then there was SCCS before that.. and before that again, well. Manual "keep two versions" svc.

The way I do things is that there is no such thing as a shape that can't be committed. Committing is just like saving. It's fine to commit haphazard checkpoints and all manner of crazy stuff. You can use tags or merges or whatever to indicate that something is "done" but for me those kinds of commits are the exception, not the norm.
I don't see why there has to be a special staging area when you could just edit the HEAD commit instead. In git you could do "git commit --patch" to commit selected parts and then add more changes to the HEAD commit by "git commit --patch --amend".
A normal situation in my tasks is that the working copy contains lots of changes used for debugging (mainly prints) that should not be part of the proposed change. For this, even interactive adding (`git add -i`) does not suffice; I need `git add -e`, which lets me edit the patch directly and remove the temporary local changes.
Yes, I pretty much do the same thing with git-gui. You are right that the staging area isn't strictly necessary for this kind of workflow.
Git add is a reflection of git commit being such a heavy operation.

It’s a pointless addition. Making commits easier to modify and undo would eliminate any need for git add.

But git can’t really do that since it’s so fundamentally based on the idea that commits are immutable. Any modifications it does allow are workarounds, and dangerous ones at that.

> Am I the exception?

Supposedly, Meta has the data to support the claim that you (and I) are the outliers here. Staging is confusing to users, especially new ones, which is why jujutsu explicitly doesn't have staging.

The reason jujutsu doesn't have staging is that staging is incompatible with concurrency. The UX is a happy coincidence.
In discussions with people who made jj, it deliberately does not have Git’s staging area / index as a core concept because that was confusing for users.
Not solely because it’s confusing, but because it’s a more powerful and orthogonal design. The usability stuff matters too, but it’s not one or the other.
So you're just constantly committing untested versions of your work?
No.
Apparently the world went dumber in the last 20 years and staged commits are DIFFICULT now.
Both things can be true (the second being that staging was never necessarily a desirable abstraction, given easily and safely amendable commits).