by rrosen326 1938 days ago
I'm also a solo developer and use git in a much less sophisticated fashion. I tend to use it as, "freeze my code here, so in case I f something up, I can get back to a moderately clean state." It's kind of like a snapshot-based local history. And, quite frankly, I rarely revert one, but it makes me feel safer. I don't care if I have a lot of commit messages that say, "interim". The good ones are clear.

Is this a terrible coding practice? I don't have enough non-me experience to know what an anti-pattern this probably is. I probably won't change my process, but I'm curious.

35 comments

> it makes me feel safer

This for me is one of the biggest things I like about working under version control, even solo. It gives me the freedom to explore some crazy idea or refactor without having to think about the way back if it doesn't pan out. If it turns out to be more complex than I am willing to do now, I can stash or branch.

If I think back to my pre-source control days, I used to leave commented code everywhere, or just make a full copy of the folder. It doesn't take long before this becomes an absolute mess. Copying in particular was a barrier: you had to realize it was necessary, then interrupt your flow to do something that would take several seconds. (By contrast, if you commit as you go - especially every time you get to a working "checkpoint" - there's zero extra effort needed.)

Exploratory refactoring turns out to be very close to an exercise in creative writing, as I learned one day accidentally from my lit major friend.

Take the section you are stuck on, print it out, cut it up into sentences or phrases, and just rearrange them until either it makes sense, or you figure out where you went wrong.

Rearranging code statements until something makes sense is exactly what refactoring is.

I beg to differ.

Refactoring is not merely rearranging code statements. Refactoring is restructuring of the code starting from the architectural and abstract goal and then looking at how pieces of existing code would fit. Sometimes, that requires writing new code and tests. Refactoring by definition also means not breaking the user space.

I've never heard of any serious writer printing out their prose and cutting it and rearranging it. That just sounds absurdly unnecessary to me.

> code starting from the architectural and abstract goal

You are either using a different definition of architecture or this is wrong. Refactoring is bottom-up construction. Most of the time when I see people frustrated or struggling (including myself) it's because they have forgotten this and need to take a break.

I am using the standard architecture definition.

More information here: https://en.wikipedia.org/wiki/Code_refactoring

There are many goals in refactoring, specifically this section:

> Potential advantages of refactoring may include improved code readability and reduced complexity; these can improve the source code's maintainability and create a simpler, cleaner, or more expressive internal architecture or object model to improve extensibility. Another potential goal for refactoring is improved performance; software engineers face an ongoing challenge to write programs that perform faster or use less memory.

I was addressing OP's analogy to cutting pieces of written prose and rearranging.

A bad paragraph does not get the point across because it is doing things in the wrong order, or taking too long to get there. Hoisting code can be removing repetition, performance... So many things. Rearranging or deleting code so that a piece is not trying to do three things at cross purposes, for one.

Code is meant to be read by humans and only incidentally by computers.

A lot of architecture is just being clear about what is intrinsic complexity and what is accidental, be it cognitive or computational.

> I've never heard of any serious writer printing out their prose and cutting it and rearranging it.

Writers definitely do this. Maybe not at the prose level, but for sure at the plot level and chapter level.

I've found myself doing this in e-mail recently. I naturally try to set the stage/explain the situation, then ask for something (opinion/resources/prioritization/...). But I've been advised to lead with the request, and then explain. It often takes a minor rewrite to make it work, but I've become convinced it helps motivate the reader to read the explanation.
I’ve definitely heard of doing this when plotting something out, ie at compile time instead of runtime to stretch an analogy.
I think you can buy a box full of words: https://magneticpoetry.com/
Yeah, writing code without source control sounds horrible. Can't imagine what it must've been like for those who had to suffer through such times.
I know a contractor who has worked for a major company that I won't name. He has told me that their source control was, for a time, Google Drive. He knew it was a recipe for disaster but real work was nonetheless getting done, and the client was satisfied. They didn't know how the sausage was being made, but they liked the output.

I think a lot of people who haven't been around the scene wouldn't believe these stories, but this stuff happens a lot. Like major commercial projects with no tests whatsoever (unit, integration, or otherwise), that are still successful and making a lot of money.

Not surprised. A lot of this stuff doesn't get set up because people are lazy. Or developers don't want to, or are unable to, do sysadmin work.

I worked at a small "startup" inside a larger, several billion dollar company, back in the late 90's. Nobody set up source control for that division, despite the parent company being over two decades old and having people very experienced with that sort of thing. We were also integrating code from third party contractors, and it was a big mess. Files getting overwritten, people copying stuff off their local desktops, consultants FTPing in updates, etc. After a couple months of copying junk everywhere, I finally got fed up. As a 22 year old, basically straight out of college, I was training the entire team how to use CVS...

We’ve had effective source control since the 70s, latest 80s; for the most part working without it was self inflicted.
Exactly. When people say 'before we had version control' I want to ask, how old are you? And by the way, I am older than just about all of you.

Started with SCCS with versioned control lists to determine what got pulled from SCCS. The outer wrapper was all written in shell. 1980s.

Talking about a large system, eight or ten sub projects, each sub project in its own versioned source tree.

A release spec pulled the SCCS deltas of all the sub project control lists, and then SCCS was directed by those versioned control lists to pull all the source code for each sub project.

So yes, version control that I am aware of was firmly entrenched in 1980. And I am certain it goes back further than that.

Yes totally. It gives you freedom to try things you don’t fully understand in your IDE or framework too then decide whether you want to revert everything afterwards.
Modern IDEs often have basic source control baked in. You don't even need to commit anything. I wonder whether there is any point in using Git for basic version control if those features are already available.
At least in IntelliJ, I find using the local file history stuff painful. With explicit source control, I'm making specific decisions to check in known states. When I have to resort to the local file history stuff, there's a lot more of "oh, here I undo'd a typo" and so on type of things.

That said, it can be a lifesaver when I didn't make an explicit commit, then started doing stuff, then realized "ok this got out of hand AND I wish I could go back five minutes but it's gonna be annoying."

The intellij local history works best when your code is a raging monolith. It punishes you when you find coupling in otherwise uncoupled code, and that punishment is there whether you leave the coupling or try to fix it.

One of the failure modes for people leaving a mess is if it's too hard to fix it they give up. So that's no good.

Committing aside, git also has stash. Many people prefer using one consistent tool (git) over the variety of IDE's equivalent features available.
I posted elsewhere. Stash is your friend. If you aren't using stash today, learn it. And learn the difference between 'stash pop' and 'stash apply'. Each has its place and time.
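
For anyone who hasn't tried it, here is a minimal sketch of the difference, run in a throwaway repo. The file name and stash message are made up for illustration. As far as I understand it, `apply` restores the change and leaves the stash entry in place, while `pop` drops the entry once it applies cleanly.

```shell
set -eu

# Throwaway repository so the demo touches nothing real.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "stable" > notes.txt
git add notes.txt
git commit -q -m "Known-good state"

# Start an experiment, then shelve it.
echo "experiment" >> notes.txt
git stash push -q -m "half-finished idea"

# 'apply' restores the change but keeps the stash entry around.
git stash apply -q
after_apply=$(git stash list | wc -l | tr -d ' ')

# Discard the working-tree copy, then restore it with 'pop',
# which removes the entry after it applies without conflicts.
git checkout -q -- notes.txt
git stash pop -q
after_pop=$(git stash list | wc -l | tr -d ' ')

echo "stash entries after apply: $after_apply, after pop: $after_pop"
```

So `apply` is handy when you want the same shelved change on more than one branch, and `pop` when you are simply picking up where you left off.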
Yes: you like the interface for git already and are productive with it versus learning how to achieve the same productivity while learning the idiosyncrasies of your chosen IDE platform.
I'm butchering something someone said here about a month ago:

Some people cut a trail through the jungle, others just push the branches out of the way and expect everyone behind them to do the same.

When an IDE has basic source control baked in, that typically means it has a git client and UI. I know that years ago I used Netbeans local history, but it's a per-file history and does nothing to keep a set of files at a specific "save point" together.
Quite a few people are suggesting that, when it's time to share your code with others, maybe you should squash/rebase it to clean things up. That's totally up to you... but just know that not everyone thinks rebasing is a good idea. See [1], for example.

[1] https://fossil-scm.org/home/doc/trunk/www/rebaseharm.md

I think we often feel the urge to rebase and squash not because it actually makes our code changes easier to understand, but because it makes us feel better about ourselves. That's a red flag. Understanding how you got to the goal -- encoding all the fumbles and disoriented thoughts right in the commit history -- that can be a genuine benefit to the reader. Who do we really help by pretending that we're more organized, coherent, and linear than we actually were?

> Who do we really help by pretending that we're more organized, coherent, and linear than we actually were?

We're helping the future reader who's reading the history because they want to understand why a change was made - and "because the author of the branch initially had the wrong idea" is almost never the answer they're looking for.

I sometimes enjoy reading stream-of-consciousness writing, but most of the time (especially when reading code) I'm more interested in the point itself. The same applies to version history. It can be used to tell the raw story, but there's usually a more useful and interesting story to be told.

Exactly. I want to "tell a story" with my commits, and that story is really more of an idealized retelling of what I actually did.

Five years from now, no one needs to know that I forgot to add that one line to a prior commit and had to add it separately, or that my first attempt didn't quite pan out as expected.

What that future person _will_ care about is:

- What final changes actually got made?

- What task was I working on?

- What was the reason for any of these changes in the first place?

- Why did I make some of these changes specifically to implement that task?

- What additional side info is important context for understanding the diffs?

Exactly. It's also great to compartmentalize different aspects of your change.

Often my changes are

1. Refactor the existing code to support the new feature

2. Add the new feature

It's great to keep these separate, because someone can look at number 1 and see that the two versions of the code ought to be functionally the same (same tests pass, app looks the same, refactor is easy-to-understand), and look at number 2 and see the new feature.

There are countless other times where you want to tell the "story" in a logical fashion.

(Honestly, I expect that there is a significant correlation between being a good git committer and being a clear story-teller.)

I understand that you want to tell a story. But as someone examining your code, I also want to know how you got there. While you're throwing out your junk, you're also throwing away valuable information. If I'm taking the actual time to review the code history, then let me play it out in real-time, mistakes and all. I know how to step back and summarize, I don't need you to do that for me.

This is especially true if your code is clever. I'm much more likely to understand your polished gem if I can see all the things that you bumped into while you were discovering it.

That is what comments and commit messages are for. I trawl history all the time. Running into an unbisectable mess of a branch (because a bug that was introduced in commit X~15 is fixed in X on the same branch) is a complete nightmare. I have to dissect the branch history and understand what is because of the branch and what is debugging/review/CI cycle cleanups. Commit messages for fixups also tend to be 100% terrible and utter trash. "Fix review comments". Thanks. If we're doing that, let's copy what the comment was in too and why it fixes it.

The problem with your request is that 90+% of the time (with the way I develop), the dead ends are on MRs that got closed or code that never got pushed in the first place. So again, comments as to why this approach is used is way better than hiding it in the history because someone coming to "clean up" code sees the thought process instead of having to remember to search for it.

I don't do much work like that -- I suspect you're part of a much larger developer team -- but I think I understand the problem you're describing.

Couldn't you simply review/bisect at the fork/join points? i.e., take the commits at which forks began or ended, ignore any intermediary commits, and run the bisect (or, read diffs) across that subset? That way you're only comparing at the chapter-markers of the story, so to speak, and not getting mired in the gory details.
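
A rough sketch of what "only the chapter markers" could look like, assuming feature branches are merged with real merge commits (`--no-ff`). The branch name, file name, and messages are invented for the example. `git log --first-parent` follows only the first parent of each merge, so the intermediate commits on the feature branch drop out of view.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the trunk 'main' on any Git version
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > file.txt
git add file.txt
git commit -q -m "Initial commit"

# A messy feature branch with the gory details.
git checkout -q -b feature
for step in wip oops typo; do
  echo "$step" >> file.txt
  git commit -q -am "$step"
done

# Join point: a real merge commit back on main.
git checkout -q main
git merge -q --no-ff -m "Merge feature: add the new thing" feature

# Full history has five commits; the first-parent view shows only two.
git log --first-parent --format=%s
```

I believe recent Git versions also accept `--first-parent` on `git bisect start`, which would restrict a bisect to the same join points, but check the man page for your version before relying on it.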

While I understand and somewhat empathize with this desire (I'd use it all the time for personal repos, for example)... current VCS systems are terrible at supporting it.

What you probably want in this case is something like "automatically commit on every change (possibly recording every keystroke)" + "automatically tag based on tests/builds passing or failing" + "allow manual comments at any time, whether based on files changing or not". All of that is technically possible with git/hg/fossil/etc, but it's so much work for both the recorder and the viewer that it's infeasible.

This is great, except that we’re often bad at recounting this idealized history without lying in ways that make later maintenance more difficult
> or that my first attempt didn't quite pan out as expected.

Actually that's still important, it's just important from an architecture perspective.

As a much newer developer, the biggest problem I have with git is that I rarely end up actually making one change at a time. I'll be working on some larger thing, and in the process I'll notice and quickly fix a smaller thing before returning to the original task. This might be a typo in a code comment, a poorly named variable, or a block of code I realize is dead.

I suspect this is the type of tendency which goes away with experience, but it makes git a lot less useful. My commits won't really tell you what changed; the most they can tell you is the primary change I was working on.

Many of us do that, and it's not just a new developer thing. Git actually enables this, because you get to pick and choose what to add to the index (`git add`) before committing. So that little tweak you made in the unrelated function? -- no problem, just `git add` that later, and commit it under a different message. Not all SCM tools give you that kind of flexibility.

On the other hand, there's a diminishing return to placing every tiny change into a separate commit. Commit messages like "Fixed multiple small things" might make some people clutch their pearls, but sometimes you just need to get shit done and move on to solving bigger problems.

My suggestion is to consider breaking your commit into two: one for "fixed this big issue that everyone cares about", and one for "a bunch of tiny cleanup stuff that I happened to notice." (Maybe call that second one "refactoring" -- it will go over better with your audience.)

> Git actually enables this, because you get to pick and choose what to add to the index (`git add`) before committing.

That assumes the changes are in separate files though, right? I know you can use the "-i" flag, but it's fairly labor intensive.

That kind of depends on your tooling. e.g., I use Magit (an Emacs front-end for git) which makes interactive mode really, really easy.

(But easy or not, other version control systems such as Subversion don't offer the feature at all. We kind of take Git for granted these days, but it wasn't always like that.)

A lighter weight option is the --patch flag to 'git add' and 'git commit'.
Personally I have gotten used to using `git commit --patch` for everything (even if I only have one change) just as a convenient way of reviewing the changes I am about to commit. With that, only committing part of the changes is no additional effort.
Look at `git add -i`. You can commit just part of a change to a file. So if you notice a small problem and already have a bunch of changes made, you can still make those changes, and commit them separately.

Up to you if you wanted to rebase those changes back onto main.

I don't use it often and find it's kind of painful to use, but if you're in the position where you've already saved two different things in your IDE and need to pull them apart for commit, it's a useful tool.

Have you tried using 'git commit --patch'? It makes it easy to separate out unrelated changes when committing. You can precede it with an invocation of 'git reset $HASH' to restructure your last few commits.
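
A minimal sketch of the reset-then-recommit idea, under a simplifying assumption: the two unrelated changes live in different files, so plain `git add` is enough and the example stays non-interactive. With changes in the same file you would reach for `--patch` instead. File names and messages are made up.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"
base=$(git rev-parse HEAD)

# One muddled commit mixing a feature with unrelated cleanup.
echo "feature" >> app.txt
echo "cleanup notes" > README.txt
git add .
git commit -q -m "interim"

# Mixed reset: move the branch back to $base but leave the files as they are.
git reset -q "$base"

# Re-commit the same work as two focused commits.
git add app.txt
git commit -q -m "Add feature to app"
git add README.txt
git commit -q -m "Add README"

git log --format=%s
```

The content ends up identical; only the way it is sliced into commits changes. This rewrites history, so it belongs on commits that have not been pushed yet.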

In general, more experienced git users aren't actually working on one commit at a time. They're just comfortable enough with editing history to make it look that way.

That is what OP was talking about: after you've done the change, make a commit with just that tiny refactoring. Once you're done and ready to review your work, you can cherry pick just that fix and move it to main / master / its own PR. Since it is self-contained, it can be processed by itself only.
`git add -p` will take you through all the changes in your files, and let you add them selectively. I find this makes for much cleaner commits.
With the add command's interactive mode, it is often possible to selectively stage and commit individual patches in a file.

https://git-scm.com/book/en/v2/Git-Tools-Interactive-Staging

> We're helping the future reader who's reading the history because they want to understand why a change was made...

"Change" is subject to interpretation. Most of the time, it's the scope the change belongs to that carries the meaningful value.

Say, changes made in connection to fixing an issue are logically tied for inclusion as well as for potential unwinding.

Some tangent changes technically should not be casually folded in, just in case this changeset will need to be propagated or rolled back.

Thus this elaborate multi-staged commit management in Git.

Many projects have no such need to manage the change flow, so version control is used as a kind of undo buffer. Which is fine; in such cases the meaning is tied to release states.

If anything, it makes more practical sense to preserve only commits with a buildable state, not just some transitional changes.

The advantage is that you get a more usable and understandable list of historical changes. "You wouldn't publish the first draft of a book" [1]

A squashed merge or rebased and cleaned set of commits gives a very clean overview of which changes were made, at what point, why they were made, and which belong together. That picture tends to get utterly lost in the "set up X", "make test Y", "fix typo", "wip" and "change error handling" commits a feature branch typically has.

Additionally, I'm not really interested in the fact that my colleague started change X yesterday before lunch; I'm interested in when it went live and became visible to all the developers, that is, when it was merged into the main branch.

[1] https://git-scm.com/book/en/v2/Git-Branching-Rebasing#_rebas...

You wouldn't publish a first draft, but neither would you burn it once the final draft was off to the printer. Personally, I'd prefer it if "squashing" commits was purely a UI thing; the underlying commits were all still there, but grouped together and displayed as a single big "virtual" commit. That way you could still drill down to the real history if you needed to.
Why would you want to see every typo that was corrected? Every little test that was changed erroneously and then backed out again?

That may be an accurate representation of the order savepoints were made, but it's not an accurate representation of how the software evolved. It is noise that needs to be discarded if a reader would like to know what change was really made. It also makes it difficult or impossible to use tools like git bisect.

Is the argument really that a more detailed history is always better? In the trivial case every keypress could be a savepoint, and every savepoint a commit.

One does not always know in advance that a commit needs to be split in two. The only way to produce readable commits without rebasing them in that case is to work with local _backup files. A version control system does this much better.

In fairness, you're only seeing 5% of the typos. We caught the other 95% before committing. :)

I love your question, "why not a commit per keypress?", because it raises an interesting follow-up: why not squash and rebase entire months or years of project work into single commits? If squashing is so useful, why do we only apply it at low-grain scales? Could we read and understand massive projects quickly and easily, if they only had a few commits to them?

I'm sure that we don't experiment with larger-scale rebases because of the limitations in the technology -- we all know that we're not supposed to 'git rebase' in public, and why that is. But suppose those obstacles were lifted. Now that we can rebase and rewrite at any time scale, which scale(s) is the right one(s) to choose?

> why not squash and rebase entire months or years of project work into single commits?

The argument here is that one should rebase and carefully craft commits that isolate each functional change into a separate commit, where each change is motivated and builds on the previous one, before pushing anything. Every commit should build cleanly, preferably even pass tests. That makes changes easier to reason about, and enables the use of tools such as bisect. Look at git itself for an example of this type of history.

The counter argument to that was that it presents a false view of history. Maybe there were false starts and mistakes made along the way. Unless these are preserved in the history, the reader is left without any understanding of them. This is not an uncommon argument. Some people argue rebase should never be used.

This view suggests that a more detailed history is preferable. Taken to its logical extreme, that would mean every keypress and editor command.

But "why not delete all of history" is not an example of "carefully crafted commits" taken to an extreme. Quite the opposite.

> it raises an interesting follow-up: why not squash and rebase entire months or years of project work into single commits?

That's effectively what happened before version control/before the small-scale rebases we enjoy now were possible. And the reason is that it's hugely valuable in certain circumstances to be able to see some granularity of the history. (Though clearly people disagree about what the grain size should be.)

> Could we read and understand massive projects quickly and easily, if they only had a few commits to them?

I don't think so. The current state is visible at the top of the git tree regardless. History comes in when you are trying to understand why the state is what it is. Usually this is for troubleshooting in my experience, but sometimes also when doing a refactor. Meaningful commit messages attached to meaningfully-clumped patches are, in my opinion, absolute gold in those cases.

There's little benefit to squashing down a year's worth of work into 5 commits because you can just as easily tag each of those 5 commits with a version number, give it a little write up, and call it a release.

I think the reason to squash commits is to cut out the noisy bits that were only useful to the original developer that day and create a timeline that's helpful for future readers. It doesn't really make sense to get more granular than the level of a single commit with a good comment and a small set of cohesive changes. So you store your history at that granular level and you can take care of the rest with tags, minor and major versions, etc.

The Fossil designer agrees with you:

"So, another way of thinking about rebase is that it is a kind of merge that intentionally forgets some details in order to not overwhelm the weak history display mechanisms available in Git. Wouldn't it be better, less error-prone, and easier on users to enhance the history display mechanisms in Git so that rebasing for a clean, linear history became unnecessary?"

I'm not a user of it myself, but I believe this is the philosophy behind how Fossil approaches it:

https://fossil-scm.org/home/doc/trunk/www/rebaseharm.md

Pull requests can serve the same purpose; messy feature branches and a clean main trunk.
The only way you get that in Git is if you squash-and-rebase before merge, though. Which is fine if that's the process and end result that you want, but does (if you keep feature branches "messy") disconnect feature branches from their related merges into trunk from Git's point of view.
Yeah, you're reliant on Github metadata to make those links for you; there's nothing natively in git itself doing it. It's also an all-or-nothing affair, where the whole PR becomes a single squashed commit. To get anything in between ("here's my single large PR which I've rebased into N incremental commits, but you can also dig in and see the work that actually led here"), you really do need first class support in the tool.

I suppose the Github answer to all this would be "just make separate PRs", but going that way asks a lot more of the developer in terms of how polished those incremental states need to be.

Mercurial does this with the Evolve extension.

https://www.mercurial-scm.org/doc/evolution/user-guide.html#...

It still has the individual commits, but the interface will make it appear as if it's just one commit.

The real history is useless. Especially if we have tests. In that case it doesn’t matter how often we make changes.

I do think this is because I prefer to think of code as a black box. No one should need to figure out how my functions work. Someone should just need the name of the function, what inputs it receives, and what output does it return. If someone actually has to read my code, that’s a failure.

> If someone actually has to read my code, that’s a failure.

I can't tell if you're being serious, or are a brilliant troll. :)

Assuming you're serious, Hyrum's Law is one reason I might need to see your code (https://www.hyrumslaw.com/). The signature of your function is not the whole signature, it's just a sketch of the high points.

You really should just need to read the code in case something goes wrong, but otherwise, no. You need to be more careful with your time.
> Who do we really help by pretending that we're more organized, coherent, and linear than we actually were?

You help the reviewer.

To understand why git is the way it is, you have to understand the workflow of the original git-using project (other than git itself), the Linux kernel. Whenever someone proposes a change to the Linux kernel, it's sent as a sequence of patches. Each patch should contain a single logical change, and will be reviewed individually. For instance, suppose you want to change the way a field in a particular structure is stored. The first patch of your series might introduce a couple of helper functions to access the structure fields. Patches 2-5 might each change a separate subsystem to use the new helper functions, instead of accessing the field directly. The next patch changes both the field and the helper functions to use the new representation. When reviewing this sequence, it's easier to see that each patch is correct. And that was a simple example; it's not rare to have patch series with over 15 patches, and even longer patch series are not unheard of. I've seen patch series which refactor whole subsystems, where each patch in the series was an obviously correct transformation, while the final result was completely different.

From the Fossil page: > Rebasing is lying about the project history

This tired hyperbole just won’t seem to ever go away. Please try to ignore this junk, the Fossil devs could and should make their point without the FUD and misleading judgement, if they want to be taken seriously. Rebase has perfectly legitimate uses, and if Fossil makes it so you don’t need to rebase, that’s fantastic.

Rebase is most useful before pushing local changes to other people, and most people fluent in git know this fact, and also know that you don’t rebase public branches, you don’t rebase other people’s commits or your own after they’re pushed, except in emergencies and with team communication.

Rebasing before you push is the same amount of “lying” as typing something into your editor and then deleting it before you hit save. You don’t actually want your history at the raw keystroke level, right? You aren’t “lying” if you fix a bug you wrote before you push the bug into public branches, right?

> Understanding how you got to the goal -- encoding all the fumbles and disoriented thoughts right in the commit history -- that can be a genuine benefit to the reader.

Disagree.

Sorry, but I'd be rather inclined to read commit history like this: (whether it's reviewing others' code or my own at a later time)

- Add functionality X to function y()

- Fix a bug in y(): ...

- Fix a bug in z(): ...

than

- X

- oops

- fuck, typo fix

- do it another way

- ok, y is fixed now

- another typo fix

- it has a bug, fix it

- z has the same bug

- typo fix

The latter is quite common during the dev cycle, but it's the kind of history you keep to yourself. It's not about 'pretending' at all.

I think that's a pretty valid argument about just wanting to rewrite history.

I'll offer an alternative. I love having every commit buildable. When I'm drafting, this isn't going to happen. I'd like to save my work and move between machines more frequently than that. But after a rebase, it's great to only have compiling commits. It makes doing a bisect a lot easier when you're hunting for something.
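
A small sketch of why buildable commits pay off during a bisect. It assumes a toy repo where "the build" is just a check that `status.txt` does not contain the word BUG; the file names, the ten commits, and the check itself are all invented for the example. `git bisect run` only needs a command that exits 0 on good commits and non-zero on bad ones, which is exactly what breaks down when intermediate commits don't build.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Ten commits; the sixth one introduces the "bug".
for i in 1 2 3 4 5 6 7 8 9 10; do
  echo "change $i" >> log.txt
  if [ "$i" -eq 1 ]; then echo "ok" > status.txt; fi
  if [ "$i" -eq 6 ]; then echo "BUG" > status.txt; fi
  git add .
  git commit -q -m "commit $i"
done

# Mark the tip as bad and the root commit as good.
good=$(git rev-list --max-parents=0 HEAD)
git bisect start HEAD "$good" > /dev/null

# The check exits 0 when the bug is absent and 1 when it is present.
git bisect run sh -c '! grep -q BUG status.txt' > /dev/null

culprit=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset > /dev/null 2>&1

echo "first bad commit: $culprit"
```

If some of those commits failed to build for unrelated reasons, the check could not tell "broken because of the bug" from "broken because it was a WIP savepoint", and the search would point at the wrong commit.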

I have found this works a charm, if I want to present a clean repo (for things like tutorials and classes): https://24ways.org/2013/keeping-parts-of-your-codebase-priva...

But basically, I let things "all hang out."

Tools shouldn't really be running the show.

My commit history is often a descent into profane madness.
For my solo projects I break the "don't code in master" rule because there is nobody else to coordinate with, and I usually only work on a major idea at a time. However I still use branches, usually if I want to quickly test out a breaking change, or if I start something I don't anticipate being finished with in a long time, so that my master branch remains usable for other side tangents.

The branching strategy means that it's pretty important that my commits are small, the brief commit message is accurate (even if I occasionally commit too many changes at once) and the description explains my train of thought. Nearly every time, I am communicating those changes to myself in 6 months when I switch into that branch randomly and wonder what I was in the middle of doing.

Rebasing private code to clean up WIP commits and break it into logical steps is healthy and a very good practice. But as Linus himself says in the linked mailing list post[1], just don't rebase public code.

     "In other words, you really shouldn't rebase stuff
     that has been exposed anywhere outside of your own
     private tree. But *within* your own private tree, and
     within the commits that have never seen the light of
     day, rebasing is fine."

     -- Linus Torvalds

[1] https://yarchive.net/comp/linux/git_rebase.html
This applies mainly to projects with a kernel style of development. There are not many of those. In a centralized repo style (GitHub), it's fine to rebase, even force push, as long as you know exactly what you are doing and coordinate with your colleagues.
In a lot of projects, topic branches generally have a single owner and are not used as the basis of other people's branches. Even if they are technically public, they are not public in the sense that he is referring to. If you aren't going to be getting PRs on your branch, you can consider it private and rebase all you like IMO.

edit to add: I generally prefer people not rebase after they've asked for a PR review just because the reference for comments will be lost. If they want to, maybe do it after all the reviews are approved.

I'm going to disagree, unless you are a solo developer (and even then it's bad practice to rebase commits that have already been pushed). Allowing rebase on shared branches just opens the door to too many possible catastrophic mistakes. When I make a new repo the first thing I do is disable history rewriting.
Absolutely. This guy has too many rules.

When I work alone I'm climbing a mountain, and Git is the rope. I can fall, but I won't fall far. I commit as often as I want to. The log is not a story for someone else to read later, it's the way I get to the top.

I find commit logs useful even if I'm the only person ever reading them. I like to be able to git blame on a line and be reminded of the context in which I did something and what I was trying to solve. I don't bother to pretty up my feature branches though, I just squash them so that master has a clear story.
How we do it at work is every single commit message must have a ticket number in it. This is super easy to do and super useful. Even if the commit message is "fix exception #1823" you can go and look up #1823 and see what that issue was to make sure you don't reintroduce it with your change. You will always find more info and context in the ticket than in git commit messages.
I hate this. Often I am rewriting a bad comment, or improving the working code I checked in yesterday whose ticket has already been closed. Deleting an unneeded #include. All kinds of stuff for which there is no open ticket.

This kind of rule prevents people from maintaining the code base as they go. I have literally quit a company because of bullshit such as this. I was a senior engineer and could not fix a typo in a comment without a bug number and two code reviews.

This has been a non-issue for me. I just slap minor fixes in with other tickets even if they are not related. Usually I just drop a comment in the review page with "saw this other issue and fixed it".

Short circuiting the review and QA steps is not ideal. A reviewer should see the change is just a comment typo fix and accept it even if it has nothing to do with the current ticket.

The log is a useful byproduct, but it's not the product.
This is an excellent metaphor, thank you. I’m going to add it to the bag of metaphors I use when explaining git
> It's kind like a snapshot-based local history.

You can extend it to remote history too, because git makes it almost trivial to create a repo that you can work with over the network (without a running server of any kind).

I use git as a fancy rsync sometimes.

I do most of my work on a remote box. I still like to edit locally in an IDE, but occasionally I make a change on the remote side.

On the remote side, I do

git init --bare project.git

and on my laptop

git remote add clusterx remotebox:dev/project.git

Then I do a git clone on the remote box from that bare repo, so I can push changes made there back to it, and when I'm done for the day I can just pull it all back to my laptop with a git pull.
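A rough sketch of that round trip, runnable on one machine. Local directories stand in for the two hosts here, which is an assumption made only so the example is self-contained; over SSH the remote URL would be something like remotebox:dev/project.git as above. The directory names hub.git, laptop and remote-checkout and the file style.css are made up.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Stand-in for the bare repo on the remote box.
git init -q --bare hub.git

# Laptop side: an existing project, pushed to the bare repo.
git init -q laptop
cd laptop
git config user.name "Example"
git config user.email "example@example.com"
echo "body { margin: 0; }" > style.css
git add style.css
git commit -q -m "Add stylesheet"
branch=$(git symbolic-ref --short HEAD)
git remote add clusterx "$work/hub.git"
git push -q clusterx "$branch"

# Remote-box side: clone from the bare repo, edit there, push back to it.
cd "$work"
git clone -q -b "$branch" hub.git remote-checkout
cd remote-checkout
git config user.name "Example"
git config user.email "example@example.com"
echo "h1 { color: navy; }" >> style.css
git commit -q -am "Tweak heading colour on the remote box"
git push -q origin "$branch"

# End of day: pull everything back to the laptop.
cd "$work/laptop"
git pull -q --ff-only clusterx "$branch"
git log --oneline
```

The bare repo never has a working tree of its own, which is why both sides can push to it without git complaining about updating a checked-out branch.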

This used to involve a lot of patch + diff + rsync, but when you build stuff remotely, do diffs, and add new files to the codebase, it is so much easier to just use git.

For my personal projects, I think CSS files are the most common things I've edited in this sort of loop - my web-app folders are generally just git clone --depth 1, which also takes care of the other loop where I edit locally and deploy to remote.

Even when I use git by myself, I like to use branches. This helps me keep my work separate, and avoids issues if I'm working on one thing but need to do a quick fix elsewhere. I also tend to have anything in master set to go live, so branching helps keep things that aren't ready from going live. Even if you don't have a "production" environment you push to, making sure your master branch is only code that works well is a good idea IMO.

And as another reply mentioned, squashing commits is good for keeping your history cleaner. My branches tend to have a ton of "fix" commits that get squashed out when merging into master.

Another reason I personally like branches is because it gives me confidence to make breaking changes without immediately having to worry about regressions.
Is it "terrible"?

If it's working for you to produce software how you need, then it's working.

But I would say it's building up habits of using git that would not transfer well to a multi-person team. That may or may not matter to you.

OP's usage is interesting. I think by and large the habits are transferable to a multi-person team: they are still good habits, or on the _way_ to good habits, or _similar_ to good habits with a multi-person team. The one difference is how much easier it is for a solo developer to "rewrite git history" without disrupting others; in OP we see it done with abandon.

But in general the way OP is thinking about things -- what they are trying to prioritize and how -- applies to a multi-person team too. Keeping commit history readable, keeping branches cohesive, etc.

Your practices are... not. Which doesn't make them terrible, but it means you are developing habits you'd probably have to revise when/if working on a multi-person team.

Yep, small disciplined commits take valuable time. If you rarely revert or get other benefits from them they might be a net loss for you. Especially in solo projects when you can keep a lot of what's going on in your head.

It's a bit like testing - there are a lot of posts about where you need them and not many discussing where you don't.

It is funny because I use git (and commit messages) to help me keep track of what I was working on since I'm a solo developer but also an entire IT department so coding is only a portion of my time. Sometimes I'll just be starting to implement a major feature when something else will come up and I'll have to put it on hold for a few days/weeks. Having the quick little commits helps me figure out where I was and helps me get back into the flow.
If you use “git add -p” it makes small commits pretty painless. I still like small commits in repos I work alone on because it makes reading the history during future debugging easier.
Side conversation because I recognize your username. I've been playing wordoid every day since you posted it three weeks ago. You made a comment about having heard that someone scored 3000, and I think that's now in my mind as an end goal. I've gotten to about 1800 and can't quite let go yet. :)

https://news.ycombinator.com/item?id=25999655

Great, glad it's a fun distraction and that's a better score than I can get. :) Feel free to contact me outside HN as well if you can think of any improvements (global high score tables are sounding good!).

More on topic, when coding the game, I was Git committing maybe every hour or so without useful commit messages and didn't have a problem. With games (in the early stages anyway), I find you're typically changing lots and lots of small things all over the place to tweak the gameplay and presentation in an experimental way, so granular commits aren't helpful.

I would switch to more granular commits now though since the game has stabilised more.

If you work with branches, can you merge with the --squash option? This makes one neat commit on your default branch. You could even then commit without the -m option, and type a more descriptive multi-line commit message detailing the changes you've made.
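A minimal sketch of that flow in a scratch repository. The branch name topic, the file app.txt and the messages are hypothetical. In real use you would leave off -m on the final commit so the editor opens for a longer message; two -m flags are used here only to keep the script non-interactive.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial version"
main=$(git symbolic-ref --short HEAD)

# A messy topic branch with throwaway commit messages.
git checkout -q -b topic
for msg in "wip" "oops" "typo fix"; do
  echo "$msg" >> app.txt
  git commit -q -am "$msg"
done

# Back on the default branch: stage the combined change without committing it.
git checkout -q "$main"
git merge -q --squash topic >/dev/null

# One commit with a subject line and a descriptive body.
git commit -q -m "Add feature X to app.txt" \
  -m "Combines the wip, oops and typo-fix commits from the topic branch."
git log --oneline
```

Note that git does not record topic as merged after this, so deleting the branch later needs git branch -D rather than -d.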

I only work on little solo projects and this is what I'm doing. It makes a very readable history, and helps me answer "why on earth did I do that?", but it's harder to revert small changes later.

If I'm working with others, I try to match my committing style to the project.

Squashing makes tools like git blame or emacs’s vc-annotate a lot less useful: with small commits, I can reconstruct the code as it was when a particular line changed; with a squash, the coordinated changes are a lot less useful.
Without squashing git blame has too much noise in it for my taste. I don't want to see 90 different commits in a single file's blame, when they were actually related to 9 different features. If each topic branch has a reasonable scope then the squashed changes I think are more useful than each little tweak or fixup.
If you really want to do it nicely, you get to the end, move all of the committed changes back into the uncommitted state, and then recommit them in logical steps, piece by piece, with well-written messages.
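One way to do that, sketched under the assumption that the logical steps happen to line up with separate files (otherwise git add -p does the same job hunk by hunk). The file names and messages are invented for the example.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "parser v1" > parser.txt
echo "docs v1" > docs.txt
git add parser.txt docs.txt
git commit -q -m "Initial"
base=$(git rev-parse HEAD)

# Messy work: every commit touches both files.
echo "parser fix" >> parser.txt
echo "docs update" >> docs.txt
git commit -q -am "interim"
echo "parser fix 2" >> parser.txt
echo "docs update 2" >> docs.txt
git commit -q -am "more"

# Drop the commits after $base but keep their changes in the working tree.
git reset -q "$base"

# Recommit in logical steps with proper messages.
git add parser.txt
git commit -q -m "Fix parser edge cases"
git add docs.txt
git commit -q -m "Update docs for parser fixes"
git log --oneline
```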

But at some point you are spending more time bookkeeping than the actual value you will get from it. If it's a personal repo, don't bother. If you are sending a patch to Linus, tidy your commit messages.

Pretty much what I do too, even working in collaborative projects. One nice thing about git is that it makes it easy to go back and clean up your history with rebase before publishing it to others, so you can make as big of a mess as you want in your local branches without anyone else having to see it.

So yes, having "interim" or "wip" commits would be an anti-pattern in a shared repo, as it makes it harder for others to see what changes you made. For a local branch, though, it's not a big deal.

So maybe that's the idea for my projects that I make available to others. Be a bit more deliberate with branches, allow them to be junky, and clean up when I merge to master. That seems like the best of both worlds with minimal effort. I think I'll even try it.
That has been how I have started to do it. I make a new branch, make a mess of it (until I am finished), then merge it back into the original "golden" branch.
> Is this a terrible coding practice? I don't have enough non-me experience to know what an anti-pattern this probably is. I probably won't change my process, but I'm curious.

For solo development? I don't think it's good, but ultimately you should do whatever works for you. When you work on a team, though, it might be hard to break the habit later if this is what you're used to, and you really really will need to. Nobody wants to see "snapshot" commits in a shared repo; commits will need to actually accomplish a clear goal. Also, I find it very helpful to be able to make independent changes in separate commits (sometimes I see something wrong that's unrelated to whatever I'm doing), then reorder (rebase) them to polish them sometime before pushing. If you don't get in the habit of making your commits somewhat orthogonal, you won't be able to do these kinds of things (whether on teams or solo).

I think what you're doing is fine especially if you're just coding for yourself. I'm perfectly happy with slightly more expressive commit messages like "LoadData appears to be working; still hacking away at TransmorgifyData", followed eventually by "stabilized TransmorgifyData".

I think it's when you start syncing with other people over multiple days that people start insisting that a commit should compile, be atomic, be tested, etc. What they're really looking for at that point is that incoming changes be easy to understand and modular and possibly easy to omit if some code change is causing them trouble for a moment.

As a solo developer who works across several machines, I still use subversion to coordinate code. I have yet to see the advantage of using git in this situation.
As long as you're not still using CVS -- that would just be masochistic. :)

If it's working, stick with it. Most people use Git as a centralized RCS anyway. I like the decentralized features of Git (darcs, fossil, hg, whatever), but mainly for short-term problem solving -- on any project, eventually an official hub emerges.

Yeah, I'm pretty much the same. Still, there are elements of the article's approach that I follow.

  Principle 2a: Every commit must include its own tests
  Principle 2b: Every commit must pass all tests

But otherwise, I don't create branches. My commits are medium sized, one big thing, and to the trunk. My commit messages are at best ok.

After a commit, I git commit --amend liberally. It's never really clear in my mind when a commit ended and the next one starts. This wouldn't fly in a group.

The one thing I'd recommend is Never Type git. That's overstating, but basically git's command line syntax is just terrible AND thus dangerous. I think my one moderate sized git screwup was due to the command line syntax. So now I hide (most of) it behind shell aliases. This guy goes a bit far but you get the idea:

https://github.com/ohmyzsh/ohmyzsh/wiki/Cheatsheet

I pull rarely enough that I prefer to type it out.

Also, configure a good diff tool (although Apple seems to reject kdiff3 for now). And .gitignore goes without saying.

Regarding diff tools: suggestions welcome. I've used VIM's three-way diffs, and would prefer to stay in the CLI. But the Jetbrains IDEs come with a great GUI diff tool for Git merge conflicts. I'd love to find something comparable on the CLI.
I use kdiff3 but I only want a visual diff. I'm not using it as a merge tool. VS Code has great git support and great diff support. If it only had better (or rather, more accurate) vim support I'd be there but every time I try, I head back to standard vim with something like VimR. It's been six months and I should try again.
Thank you.
>Is this a terrible coding practice?

Are you the only consumer of the practice? And do you like it? Then no, it's not terrible at all, it's useful. Git will function just fine for this. I do similar things with my "experiment" repos, they're practically "streams of thought saved to disk" and they contain a ton of digressions and occasional breakages and that's totally fine. I have zero complaints after several years of doing this.

The major benefits to much-more-structured approaches come in the form of automated tooling that's really only useful when you have large repos or many contributors (git bisect is a perfect example), or external automation (ci/cd pipelines, etc). For those kinds of repos, yeah, I'd say it's a terrible practice, and it'll cause some easily-avoided pain. But even then: work however you like on a branch, and merge (or squash) when you have "good" stuff, and it generally works well.

I do that on shared projects too. I absolutely will not end my day with work that exists only on my machine, and git is a fine place to put it, as far as I'm concerned. I routinely make a branch called "phil/stash" that I will commit totally broken code to at the end of the day. Then I rebase/amend it into shape when I'm ready to PR.
Just squash the junk commits with rebase when you are done. It keeps the history clean and you have many points to revert to.
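A sketch of one non-interactive way to do this, using fixup commits plus --autosquash. That is a substitution for illustration: the usual route is a plain git rebase -i where you mark the junk commits as squash or fixup by hand. GIT_SEQUENCE_EDITOR=true just accepts the generated todo list unedited, and the file name and messages are made up.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "one" > notes.txt
git add notes.txt
git commit -q -m "Initial"
base=$(git rev-parse HEAD)

echo "two" >> notes.txt
git commit -q -am "Add feature"
feature=$(git rev-parse HEAD)

# Junk commits, recorded as fixups of the feature commit.
echo "three" >> notes.txt
git commit -q -a --fixup "$feature"
echo "four" >> notes.txt
git commit -q -a --fixup "$feature"

# Fold the fixups into "Add feature"; the todo list is accepted as generated.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash "$base"
git log --oneline
```

Until the rebase runs, every fixup commit is still a point you can go back to, so nothing is lost by squashing late.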
Same thing for me.

Generally I am against rewriting history unless there is a big mess to fix. For me, git is my work process, and bugs, typos, bad merges and code that doesn't compile are part of it and I don't try to hide it. Personally, I value historical accuracy more than cleanliness.

But some people have compelling arguments for the opposite, like the author. These people tend to view git as a release schedule where every commit is workable code. It is good for bisecting, and git log is your actual changelog. But you lose information about how you solved a problem, when you did what, etc... it is also more time consuming to maintain.

You can use a hybrid solution with two parallel branches and merge commits, or you can just use tags.

Git is not very opinionated on how you should work. Merge or rebase, clean or historically accurate, push or pull, etc... There is more than one way to do it.

> Is this a terrible coding practice?

Nope. I have been versioning everything from C# to SQL for more than a decade and it saved me many, many times. With Subversion too, which is far less evolved and modern than Git.

The advantage of mastering a complex tool like git and maintaining a central repository is the increased granularity of commits/branches and clarity of versioning, but if the "snapshot here and there" approach works for you, then use it.

Some of the sophistication you are 'missing' is there as a solution to scaling problems, to business problems (do you need to track/fix customer/client issues?), or to other problems that may not be as critical for solo devs.

Some of the rest is best practices that you are missing out on.

You might want to take a look at what some 'best-practices' are and see which might improve your coding.

Simple things like tagging a commit as "feature/fix/refactor/chore" might make you think differently about your programming workflow. Or you might find it more of a distraction and limitation than a help.

And yes, sometimes you certainly need that 'interim' tag to freeze work, for those rare cases where you run out of time or inspiration before you get to a natural end point of a task.

For me, those cases of running out of time or inspiration before I get to the end of the task are incredibly common (basically a daily occurrence). I understand it might be rare for you, but for me (and I would think others) it's the default state of programming.

When I'm really getting going I flow through a ton of work and only stop when I hit a time limit, so I expect to finish in the middle of a task whenever I start coding.

Yeah I'm 13,000 lines into a solo side project and haven't bothered with a single branch+merge. I've got a bunch of tests, but I don't test everything, and I don't make sure every commit has tests. Commits are mostly checkpoints of when a major feature achieves some kind of initial stability where I want to be able to diff back to last-known-working. I try to do better commits than "WIP" but they're something like "such and such feature now seems to actually work (lots of buggy edge conditions)". I'll throw a lot of unrelated code cleanup into single commits as well. I focus on moving the needle on the end results though and not on having a perfect process. Many bits of code that I have which work well enough don't have any tests at all. As I hit bugs I drill into code and fill out the tests that I didn't do. Simple code that is used all over the place and doesn't cause issues may not have any formal tests at all. I rarely actually bother to go back into my own git history, mostly I just use it like a quicksave in case I wind up dying at the next bossfight.

I think what'll get you in trouble more than having perfect process is writing spaghetti code that violates separation of concerns. If things are separated well, you should be able to come back and test it easily if it causes issues. Test the stuff that is complicated and obviously will cause issues if it's not perfect. Test the stuff that is found to be buggy or needs to be proven to be not buggy in order to track down bugs. Don't bother with perfectly testing everything.

I've been adding a threaded AVL tree implementation lately. I definitely tested that extensively and did a savegame when the AVL tree was written and passing tests properly, and then added threads and did another couple of savegames. I'm going to build on top of that, and I need to be able to trust it without falling back into debugging it. I've got a Clamp01 function though which takes a double and ensures it is within 0 <= x <= 1 and I don't have that one tested. I'm pretty confident it works though.

Sounds exactly like what I'm doing.

What I wondered was if and when other developers deviate from that workflow. After the first release? After the first collaborator has joined? Never?

Are there textbook developers who use a strict strategy like the one Daniel Stenberg [1] is following from day 1?

[1] https://daniel.haxx.se/blog/2020/11/09/this-is-how-i-git/

I do think that after you release and more or less go "1.0" (even if you don't call it 1.0) you should start treating master as always-releasable. At that point if you have a lot of work to do and need checkpoints, do it on a branch. Same with major breaking change features. Keep master always ready so that you can release for bumping your upstream deps, releasing security fixes, or other interrupt driven housekeeping to stay current.
I mostly do solo work too and for me the main goal is less to make code readable for other people, and more for code to be readable for myself a year from now.

I’ve definitely had situations where I reverted code to a year back to check if there was a bug. Git was very helpful from that point of view.

That has been my pattern for some hobby projects, however it quickly becomes a mess to keep track of what you have done over a period of time (I have the same bad habit, I commit things so I can freeze my code in case I mess something up).

My hobby project has been to extend a program with a new plugin, but then along the way I found bugs in the core code, and I wanted to upstream the fixes (and the plugin). I am very glad I knew what the fixes were, because I more or less had to rebase to the original code in order to untangle the mess of commits I made.

I have also found other folks wanting to use my code, so it is also much more helpful if outside folks can see how I altered the original program.

Absolutely nothing wrong with this if you are developing solo. As suggested by a sibling comment, you can always squash if you are looking for a cleaner history, though this probably isn't necessary if you are never going to share your code. If you are going to share and your important commits are clean, then it's easy to squash.

Another good trick is to simply stage things to "freeze" them: you can then `git checkout` any changed files if you want to revert to the staged state. This is useful if you are in a working state without a lot of changes but want to run a quick experiment before committing.
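A tiny sketch of the staging trick, with a made-up file config.txt. The point it relies on is that `git checkout -- <file>` restores the file from the index, not from the last commit, so whatever was staged survives the experiment.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "stable" > config.txt
git add config.txt
git commit -q -m "Initial"

# Get to a working state and "freeze" it by staging, without committing.
echo "working state" > config.txt
git add config.txt

# Quick experiment on top of the frozen state.
echo "risky experiment" > config.txt

# Experiment failed: restore the file from the index (the staged version).
git checkout -- config.txt
content=$(cat config.txt)
echo "config.txt now contains: $content"
```

The staged change is still uncommitted afterwards, so it can be committed or experimented on again.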

I don't think it's an antipattern if you're working by yourself. As you said, it's a safety net that helps your confidence in case something does go wrong. The only thing I'd recommend changing might be to make your commit messages a tiny bit more detailed, even for the interim commits - that way you know what's going on at each commit if you do have to eventually do a `git bisect` to hunt down a regression.

Going a step further and rebasing interactively to tidy up your logs would also accomplish the same goal, but if it's just for your own eyes, it's probably not worth the time.

As long as the commits compile, it's a great coding practice. Unfortunately most development teams haven't caught up to what git actually helps with (and doesn't help with) yet and will block you from doing this, but if you can find a team that's open to actually testing out different practices and seeing what works better then it will serve you very well.
> Is this a terrible coding practice

It is not. I do the same.

I tried to use git the way OP describes and I was taking more time to manage the logistics than to code.

I then hit the "I can obviously call a function in a new commit from one in a previous one". Handling this gracefully means full time work.

My code works 70% of the time anyway, so I ended up putting my effort into moving that up a few percentage points rather than into having a byzantine git tree.

I do the same. I use branches almost exclusively for features that won’t be done for a while and won’t make it into production until then.
If it works for you it's fine. I use Git in different ways depending on the project: on some solo projects I do like you say (similar to saving a game to be able to go back). For some other solo projects (with a longer lifespan or more critical) I follow Gitflow and I am more strict with the process.
> Is this a terrible coding practice?

Not at all. Copying folders with names like code1, code2, code3 is terrible practice. Using SCM and committing as checkpoints while you work is good coding practice. You have a process, and it works for you.

I don’t think it’s an anti-pattern but you might just need some sort of backup tool with incremental backups and rollback. Sounds like it would likely suit your needs with less overhead.
I do the same if I am doing an assignment or writing a single script.

I'd prefer something close to his if I am writing a library or a small app.

If using this pattern it helps to split out your commits, if possible. I do this with changelists in IntelliJ/PyCharm.
I thought I was the only one using git like this.