by wyuenho 1586 days ago
Every time I was tempted to do something like this, I hesitated because I didn't want every other line in every file attributed to me in a single commit, mostly to avoid making git blame harder to use than necessary. It would be nice if there was a kind of diffing algorithm that can diff code units *syntactically* across history.
6 comments

You can tell "git blame" to ignore specific commits which helps a lot here: https://www.moxio.com/blog/43/ignoring-bulk-change-commits-w...
The problem with this approach is that the blame before and after the ignored commit wouldn’t make any sense to a viewer who didn’t know the formatting commit was being ignored. Also, you need to configure it for every clone. Since tree diffing algorithms are pretty well known these days, I don’t know why there hasn’t been any real effort to implement a git plugin that can chase syntax tree node changes instead of doing string diffing like it was the 70s. Syntax parsers are so easy to write now, and surely the tree node changes can be cached. Your usual diff/patch tooling wouldn’t work for this kind of diff, but the plain text diff is just an option away when you need it back.
Here’s a script that automates the once-per-repository local setup of this feature:

https://github.com/ipython/ipython/pull/12091/files
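
For reference, the per-clone setup boils down to one config setting. A minimal Python sketch, assuming the ignore file is named `.git-blame-ignore-revs` as mentioned in this thread and Git is 2.23 or newer. The helper names (`ignore_file_text`, `config_command`, `configure_clone`) and the commit hash are made up for illustration and are not taken from the linked script:

```python
from __future__ import annotations

import subprocess

IGNORE_FILE = ".git-blame-ignore-revs"


def ignore_file_text(revs: dict[str, str]) -> str:
    """Render the ignore file: a '#' comment line, then one full commit hash per line."""
    lines = []
    for sha, reason in revs.items():
        lines.append(f"# {reason}")
        lines.append(sha)
    return "\n".join(lines) + "\n"


def config_command(ignore_file: str = IGNORE_FILE) -> list[str]:
    """Command that makes plain `git blame` read the ignore file in this clone."""
    return ["git", "config", "--local", "blame.ignoreRevsFile", ignore_file]


def configure_clone(repo_dir: str) -> None:
    """Apply the setting to one clone; it has to be repeated for every fresh clone."""
    subprocess.run(config_command(), cwd=repo_dir, check=True)


# Placeholder hash (hypothetical), standing in for the real formatting commit.
print(ignore_file_text({"0123456789abcdef0123456789abcdef01234567": "Autoformat with black"}), end="")
```

Calling `configure_clone(".")` inside a checkout runs `git config --local blame.ignoreRevsFile .git-blame-ignore-revs`. For a one-off, `git blame --ignore-revs-file .git-blame-ignore-revs <file>` does the same without touching config.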

Unfortunately there isn’t support for it in GitHub or GitLab yet, but there’s at least a GitLab issue here requesting it:

https://gitlab.com/gitlab-org/gitlab/-/issues/31423

This is a nice feature, but I do wish that .git-blame-ignore-revs was automatically applied, similarly to .gitignore and .gitattributes. Hopefully there are plans to do so in a future Git release?
Not everyone uses PyCharm, but if you do it's really easy to highlight a specific code block and look through the git commit history for that section. I've used it many times for this exact type of problem, trying to find when the last substantive change happened.

To do this just highlight the block, right click, and choose Git > Show History for Selection.

The best way to do this is to rewrite history with `git filter-branch` or a similar tool and rerun black at every commit. Then everyone nukes their clone and you continue on with the best of both worlds.

The only real downside is you nuke your issue tracker at the same time.

That’s correct. Which is a shame.
In my experience it's better to just bite the bullet and do it. Eventually you will do it, so you either screw up git blame for a small codebase with a small amount of history, or wait until it is a large codebase with a large amount of history to screw up.

> It would be nice if there was a kind of diffing algorithm that can diff code units syntactically across history.

There have been quite a few attempts at that though I've only seen them applied to resolving merge conflicts. It would be interesting to try them for blame too.
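
As a rough illustration of the idea for Python sources: the standard-library `ast` module discards formatting when it parses, so comparing parsed trees per top-level function or class shows which units changed in substance rather than in layout. This is a toy sketch, not a blame tool; `unit_fingerprints` and `changed_units` are hypothetical names, and comments are invisible to the comparison:

```python
from __future__ import annotations

import ast


def unit_fingerprints(source: str) -> dict[str, str]:
    """Map each top-level function/class name to a dump of its syntax tree.

    ast.dump() omits line and column numbers by default, so whitespace,
    line breaks and quote style do not affect the fingerprint.
    """
    tree = ast.parse(source)
    return {
        node.name: ast.dump(node)
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    }


def changed_units(before: str, after: str) -> set[str]:
    """Names of units that were added, removed, or changed syntactically."""
    old, new = unit_fingerprints(before), unit_fingerprints(after)
    return {name for name in old.keys() | new.keys() if old.get(name) != new.get(name)}


before = "def f(a,b):\n    return a+b\n\ndef g():\n    return 1\n"
after = "def f(a, b):\n    return a + b\n\n\ndef g():\n    return 2\n"

# f was only reformatted; g really changed.
print(changed_units(before, after))  # prints {'g'}
```

A syntax-aware blame would need to run a comparison like this between each commit and its parent and skip commits where nothing changed, which is where the caching mentioned elsewhere in the thread would come in.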

Does the user matter? As long as the commit message is something sensible like 'Autoformat with black' it can be easily ignored when seen, and you can avoid seeing it with blame as simonw suggests.
The problem is that this revision will override all the previous ones in the “blame” output so it needs to be explicitly ignored. See a great link elsewhere in the thread on how to deal with that in newer versions of git.
Yes, as I said?

My point was that the user doesn't matter (vs. anything else about the commit) to me in any context that I see it.

And then I mentioned, without reiterating it, the advice about hiding the commit from blame, just as you did.

In any context where I see "OJFord committed 'Autoformat with black'" for this, it's not 'OJFord' that's the problem IMO.

Git blame has a feature just for this: `git blame -w -M`. `-w` ignores whitespace changes. `-M` isn't really necessary, but it detects lines moved within a file so they aren't attributed to the commit that moved them.
On the flip side you can get an intern to commit. /s.

Probably best to just make a one-time git user to do it.