This is also a problem for enterprises. I’ve seen commits from root, ec2-user, etc: GitHub knows who’s pushing a commit even if git doesn’t, and it’s maddening that at least for enterprise accounts they don’t carry that identity into the metadata.
That would change the commit hash, at least if you want it to survive a clone of the repo. If you stored it externally, so that it could only be shown in the web UI, it would be of limited use, but maybe better than nothing.
I feel the commit data could be extended to include some metadata that isn’t used to compute the hash. GitHub could then make use of this data to populate whatever.
(Not sure if such a field already exists in the commit blob)
I’m somewhat familiar with how git works. In my understanding, a commit is just a blob combining the commit information and a tree blob, hashing them together to create a commit id.
This design doesn’t preclude the usage of additional information in the commit blob that isn’t used to compute the hash.
(Think for example how file access times do not affect its hash)
Git is a content-addressed object store: the address of any stored object is the hash of the object itself. So you can't stuff extra data into an object without changing its ID; this auxiliary data would need to go in a separate store indexed by object ID, or a similar solution. File access times don't affect git hashes because git does not store them.
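To make that concrete, here is a minimal Python sketch of how an object ID is derived in a SHA-1 repository: the hash covers a short header plus every byte of the object. The commit body, the author line, and especially the `pusher` header are made up for illustration; git has no such header.

```python
import hashlib

def git_object_id(obj_type: str, body: bytes) -> str:
    """ID of a git object in a SHA-1 repo: sha1("<type> <size>\\0" + body)."""
    header = f"{obj_type} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()

# Sanity check against a well-known value: the ID of the empty blob.
assert git_object_id("blob", b"") == "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"

# A made-up commit body pointing at the empty tree.
commit = (
    b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    b"author Alice <alice@example.com> 1700000000 +0000\n"
    b"committer Alice <alice@example.com> 1700000000 +0000\n"
    b"\n"
    b"Initial commit\n"
)

# Hypothetical extra header recording who pushed (not a real git field).
with_pusher = commit.replace(b"\n\n", b"\npusher alice-on-github\n\n", 1)

original_id = git_object_id("commit", commit)
annotated_id = git_object_id("commit", with_pusher)
print(original_id != annotated_id)  # the extra bytes yield a different ID
```

Anything that should not affect the ID has to live outside the object, keyed by that ID. Git notes work roughly that way, but they are separate refs that a plain clone does not fetch by default.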
Correct. My suggestion for a solution is for github to add a "reject-unsigned" feature. Only allow commits signed by <my gpg key> and <my email> to be pushed to github, under any projects/org.
1. What happens when someone needs to resolve a merge conflict involving your commit? Let's say I maintain a fork of an open source repo to add some feature, and I periodically merge back in upstream changes... that necessarily involves resolving conflicts. By default, git retains author ownership, and now the commit is unsigned, but it's really your work. What do we do? Do I have to use a custom merge flow that also rewrites authorship from "Alice <alice@gmail.com>" to "Alice <alice@gmail.com.fake.unsigned.suffix>"?
2. What happens if your gpg key is compromised or expires? Are all your previous repos now invalid? I can't fork it because it contains a commit authored by you, but with a revoked or expired gpg key?
3. What happens to previous commits if I enable this feature? Can all my unsigned commits no longer be pushed to github? I made a commit in a project at work 5 years ago with my email, but didn't sign it.. if that company wants to open source that project on github, do they now have to rewrite history to change the author on my unsigned commits?
4. What does the "squash merge" button on github PRs do for your PRs?
We've adopted this policy internally. It largely requires using the proper git workflow, rather than (in our case) gitlab's GUI flow, though we make an exception for a simple merge (I think it's verifiable that no code is added).
The check is that commits at the point of merge have a valid signature. Historical commits are part of the history and as such cannot be changed without an additional commit (with a valid signature). Previous unsigned commits are deemed trusted at the point you begin signing and checking.
Squash merge breaks stuff and shouldn't be used. To complete things, the restricted set of operations exposed through the GUI should sign using gitlab's or github's key (or some accepted bot key we've set), with the check happening on the input commits, but AFAICT that's not supported yet.
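For what it's worth, the rule described above boils down to something like the following Python sketch. The `Commit` record, the key fingerprints, and the `grandfathered` set are hypothetical names for illustration. Actual signature verification is assumed to happen elsewhere (for example via `git verify-commit`); only its result is modeled here.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set

@dataclass(frozen=True)
class Commit:
    sha: str
    # Fingerprint of the key behind a *valid* signature, or None if the
    # commit is unsigned or its signature failed verification.
    signer: Optional[str]

def blocking_commits(
    incoming: Iterable[Commit],
    trusted_keys: Set[str],
    grandfathered: Set[str],
) -> List[str]:
    """Return the SHAs that should block a merge under the policy.

    Commits that existed before the policy was switched on are listed in
    `grandfathered` and are trusted as-is; every other commit needs a
    valid signature from one of `trusted_keys`.
    """
    return [
        c.sha
        for c in incoming
        if c.sha not in grandfathered and c.signer not in trusted_keys
    ]

branch = [
    Commit("a1", None),           # old unsigned commit, predates the policy
    Commit("b2", "KEY-ALICE"),    # signed by a trusted key
    Commit("c3", None),           # new unsigned commit
    Commit("d4", "KEY-MALLORY"),  # signed, but not by a trusted key
]
print(blocking_commits(branch, {"KEY-ALICE"}, {"a1"}))  # ['c3', 'd4']
```

A bot key for GUI merges would simply be one more entry in `trusted_keys`.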
A blockchain could be useful for proving a commit was not done significantly before or after the claimed timestamp, and (depending on how you do it) non-repudiation.
IMO I don't see how this is desirable enough to be worth it.
Wider adoption of GPG signatures would be much more impactful.
It could make sense to use a blockchain for key distribution/key discovery/revocation, effectively as a replacement of PKI, though.
They have something under Settings > SSH and GPG keys where you can enable Vigilant mode.
While that still allows pushing unsigned commits, it will flag them with a warning badge.
I had this on for a while, but unfortunately some open source projects tend to rebase commits before pushing them, and that caused warnings to be shown (the rebase breaks my signature). So I turned it off again, so as not to scare people who look at a project's commit history and see warnings on my contributions after they were merged in.
Good to know, I was not aware.
The squash/rebase issue is definitely problematic, though a tree of signatures could be appended to each commit. Now... this does break how commits are currently signed.
A git repo is practically a blockchain. Fixing this will require changing how git treats signatures, but no additional parallel architecture needs to be created.
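The chain property is easy to see in a small Python sketch: each commit ID covers the parent ID, so replaying the same changes onto a different base (a rebase) produces entirely new IDs, and a signature made over the old commit objects no longer applies. The author line, messages, and base ID below are placeholders, and every commit reuses the empty tree to keep things short.

```python
import hashlib
from typing import List, Optional

EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"

def commit_id(tree: str, parent: Optional[str], message: str) -> str:
    """SHA-1 ID of a simplified commit object (placeholder author/committer)."""
    lines = [f"tree {tree}"]
    if parent is not None:
        lines.append(f"parent {parent}")
    lines += [
        "author A <a@example.com> 0 +0000",
        "committer A <a@example.com> 0 +0000",
        "",
        message,
        "",
    ]
    body = "\n".join(lines).encode()
    return hashlib.sha1(f"commit {len(body)}\0".encode() + body).hexdigest()

def build_chain(messages: List[str], base: Optional[str]) -> List[str]:
    """Stack one commit per message on top of `base`, returning their IDs."""
    ids, parent = [], base
    for message in messages:
        parent = commit_id(EMPTY_TREE, parent, message)
        ids.append(parent)
    return ids

messages = ["add feature", "fix typo", "add tests"]
original = build_chain(messages, None)
rebased = build_chain(messages, "0" * 40)  # same changes, different (fake) base

# No commit keeps its ID once the base changes.
print(any(o == r for o, r in zip(original, rebased)))  # False
```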
I lost a Yubikey, so I revoked the subkeys on it. Github didn't let me update the existing pubkey, so I had to remove and re-enter it. Now all things signed by the revoked key, despite being before the revoke date, come up as unverified.
I'm not sure if it's a me issue or a PGP issue or a GitHub issue, but it's a pretty broken system.
How would that work for past commits? Would people be forbidden to mirror a project to git just because it contains an unsigned commit of mine from 2007?
It would only allow commits signed by me to be pushed under my email.
Github uses the email as the "proof" of commit ownership. By only accepting signed commits a user would not be able to push a commit impersonating me.
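A rough Python sketch of the check I have in mind, purely hypothetical: the `protected` mapping (email to registered key fingerprints) and the tuple layout are invented here, and signature verification itself is assumed to have already happened.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

# (sha, author email, fingerprint of a valid signature or None)
PushedCommit = Tuple[str, str, Optional[str]]

def rejected_commits(
    push: Iterable[PushedCommit],
    protected: Dict[str, Set[str]],
) -> List[str]:
    """Return SHAs that a hypothetical "reject-unsigned" rule would refuse.

    Only emails whose owner opted in appear in `protected`; commits under
    any other email pass through untouched.
    """
    rejected = []
    for sha, email, signer in push:
        allowed = protected.get(email.lower())
        if allowed is not None and signer not in allowed:
            rejected.append(sha)
    return rejected

protected = {"me@example.com": {"KEY-ME"}}
push = [
    ("1", "me@example.com", "KEY-ME"),    # really me
    ("2", "me@example.com", None),        # unsigned, claims to be me
    ("3", "other@example.com", None),     # someone who has not opted in
    ("4", "Me@Example.com", "KEY-EVIL"),  # signed, but not with my key
]
print(rejected_commits(push, protected))  # ['2', '4']
```

As written, this would also refuse an old unsigned commit carrying a protected email, which is exactly the 2007 question; a real feature would need some cutoff date or exemption for that.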
That solves impersonation, but that is not a related problem here.
These repos were not taken over but cloned and made to look like another repo via similar naming.
I think what you're looking for is more like "all accounts must be verified via payment/identity"; then you really know who is making "random clones" and "lookalikes" with malware.
But you've got a whole host of other problems in the process.