| HN Mirror

	by drekipus 1420 days ago
	This is that thing where people can put anyone in as the commit author, thus impersonating the original creator, right? Seems like the solution is "don't just copy random GitHub URLs into your code"?

3 comments

macintux 1420 days ago

This is also a problem for enterprises. I’ve seen commits from root, ec2-user, etc: GitHub knows who’s pushing a commit even if git doesn’t, and it’s maddening that at least for enterprise accounts they don’t carry that identity into the metadata.

iforgotpassword 1420 days ago

That would change the commit hash, at least if you want it to survive a clone of the repo. If you stored it externally, so that it could only be shown in the web UI, then it's of limited use, but maybe better than nothing.

macintux 1420 days ago

Specifically it could be exposed through the GitHub REST API without impacting the commit itself.

xrisk 1420 days ago

I feel the commit data could be extended to include some metadata that isn’t used to compute the hash. GitHub could then make use of this data to populate whatever.

(Not sure if such a field already exists in the commit blob)

Karellen 1420 days ago

> I feel the commit data could be extended to include some metadata that isn’t used to compute the hash.

That's not how git works.

xrisk 1420 days ago

I’m somewhat familiar with how git works. In my understanding, a commit is just a blob combining the commit information and a tree blob, hashing them together to create a commit id.

This design doesn’t preclude the usage of additional information in the commit blob that isn’t used to compute the hash.

(Think for example how file access times do not affect its hash)

Karellen 1420 days ago

If it's not part of the information that's hashed to create the commit id, it's not part of the commit. By definition.

formerly_proven 1420 days ago

Git is a content-addressed object store: the address of any stored object is the hash of the object itself. So you actually can't stuff extra data into an object and not change its ID; this auxiliary data would need to go in a separate store indexed by object ID, or a similar solution. The reason file access times don't affect git hashes is that git does not store them.
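
A minimal sketch of the point above, assuming a repository that uses git's default SHA-1 object format (object ID = SHA-1 of `<type> <size>`, a NUL byte, then the body). The commit body, the `pusher` header and the `side_store` dict are made up for illustration; they are not real git or GitHub features.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Hash an object the way git does in SHA-1 repos: header, NUL byte, body."""
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


# Sanity check against a well-known value: the blob for "hello world\n".
hello_id = git_object_id("blob", b"hello world\n")

# A made-up commit body (author and timestamps are placeholders).
commit = (
    b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    b"author Alice <alice@example.com> 1700000000 +0000\n"
    b"committer Alice <alice@example.com> 1700000000 +0000\n"
    b"\n"
    b"Initial commit\n"
)
commit_id = git_object_id("commit", commit)

# Adding a hypothetical "pusher" header changes the bytes, so the ID changes.
with_pusher = commit.replace(b"\n\n", b"\npusher octocat\n\n", 1)
with_pusher_id = git_object_id("commit", with_pusher)

# Auxiliary data kept outside the object, keyed by its ID, leaves the ID alone.
side_store = {commit_id: {"pusher": "octocat"}}

print(hello_id)  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
print(commit_id != with_pusher_id)
print(side_store[commit_id]["pusher"])
```

The side store is the same idea as keeping the extra data in a separate place indexed by object ID: the commit bytes, and therefore the commit ID, never change.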

jwilk 1420 days ago

I think you want git-notes (or something similar):

https://git-scm.com/docs/git-notes

"Adds, removes, or reads notes attached to objects, without touching the objects themselves."

jollybean 1420 days ago

What is the difference between a 'random' and 'non random' repo?

The whole point of 'Open Source' is that we can use code which might otherwise be a bit 'random'.

It's not 'Institutionalized Open Source' it's just 'Open Source' i.e. we're not all Torvalds.

So credibility etc. is a very fickle thing. Otherwise, this is a serious security issue and we really don't have answers.

We used to think about code as 'logic that works' but now we have other criteria. I wonder if our FOSS models need to adapt a bit.

drekipus 1420 days ago

It's a good point actually.

I suppose the message is "read the code you're using" but that is hard for big libraries and frameworks.

Obviously, using someone's code when they are impersonating someone else is a big red flag.

jollybean 1419 days ago

Reading the code for functional integrity is already a big deal, but having to sleuth around for the sneaky hacks? No way.

I don't know what the answer is, but the model has to be changed.

stevelacy 1420 days ago

Correct. My suggestion for a solution is for github to add a "reject-unsigned" feature. Only allow commits signed by <my gpg key> and <my email> to be pushed to github, under any projects/org.

TheDong 1420 days ago

Let me ask a few questions about this scheme:

1. What happens when someone needs to resolve a merge conflict involving your commit? Let's say I maintain a fork of an open source repo to add some feature, and I periodically merge back in upstream changes... that necessarily involves resolving conflicts. By default, git retains author ownership, and now the commit is unsigned, but it's really your work. What do we do? Do I have to use a custom merge flow that also rewrites authorship from "Alice <alice@gmail.com>" to "Alice <alice@gmail.com.fake.unsigned.suffix>"?

2. What happens if your gpg key is compromised or expires? Are all your previous repos now invalid? I can't fork it because it contains a commit authored by you, but with a revoked or expired gpg key?

3. What happens to previous commits if I enable this feature? Can all my unsigned commits no longer be pushed to github? I made a commit in a project at work 5 years ago with my email, but didn't sign it. If that company wants to open source that project on github, do they now have to rewrite history to change the author on my unsigned commits?

4. What does the "squash merge" button on github PRs do for your PRs?

hgomersall 1420 days ago

We've adopted this policy internally. It largely requires using the proper git workflow, rather than (in our case) gitlab's GUI flow, though we make an exception for a simple merge (I think it's verifiable that no code is added).

The check is that commits at the point of merge have a valid signature. Historical commits are part of the history and as such cannot be changed without an additional commit (with a valid signature). Previous unsigned commits are deemed trusted at the point you begin signing and checking.

Squash merge breaks stuff and shouldn't be used. To complete things, the restricted set of operations exposed through the GUI should sign using gitlab's or github's key (or some accepted bot key we've set), with the check happening on the input commits, but AFAICT that's not supported yet.
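
A rough sketch of the kind of check described above, not the actual tooling used. It assumes the input lines come from something like `git log --format='%H %G?' <cutoff>..<merge-head>`, where `%G?` prints a one-letter signature status (`G` good, `B` bad, `N` none, and so on) and `<cutoff>` is the hypothetical commit at which signing began, so older history is never inspected. Accepting only `G` is an assumed policy choice, and the commit IDs below are shortened placeholders.

```python
from typing import Iterable, List

# Assumed policy: only a good signature from a trusted key passes.
ACCEPTED_STATUSES = {"G"}


def commits_failing_policy(log_lines: Iterable[str]) -> List[str]:
    """Return IDs of commits whose signature status is not accepted."""
    failures = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the log output
        commit_id, status = line.split()
        if status not in ACCEPTED_STATUSES:
            failures.append(commit_id)
    return failures


# Placeholder output for the range being merged.
sample_log = [
    "a1b2c3d G",  # good signature
    "d4e5f6a N",  # no signature
    "0789abc B",  # bad signature
]
failing = commits_failing_policy(sample_log)
print(failing)
```

A merge gate would refuse the merge whenever the returned list is non-empty.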

eurasiantiger 1420 days ago

And now we’re on the fast track to adopt a blockchain as a tamper evident mechanism.

jhugo 1420 days ago

git is effectively a blockchain. Trying to use a blockchain for this has many of the same problems as described in GP's comment.

3np 1419 days ago

A blockchain could be useful for proving a commit was not done significantly before or after the claimed timestamp, and (depending on how you do it) non-repudiation.

IMO I don't see how this is desirable enough to be worth it.

Wider adoption of GPG signatures would be much more impactful.

It could make sense to use a blockchain for key distribution/key discovery/revocation, effectively as a replacement of PKI, though.

pilif 1420 days ago

They have something under Settings > SSH and GPG keys where you can enable Vigilant mode.

While that still allows pushing unsigned commits, it will flag them with a warning badge.

I had this on for a while, but unfortunately, as some open source projects tend to rebase commits before pushing them, this was causing warnings to be shown (as the rebase breaks my signature), so I turned it off again so as not to scare people when looking at the commit history of a project and seeing warnings after my contributions were merged in.

stevelacy 1420 days ago

Good to know, I was not aware. The squash/rebase issue is definitely problematic, though a tree of signatures could be appended to each commit. Now... this does break how commits are currently signed.

eurasiantiger 1420 days ago

Would this be a problem if commits were stored in a blockchain? Rebase would effectively fork the chain.

pilif 1420 days ago

A git repo is practically a blockchain. Fixing this will require changing how git treats signatures, but no additional parallel architecture needs to be created.

denismi 1420 days ago

I lost a Yubikey, so I revoked the subkeys on it. Github didn't let me update the existing pubkey, so I had to remove and re-enter it. Now all things signed by the revoked key, despite being before the revoke date, come up as unverified.

I'm not sure if it's a me issue or a PGP issue or a GitHub issue, but it's a pretty broken system.

bonzini 1420 days ago

How would that work for past commits? Would people be forbidden to mirror a project to git just because it contains an unsigned commit of mine from 2007?

szundi 1420 days ago

I like it, although it raises the bar for contributors to join.

It also does not help unless enough repo owners would then accept only signed PRs to their projects.

The fake accounts can be created with gpg signed fake commits too.

irjustin 1420 days ago

How does this solution solve the problem?

You're just adding an extra step that's hardly going to stop someone.

stevelacy 1420 days ago

It would only allow commits signed by me to be pushed under my email. Github uses the email as the "proof" of commit ownership. By only accepting signed commits a user would not be able to push a commit impersonating me.
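
As a sketch only, the proposed rule could look something like this on the receiving side. Everything here is hypothetical: `PushedCommit`, `registered_keys` and the fingerprints are invented for illustration, and GitHub exposes no such setting as far as this thread establishes.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class PushedCommit:
    author_email: str
    # Fingerprint of the key behind a valid signature; None if unsigned or invalid.
    signer_fingerprint: Optional[str]


def violates_reject_unsigned(
    commit: PushedCommit, registered_keys: Dict[str, Set[str]]
) -> bool:
    """True if the author email opted in and the commit is not signed by its key."""
    allowed = registered_keys.get(commit.author_email)
    if allowed is None:
        return False  # this email has not enabled the hypothetical setting
    return commit.signer_fingerprint not in allowed


# Placeholder data: one opted-in email with one registered key.
registered_keys = {"alice@example.com": {"FPR-ALICE"}}

genuine = PushedCommit("alice@example.com", "FPR-ALICE")
impersonated = PushedCommit("alice@example.com", None)
unrelated = PushedCommit("bob@example.com", None)

print(violates_reject_unsigned(genuine, registered_keys))
print(violates_reject_unsigned(impersonated, registered_keys))
print(violates_reject_unsigned(unrelated, registered_keys))
```

Note that a rule this simple would also reject old unsigned commits and rebased copies of genuine ones, which is the difficulty the questions earlier in the thread point at.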

irjustin 1420 days ago

That solves impersonation, but that is not a related problem here.

These repos were not taken over but cloned and made to look like another repo via similar naming.

I think what you're looking for is more like "all accounts must be verified via payment/identity"; then you really know who is making "random clones" and "lookalikes" with malware.

But you've got a whole host of other problems in the process.