Hacker News
by lostmyoldone 2605 days ago
Re GitHub keeping unreachable data: if I understand it right, isn't that GitHub painting a giant target on their back? Wouldn't that imply every secret accidentally committed and then 'deleted' is still accessible, when one would expect it not to be? It's one thing to have your source code in the wild, but pairing it up with thought-to-be-deleted secrets would be an absolute disaster.

Certainly one should not ever keep using a secret once it has escaped into a Git repo, but I'm sure it happens quite frequently.

4 comments

> Wouldn't that imply every secret accidentally committed and then 'deleted' is still accessible

This should be a moot point because anyone (in IT) should realize that an accidentally committed secret is now 100% public for all eternity and needs to be rendered irrelevant to restore secure operations.

And a hundred times so for any public repos. There are bots feeding on the GitHub firehose, scavenging for accidentally committed credentials.

A few years back (2015 or so) the average time from push-to-repo to AWS account compromise was 6 minutes. Surely that time has only gone down, and the number of different credentials identified has gone up.
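To give a sense of how little work such a bot needs to do, here is a minimal sketch of a scanner for one credential type. It relies on the fact that long-lived AWS access key IDs begin with `AKIA` followed by 16 uppercase letters or digits. The function name and the sample diff are made up for illustration (the key shown is the placeholder from AWS's own documentation), and a real scraper would presumably match many more credential formats.

```python
import re

# Long-lived AWS access key IDs start with "AKIA" followed by 16 uppercase
# letters or digits. Other credential types would need their own patterns.
AWS_KEY_ID = re.compile(r"\b(AKIA[0-9A-Z]{16})\b")


def find_aws_key_ids(text):
    """Return distinct candidate access key IDs, in order of first appearance.

    Hypothetical helper for illustration; not taken from any real scanner.
    """
    found = []
    for match in AWS_KEY_ID.finditer(text):
        key_id = match.group(1)
        if key_id not in found:
            found.append(key_id)
    return found


# Illustrative diff text using the placeholder key from AWS documentation.
sample_diff = """\
+aws_access_key_id = AKIAIOSFODNN7EXAMPLE
+aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
"""

print(find_aws_key_ids(sample_diff))
```

The same function could just as well run in a pre-commit hook, which is the defensive use of the idea.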

> the average time from push-to-repo to AWS account compromise was 6 minutes.

Wow, I didn't realize it had become so efficient, but I shouldn't be surprised. I never really understood the value in hosting non-public software in the public, and if it's open source, it shouldn't be getting anywhere near secrets that can be used to extract money from its developers.

I remember thinking, back when it became trendy for people to upload their personal dotfiles to GitHub, that it would be a source of endless suffering. Who knows what information you're leaking in your ".profile" or ".bashrc"? Is that risk justified by the dubious benefit of storing your dotfiles on the internet for everyone to see, forever?

I accidentally pushed an AWS credential a month or two ago. Within about a minute and a half AWS had disabled the IAM user and automatically emailed me (as well as my entire org, how embarrassing!). When we went through the access logs, it looked like it had taken only a minute and a half longer for some other, presumably malicious, system to attempt to access my compromised user. Probably between 2 and 3 minutes total. I'm not a huge Amazon fan, but props to AWS for saving my butt.
Why have credentials anywhere outside of the .aws directory in your home directory? When developing locally all of the SDKs will read them from there and when deploying to AWS, the SDK will get them from the attached role.
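As a rough illustration of why nothing secret needs to live in the repository: the shared credentials file under `~/.aws` is an INI file with one section per profile, so application code only ever names a profile. The sketch below parses that format with the standard library. It is not how the AWS SDKs resolve credentials internally (they consult several sources), and `load_profile` and the sample text are hypothetical names for illustration only.

```python
import configparser
from pathlib import Path

# Conventional location of the shared credentials file.
DEFAULT_CREDENTIALS_PATH = Path.home() / ".aws" / "credentials"


def load_profile(credentials_text, profile="default"):
    """Return (access_key_id, secret_access_key) for one profile.

    Hypothetical helper: takes the file contents as a string so it can be
    exercised without touching a real credentials file.
    """
    # Interpolation is disabled so a '%' in a secret is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(credentials_text)
    if not parser.has_section(profile):
        raise KeyError(f"profile {profile!r} not found")
    section = parser[profile]
    return section["aws_access_key_id"], section["aws_secret_access_key"]


# Placeholder values from AWS documentation, not real credentials.
SAMPLE = """\
[default]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
"""

key_id, secret = load_profile(SAMPLE)
print(key_id)
```

On a developer machine one would pass `DEFAULT_CREDENTIALS_PATH.read_text()` instead of `SAMPLE`; either way the repository itself contains no key material.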
I understand what the best practices are. It was a total mistake: I never even intended to push what I did to GitHub.
> A few years back (2015 or so) the average time from push-to-repo to AWS account compromise was 6 minutes. Surely that time has only gone down, and the number of different credentials identified has gone up.

I don't doubt that for a second and I'd like to use that as a quote. I'd like to be prepared if someone doubts it, so: Do you have a primary source for this?

This paper may be relevant to your interests: https://blog.acolyer.org/2019/04/08/how-bad-can-it-git-chara...
I'll need to find the talk I lifted it from. Not easy... but looks like downthread a sibling comment gives a relatively decent update about the current speed of compromise.
Answering myself: I think it was a BSides London talk. (Quite likely from 2017.) After doing a search, I don't think it was recorded.

Hence, I can't provide a primary source. Sorry.

I thought that AWS nowadays is also feeding at the firehose and auto-disabling any of its keys it could find in a commit?
Security isn't rendered in absolutes. We have to assume some sheepish new employee somewhere is scared of approaching management about a mistake they made committing a secret, so they reverse the commit and pretend nothing ever happened.

We have to try and mitigate damage from lapses in communication and protocol like that.

Math and cryptography don't care about a sheepish new employee (thankfully). The fact that leaked secrets will cause trouble is not mitigated by git forgetting a deleted commit. It is only mitigated by revoking that secret and creating a new one and not leaking it. So if a sheepish new employee fails to revoke them, why blame git or any other system? We have contracts, insurance and then criminal code for people who fail to follow protocols.
> So if a sheepish new employee fails to revoke them, why blame git or any other system? We have contracts, insurance and then criminal code for people who fail to follow protocols.

Because you won't know that a protocol isn't being followed. Your contracts, insurance, and criminal code won't cause you to realize that an employee caused an infosec incident if they don't tell you (and neither will your math and cryptography). And the more you threaten use of the criminal code, the less likely people are to admit that they made a mistake.

You can either build defense in depth (e.g., regular secret rotation, policies on use of GitHub in the first place or better yet automation that only pushes publicly after internal review, DLP via a corporate MITM, segregating your open source dev from your secret dev, etc.) or you can let your single defense get breached and have no idea.

The criminal code? I doubt there's anything in there that criminalizes a failure to follow an employer's secret revocation policies.
No, I wasn't referring specifically to this case. Generally, if people don't "follow the protocol", we have criminal code. If machines don't follow protocol, they end up with wrongly decrypted garbage data. It was to highlight the point that we have different measures to deal with people than we have with computer security, because math cannot prevent people from deviating from their protocols.
No need for any git-blame, the blame lies in GitHub not making this feature optional and more known. So well known that a noobish employee is aware of it, so that they do not feel like they are "safe" and no longer need to alert management about their error.

Contracts, insurance and criminal code are responsive measures, not preventative measures. Security is preventative, not responsive.

> the blame lies in GitHub not making this feature optional and more known.

Which feature are you meaning should be optional and more known?

In either case, the secret is already out whether the user wants to admit to it or not
But in one case, damage is mitigated because the sys admins didn't assume everyone is infallible and strictly adheres to protocol.
The correct way to deal with fallibility in this situation is to make it feasible to change secrets when they leak, not pretend they weren't leaked.
That doesn't prevent someone from not following protocol.
Is it mitigated? Once it's leaked you can't force everyone who may have captured it to delete it. So GitHub deleting it doesn't solve the problem.
The definition of mitigation is to make something less severe. Yes, GitHub making this policy as clear as possible and allowing controls to toggle it per-repository or per-account mitigates the problem.
I agree with the general statement about security absolutism (it's often very dumb and irrational), but in this case in particular, most keys are swept from GitHub within seconds of being pushed, so the additional harm of not pruning those commits is very low. Data loss concerns are probably a much larger source of harm to weigh against it.
Well now I would say that I'm not interested in most keys, and I am interested in figuring out how to mitigate damage from the rest of them. You only need one key to get inside.

99% coverage is not good enough from a security standpoint, not when we can achieve 100%.

Simply, this functionality should be transparent and toggleable.

deleted
I think you're confused about what absolutism means. Just because I want 100% coverage when achievable does not mean I am being absolutist.
Yes, and this is why GitHub handicapped their search so much.

But at the end of the day, any secret you post publicly is compromised.

That's a very poor reason to not have good search functionality, considering that the uses far outweigh the potential risk. I highly doubt that's the case.
I'm 90% sure it is.

GitHub had great search, then they took it down when they found people were scraping credentials with it, then they had bad search. I'm connecting the dots.

Thankfully the guys over at Sourcegraph feel differently about how searchable projects should be.
You can permanently delete data from GitHub, but doing so requires a bit of work and a message to customer service: https://help.github.com/en/articles/removing-sensitive-data-...
If you ask nicely they'll run a one-off "gc expire" for you.

It also requires an attacker to know at least a partial SHA-1 anyway. It's infeasible to start brute-forcing that without being banned for DDoSing them, and if you know what the SHA-1 is you probably had access to the data already.
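A back-of-the-envelope estimate of that brute-force cost, as a sketch under stated assumptions: a 7-character hex prefix is taken as the shortest usable guess (this matches git's usual abbreviation length, but whether the web UI accepts it is an assumption), guesses are uniformly random, and the request rate is a made-up figure. The function names are hypothetical.

```python
def prefix_space(hex_chars):
    """Number of distinct hex prefixes of the given length."""
    return 16 ** hex_chars


def expected_guesses(hex_chars, hidden_objects=1):
    """Mean number of uniformly random guesses before one matches any hidden object.

    Rough model: each guess hits with probability hidden_objects / prefix_space,
    so the mean of the geometric distribution is the reciprocal of that.
    """
    return prefix_space(hex_chars) / hidden_objects


def days_at_rate(guesses, requests_per_second):
    """Wall-clock days needed to issue that many requests at a steady rate."""
    return guesses / requests_per_second / 86_400


space = prefix_space(7)  # 16**7 == 268_435_456
# ASSUMPTION: 10 requests per second is an arbitrary illustrative rate.
print(space, days_at_rate(expected_guesses(7), requests_per_second=10))
```

Even with these generous assumptions the attacker needs hundreds of millions of requests per hidden commit, which is the kind of traffic that gets noticed and blocked.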

But yeah. It definitely creates security caveats peculiar to git, e.g. a hostile actor guessing that a force push in an IRC commit announcement clobbered secret data, and then accessing the old commit in the web UI.

This is precisely why secret rotation mechanisms are essential. If you are regularly rotating your secrets, your window of vulnerability for an accidentally leaked secret reduces to the rotation window. With good automation, and in the context of secrets which don't need to be remembered or input by a human, your secrets should be rotating nearly constantly. Additionally, automation greatly reduces the risk of human intervention, which reduces the risk of a human writing secrets to files by hand, which reduces the risk of those secrets being committed to version control in the first place.

Of course, automatic secret rotation is hard. Vault is a great help, but it can't be grafted onto everything. Good DevSecOps engineers are worth their weight in gold.
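The "window of vulnerability reduces to the rotation window" argument can be made concrete with a small sketch. Everything here is hypothetical and for illustration: `RotatingSecret` and `exposure_window` are invented names, not Vault's API, and the model assumes the old value is actually revoked the moment it reaches its maximum age.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class RotatingSecret:
    """A secret value with a fixed maximum lifetime (illustrative model only)."""
    value: str
    issued_at: datetime
    max_age: timedelta

    def expired(self, now):
        """True once the secret has reached its maximum age."""
        return now - self.issued_at >= self.max_age

    def rotate(self, now):
        """Issue a fresh random value with the same lifetime policy."""
        return RotatingSecret(secrets.token_urlsafe(32), now, self.max_age)


def exposure_window(secret, leaked_at):
    """How long a value leaked at leaked_at stays usable.

    ASSUMPTION: the old value is revoked exactly when it expires.
    """
    remaining = secret.issued_at + secret.max_age - leaked_at
    return max(remaining, timedelta(0))


issued = datetime(2019, 1, 1, 12, 0)
token = RotatingSecret("initial-value", issued, max_age=timedelta(hours=1))
leak_time = issued + timedelta(minutes=45)
print(exposure_window(token, leak_time))
```

The shorter `max_age` is, the smaller the worst case returned by `exposure_window`, regardless of whether anyone notices or reports the leak.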