by dorianm 3416 days ago
Right after my "remove secrets" post: https://news.ycombinator.com/item?id=13650614

There are just so many of those it's crazy:

    remove .env
    YOURFAVORITEAPI_SECRETKEY
    YOURFAVORITEAPI_PASSWORD
Also replace "remove" with delete/rm/replace/etc.

And replace "YOURFAVORITEAPI" with CircleCI, Travis, Mailchimp, Trello, Stripe, etc, etc.

Also, the companies I contacted consider it the customer's fault and basically don't care.

4 comments

I once pushed my Amazon S3 key to GitHub accidentally. Realized instantly what I'd done, and while in the process of feverishly regenerating a new key, my cell phone rings. It's Amazon telling me I pushed my S3 key to GH.
It happens to us as well.

The interesting thing is that there is also an evil crawler that will automatically launch thousands of Windows VMs to mine bitcoins (that's all they do). Amazon told us that we had leaked our account ID and secret, but they also noticed that the other crawler had launched a lot of VMs, and they refunded us. Yes, we love Amazon.

Lesson learned: never put the account ID and secret in your code. Not only should you not hardcode them, there is no need to even read them from the environment yourself.

Don't do something like `new S3({accountKey: ..., accountSecret: ...})`; instead, do `new S3()` and that's it. Every AWS SDK is smart enough to find the keys in the environment by following a series of steps:

- environment variables

- ~/.aws/credentials

- and when your code is run on ec2, lambda, etc. you should use IAM Roles.

So, in addition to not hardcoding an AWS secret, your code should not even pass the secret to the SDK.
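To make the lookup order above concrete, here is a rough sketch in Python. It is not the SDK's actual code, only an illustration of the same order of steps: `resolve_credentials` is a made-up name, and the IAM role step is left as a comment because it needs the instance metadata service.

```python
import configparser
import os
from pathlib import Path
from typing import Mapping, Optional, Tuple


def resolve_credentials(
    env: Mapping[str, str] = os.environ,
    credentials_file: Path = Path.home() / ".aws" / "credentials",
    profile: str = "default",
) -> Optional[Tuple[str, str]]:
    """Return (access_key_id, secret_access_key), or None if nothing is found.

    Hypothetical helper that mimics the order described above.
    """
    # 1. Environment variables.
    key_id = env.get("AWS_ACCESS_KEY_ID")
    secret = env.get("AWS_SECRET_ACCESS_KEY")
    if key_id and secret:
        return key_id, secret

    # 2. Shared credentials file (INI format); a missing file is skipped.
    parser = configparser.ConfigParser()
    parser.read(credentials_file)
    if parser.has_section(profile):
        section = parser[profile]
        key_id = section.get("aws_access_key_id")
        secret = section.get("aws_secret_access_key")
        if key_id and secret:
            return key_id, secret

    # 3. On EC2, Lambda, etc. a real SDK would fetch temporary credentials
    #    for the attached IAM role. That step is not simulated here.
    return None
```

The point is that application code never touches the secret: it calls the client constructor with no arguments and the lookup happens inside the SDK.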

Consider also enabling CloudTrail and setting up alerts on it.

There is also a way to avoid having ~/.aws/credentials on your machine and use something else that requires MFA. I am not familiar with how this works yet, but we have started to use it.

Whoa, that's actually amazing. Wonder how they got alerted and reacted so fast.
GitHub provides a public firehose for events[0]. So it's possible to hook a process to read from the firehose, look for commit events, and then match file contents against the list of API keys.

[0] - https://developer.github.com/v3/activity/events/

Yikes. So this is where the evil crawlers are sitting.

Reminds me of the water pipeline in Finding Nemo with the crabs above it.

Alexa probably overheard the developer swearing…
It's cheaper for them to give a few engineers a web crawler project that's this specific than it is to refund people. I'm just surprised they don't have an "auto revoke access key if found on interwebz" setting in the AWS account settings actually.
It's not surprising, consider the failure modes:

- a key is made public, and we have to call a user or refund them (for retention purposes)

- a key is made public, and we revoke the key, potentially breaking the customer's builds/deploys and potentially knocking the customer's stuff out (if, for example, a key is disabled during a push to production).

I heard AWS has a crawler for that specifically. Not sure if it's true, but makes sense based on the anecdata.
You pushed your secret key, and they recognised it?

Does that imply that they are not hashing secret keys, or did you also push the account key (allowing for a single auth test on their side)?

It's also possible that they just scrape the Github firehose for common patterns like

  AWS_SECRET_KEY="FOOBAR"
and send a message to the committer's email (since you presumably used a correct/valid email in the git commit).
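A rough sketch of that idea, assuming the scanner already has a commit's diff and author email in hand. The dict layout, the function names, and the regex are all made up for illustration; nobody in this thread knows what AWS actually runs.

```python
import re
from typing import Dict, List

# Matches assignments such as AWS_SECRET_KEY="FOOBAR" or STRIPE_PASSWORD='abc'.
# The list of suspicious words is an assumption, not an exhaustive rule set.
SECRET_ASSIGNMENT = re.compile(
    r"""\b([A-Z0-9_]*(?:SECRET|PASSWORD|TOKEN)[A-Z0-9_]*)\s*=\s*["']([^"']+)["']"""
)


def find_secret_assignments(text: str) -> List[Dict[str, str]]:
    """Return every NAME="value" pair whose name looks like a secret."""
    return [
        {"name": match.group(1), "value": match.group(2)}
        for match in SECRET_ASSIGNMENT.finditer(text)
    ]


def build_notification(commit: Dict[str, str]) -> str:
    """Return a warning addressed to the committer, or "" if nothing matched.

    `commit` is a hypothetical dict with "author_email" and "diff" keys.
    """
    findings = find_secret_assignments(commit["diff"])
    if not findings:
        return ""
    # Only the variable names go into the message, never the secret values.
    names = ", ".join(finding["name"] for finding in findings)
    return f"To: {commit['author_email']}\nPossible secrets committed: {names}"


commit = {
    "author_email": "dev@example.com",
    "diff": '+AWS_SECRET_KEY="FOOBAR"\n+DEBUG = "true"\n',
}
print(build_notification(commit))
```

Matching on the variable name rather than the value means this works without knowing any real keys, which also answers the hashing question: the scanner never needs the stored secret to raise a warning.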
The secret key can probably be used to generate the account key.
It is the customer's fault.

However, it should be pretty easy for them to set up a script to search GitHub for this kind of stuff and automatically invalidate keys.

At my work one of my coworkers accidentally put a secret token in a GitHub issue. Couple hours later he got an email from the sysadmin at the parent company saying his token finding script went off. He probably wouldn't have noticed for a long while if that script wasn't running.
Wouldn't the token-finding script be even more of a risk?

If the token is XYZ and the script is searching https://github.com/search?utf8=%E2%9C%93&q=XYZ&type=Commits&...:

1. It's sharing the token with GitHub.

2. It's embedding the token as query-string parameter in a GET request, which is much more likely to be logged (than sending it as data in a POST request), and more likely to be available to less-privileged/less-trusted staff.

3. If the request is sent to a non-HTTPS endpoint, the query can be MITMd, revealing the token.

I'd be very wary of setting up a token-finding script, it feels like it adds more risk than it saves.

You can scrape the issues without exposing the token. You could probably do it by just subscribing to all of them and parsing emails. No one (especially in security) should be using a third-party search to match sensitive data. It's like searching Google for your social security number.
It was a pattern based script, all the tokens had the same length.
You just search that some token was uploaded by your people, not specifically yours.
Maybe they search for the token's public key and not the token itself. Then, if the public key is found, they download the repo and scan it for the private key.
A group at BigCorp Inc. was sharing a tool they'd written to ThirdParty Ltd. As part of this, they transferred documentation, including how to configure the tool. Including an example. With a real AWS key. For a dev too, so the key had no restrictions.
And this would be a cool feature from github too. A link mentioning "we found something in your code that looks like a secret, please know people will use it."
They do this for all of their own API keys already. They not only notify you but instantly invalidate a key pushed to a public repo.

Annoyingly, there is no way to turn it off even when you explicitly want to share an API key knowingly. But I'm more than fine with needing to "obfuscate" an API key or manage secrets correctly knowing it saves TONS of people.

Why would you ever want to share a valid Github API key publicly?
It's been a while, but IIRC it was a key with no permissions used on a CI server to get around github's API usage limits.

It probably wasn't the best idea, but it was the only "secret" needed in the whole project and I didn't want to maintain a way of managing secrets in a public project for a pointless key.

In the end I did just that, and looking back it was the better choice, but at the time it was annoying.

"... used ... to get around github's API usage limits."

I wonder why they'd want to invalidate that. :)

Continuous development, e.g. Jenkins? (Please don’t do this)
Why not?
Split it to parts and concatenate it, then?

$key = "BAAD" + "F00D" + "CAFE" + "BABE";

> But i'm more than fine with needing to "obfuscate" an API key or manage secrets correctly
GitLab has this: https://docs.gitlab.com/ee/push_rules/push_rules.html#preven... (enterprise edition, admittedly)
1) Search for common pattern where the key can be stored

2) Search if found keys are actual valid keys

3) Expire key, send explanatory email, issue new key

(I think that's what AWS does)
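The three steps above could be sketched roughly like this. It is only a guess at the shape of such a pipeline: `handle_leaks`, `is_valid`, and `revoke_and_notify` are made-up names, and the regex assumes long-term AWS access key IDs are 20 characters starting with "AKIA".

```python
import re
from typing import Callable, Iterable, List

# Assumed pattern: "AKIA" followed by 16 uppercase letters or digits.
ACCESS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def handle_leaks(
    documents: Iterable[str],
    is_valid: Callable[[str], bool],
    revoke_and_notify: Callable[[str], None],
) -> List[str]:
    """Run the three steps over some texts and return the keys that were revoked.

    `is_valid` and `revoke_and_notify` are hypothetical hooks that the
    provider would implement against its own key database and mail system.
    """
    revoked: List[str] = []
    for text in documents:
        # Step 1: search for strings that look like keys.
        for candidate in ACCESS_KEY_ID.findall(text):
            # Step 2: keep only candidates that are real, active keys
            # and have not been handled already.
            if candidate in revoked or not is_valid(candidate):
                continue
            # Step 3: expire the key, email the owner, issue a new key.
            revoke_and_notify(candidate)
            revoked.append(candidate)
    return revoked


# Usage with stand-in hooks; the key below is a documentation-style example value.
active_keys = {"AKIAIOSFODNN7EXAMPLE"}
handled = handle_leaks(
    ['aws_key = "AKIAIOSFODNN7EXAMPLE"'],
    is_valid=lambda key: key in active_keys,
    revoke_and_notify=lambda key: print("revoking", key),
)
```

Only the provider can do step 2 cheaply, since it owns the key database, which is probably why this works better as a provider-side service than as a third-party script.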

Heroku's official Python templates include `.env` in the repo: https://github.com/heroku/python-getting-started https://github.com/heroku/heroku-django-template (Although, to be fair, they do include `.env` in the `.gitignore` file.)
MailChimp has something like this. A few years ago I accidentally committed and pushed an API key, and I got an email from them a few minutes later saying that they had found the key and already invalidated it, so it couldn't cause any damage. Very proactive and smart, especially for an email service which is likely a huge target for abuse around this sort of thing.
Someone should create an app where you select an API service and it gives you a key that's been pushed to a public GitHub repo.