by derefr 1281 days ago

> These "leaked" secrets GitHub forwards might be dissidents getting access without being tracked.

"Leaked" here means "made public", i.e. "published such that literally anyone can use them", for example when burned into a commit of a public repo. Even for a dissident, publishing an API key or other credential where literally anyone can find and use it is almost assuredly a mistake. External scrapers can find it there too, so the key will inevitably be picked up and fed into a botnet to abuse, at which point the ops staff at the service will notice the abuse and revoke the key, thus "burning" it (making it useless) from the dissident's perspective.

If you store a secret on Github somewhere that only you and people you trust have access to, rather than everyone, then this is not considered a "leak", and so Github does not detect it as a "leaked secret." For example, commit data of private repos is not scanned for secrets (if it were, GitOps as a concept would be impossible!); nor is a repo's formal Actions Secrets store (part of a repo's configuration readable only by triggered Github Actions CI jobs).

Github's own secret-scanning here is trying to catch the cases where a user has done something stupid by accident. Whether or not they reported secrets to third parties, they'd still be doing leaked-secret scanning for their own Github API keys, to ensure that people aren't accidentally configuring Github Actions by burning their Github Actions CI API key into the workflow itself. If they find such keys, they revoke them.

The point of Github's secret-scanning partner program is that, because Github is doing this leaked-secret scanning for their own purposes anyway, you (the partner) can sign up to be told when API keys of yours are accidentally made public as well.

> That makes no sense, then they don't need GitHubs help.

Ignoring for a moment that Github is a website, and so anyone can just crawl it—

Did you know? Github pushes the commit data of all public repos to BigQuery as a public research dataset: https://codelabs.developers.google.com/codelabs/bigquery-git.... Literally anyone can do their own "secret scanning" with a simple BigQuery query. It costs about $500 to run such a query, because the Github dataset is pretty large. It's not a price most SMEs would pay. But it's definitely a price attackers could be willing to pay. It's a lot cheaper than running your own web-spider infrastructure!
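To show how little machinery such third-party "secret scanning" needs, here's a minimal Python sketch. It assumes you already have rows of `(repo, path, content)` from some export of public commit data; the row shape, the `exsvc_` key format, and the names `PATTERNS`, `Hit` and `scan_rows` are all made up for illustration and are not taken from the BigQuery dataset's actual schema.

```python
import re
from typing import Iterable, Iterator, NamedTuple


class Hit(NamedTuple):
    repo: str
    path: str
    kind: str
    token: str


# Hypothetical credential formats to look for (assumption: a fixed
# prefix followed by 24 alphanumeric characters).
PATTERNS = {
    "example_service_key": re.compile(r"\bexsvc_[A-Za-z0-9]{24}\b"),
}


def scan_rows(rows: Iterable[tuple[str, str, str]]) -> Iterator[Hit]:
    """Yield one Hit per credential-looking string found in any row."""
    for repo, path, content in rows:
        for kind, pattern in PATTERNS.items():
            for match in pattern.finditer(content):
                yield Hit(repo, path, kind, match.group(0))
```

An attacker's version would differ only in scale: more patterns, and rows streamed from a bulk export rather than a small list.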

The difference with Github's own secret scanning is that it happens synchronously, on push of commits; whereas the ETL of commit data to BigQuery et al happens asynchronously, some time after commits happen. Tencent, like every other secret-scanning partner, depends on Github to stay ahead of any third-party attackers trying to scrape leaked credentials for use in botnets et al.

Also, FYI, you yourself can sign up to be a Github secret-scanning partner. You just need 1. a regex that uniquely identifies your secrets, so that Github can recognize them on push, and 2. a webhook URL to report them to. (https://docs.github.com/en/developers/overview/secret-scanni...)
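As a rough sketch of what "a regex that uniquely identifies your secrets" can look like in practice: one option is to issue keys with a fixed vendor prefix plus a checksum suffix, so a match is very unlikely to be a false positive. Everything here (the `acme_live_` prefix, the CRC32 suffix, the function names) is a hypothetical design of my own, not something Github's docs prescribe.

```python
import re
import secrets
import zlib

PREFIX = "acme_live_"  # hypothetical vendor prefix

# Prefix, 32 hex chars of random body, 8 hex chars of CRC32 checksum.
KEY_RE = re.compile(r"\bacme_live_[0-9a-f]{32}[0-9a-f]{8}\b")


def _checksum(body: str) -> str:
    """CRC32 of the body, as 8 lowercase hex characters."""
    return format(zlib.crc32(body.encode("ascii")), "08x")


def make_key() -> str:
    """Issue a new key in the recognizable format."""
    body = secrets.token_hex(16)  # 16 random bytes -> 32 hex chars
    return f"{PREFIX}{body}{_checksum(body)}"


def is_valid_key(candidate: str) -> bool:
    """True if the candidate has the right shape and a matching checksum."""
    if not KEY_RE.fullmatch(candidate):
        return False
    body, checksum = candidate[len(PREFIX):-8], candidate[-8:]
    return _checksum(body) == checksum


def find_keys(text: str) -> list[str]:
    """Return every well-formed, checksum-valid key embedded in text."""
    return [m for m in KEY_RE.findall(text) if is_valid_key(m)]
```

The regex is the part you would hand over for matching; the checksum check is something your own webhook handler could run before bothering to look the key up and revoke it.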

And by the way, this isn't a hypothetical nice-to-have. I run an API SaaS — and not one that's even very large, in relative terms. But my own customers' accidentally-leaked secrets have been scraped from their Github repos and used by botnets already! Signing up as a Github secret-scanning partner is on my to-do list.