| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by chuckadams 154 days ago
	Breaking this down, several of AWS's core repos like the JS SDK use an allowlist of which contributor ids can run workflow actions in their PRs. The list was a regex, contained several short ids, and wasn't anchored with ^$, so if it allowed user 12345, then any userid containing 12345 could run their own actions on the PR, including one that exfiltrated access tokens. So they spammed GH with user creation requests, got an id that matched, and they were in like Flynn. Said tokens didn't have admin access, but had enough privileges to invite other users to become full admins. Not sure if they were rotated, but github tokens are usually long-lived, like up to a year. Hey, isn't AWS the one always lecturing us to use temporary credentials? To be fair, AWS did more than just fix the regex, they introduced an "approve workflow run" UI unto the PR process that I think GH is also using now (not sure about that).

5 comments

cyberax 154 days ago

> Said tokens didn't have admin access, but had enough privileges to invite other users to become full admins.

Ah... Github permissions. What fun.

Github actually has a way to federate with AWS for short-lived credentials, but then it screws everything up by completely half-assing the ghcr.io implementation. It's only available using the old deprecated classic access tokens.

link

catlifeonmars 154 days ago

Right? How is it that you still need a PAT or a custom app installation to access a registry?

link

fowl2 152 days ago

Yeah wow! Even most "trusted" contributors shouldn't have this level of access. Is there really no way of scoping tokens with more granularity?

link

cyberax 150 days ago

Nope. The best we could do was to create a separate service that creates Docker tokens (using "docker login") and exposes a secure API.

Obviously, GitHub needs to just fix this nonsense. But I interviewed a couple of "senior" engineers from GitHub, and I have zero hope of that happening soon.

link

bink 154 days ago

As a security dude I spend way too much of my time fixing missing anchors or unescaped wildcards in regex. The good news is that it's trivial to detect with static analysis tooling. The bad news is that broken regex is often used for security checks.

link

SkiFire13 154 days ago

Sometimes I wish regexes were full matches by default and required prefixing and postfixing with `.*` to get the current behaviour

link

chuckadams 153 days ago

Java's Pattern.match() method works that way. Python has two separate methods: re.match auto-anchors, re.search does not.

link

ruined 154 days ago

a match isn't boolean, it's substring. the original (and more common) use-cases would become excessively verbose

link

edoceo 154 days ago

https://xkcd.com/1171/

link

TacticalCoder 154 days ago

> The list was a regex ...

Regexpes for security allow lists: what could possibly every go wrong uh!?

link

bflesch 154 days ago

At least the vuln was old enough so that they couldn't blame AI for it, otherwise the article would read different ;)

link

chuckadams 154 days ago

Ironically (?) an AI code review would very likely have noticed the overly-permissive regex.

link

SkiFire13 154 days ago

This doesn't really matter as long as they also find 10x more nits that create noise for the human reviewer.

link

catlifeonmars 154 days ago

This is a good point. On my GH I’ve disabled Copilot reviews because the vast majority of them are false positives, but I’m reconsidering that position as it might still be worth it to wade through the spurious reviews just to catch some real issues.

link

maxbond 153 days ago

I filter for false positives with language like this:

    For each bug you find, write a failing test. Run the test to make sure it fails. If it passes, try 1-3 times to fix the test. If you can't get it to work, delete the test and move on to the next bug.

It's not perfect, you still get some non-bugs where the test fails because it's premises are wrong. Eg, recently I tossed out some tests that were asserting they could index a list at `foo.len()` instead of `foo.len() - 1`. But I've found a bunch of bugs this way too.

link

catlifeonmars 150 days ago

I take it this wasn’t Lua then?

> I tossed out some tests that were asserting they could index a list at `foo.len()` instead of `foo.len() - 1`.

link

catlifeonmars 153 days ago

Nice, I’ll give this a try

link

whatever1 154 days ago

Another success story for Regexes! Let's keep using this cryptic mess!

link

pxc 154 days ago

I met regexes when I was 13, I think. I spent a little time reading the Java API docs on the language's regex implementation and played with a couple of regex testing websites during an introductory programming class at that age. I've used them for the rest of my life without any difficulty. Strict (formal) regexes are extremely simple, and even when using crazy implementations that allow all kinds of backreferences and conditionals, 99.999% of regexes in the wild are extremely simple as well. And that's true in the example from TFA! There's nothing tricky or cryptic about this regex.

That said, what this regex wanted to be was obviously just a list. AWS should offer simpler abstractions (like lists) where they make sense.

link

catlifeonmars 154 days ago

> That said, what this regex wanted to be was obviously just a list. AWS should offer simpler abstractions (like lists) where they make sense.

Agree. I would understand if there was some obvious advantage here, but it doesn’t really seem like there is a dimension here where regex has an advantage over a list. It’s (1) harder to implement, (2) harder to review, (3) much harder to test comprehensively, (4) harder for users to use (correctly/safely).

link

twoodfin 154 days ago

Presumably the advantage was ease and speed of developing the filtering feature.

Wrong tradeoff, to be sure.

link