| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by fy20 1582 days ago

Is there a better way to catch errors like this?

Looking through the PR it looks like this file was accidentally changed, I assume with a project wide search and replace.

I could easily imagine myself missing this when reviewing the PR "oh it's just changing a whole bunch of specs, go ahead".

3 comments

mdaniel 1582 days ago

> I could easily imagine myself missing this when reviewing the PR "oh it's just changing a whole bunch of specs, go ahead".

That's one of the major benefits of the "tree view" in MRs, because one can collapse the "spec" folder, collapse the "ee/spec" folder, and it leaves "db" and "lib/gitlab/auth" visible which should for sure set off mental alarm bells: https://docs.gitlab.com/ee/user/project/merge_requests/chang...

link

NavinF 1582 days ago

An integration test that creates dummy accounts using every method including SSO and then attempts to bruteforce the password should find “12345678” within an hour.

I think a test like this would also have found the dropbox and macos bugs that let you login to any account by using an empty password:

https://techcrunch.com/2011/06/20/dropbox-security-bug-made-...

https://arstechnica.com/information-technology/2017/11/macos...

Edit: Oh, the password was "123qweQWE!@#000000000". Technically doable with an efficient password cracker that favors common patterns. zxcvbn’s entropy estimate says it will take 10^10.5 guesses. That’s 1 week at 50k/s. That’s a hell of an integration test for most software.

https://lowe.github.io/tryzxcvbn/

link

dewey 1582 days ago

> then attempts to bruteforce the password should find “12345678” within an hour

But only if there's no rate limiting or increasing timeouts for wrong passwords which in most cases exists.

link

NavinF 1582 days ago

Right, but you can bypass that in a testing environment.

link

dewey 1582 days ago

Which then opens you up to exactly that class of bugs that also caused this issue. Having test specific code and feature flags and then testing a tweaked version isn't really covering all the cases then.

Just like in this case where a hardcoded password was set to maybe log in through a test based on the naming "test_default".

link

theptip 1582 days ago

Posted some of this in a sibling comment, but there are a few ways you can address this (not all viable for Gitlab).

First of all, in code review, having merge requests touching security-sensitive code (e.g. /lib/gitlab/auth) require review by a security engineer using CODEOWNERS is the obvious process gap.

If you're paranoid, have security engineers review the diffs on all releases before they are shipped (in practice the security engineer would filter the `git diff` down to just the files you care about like `/lib/gitlab/auth/`). This is expensive, but probably not as expensive as "whoops, our Chinese subsidiary completely broke our auth for 3 months". If you have a good enough bug bounty program then I'd expect your pentesters to spot this in less than 3 months, as reading git commits to auth code is a good way to spot potential bugs as they are committed. This is hard if you're doing continuous deployment (I don't know if GL do this) but since they have numbered releases, I'm guessing (hope?) they don't run gitlab.com off the tip of master. If you're doing Continuous Deployment you need to be damn sure that your CI/CD pipeline is rock solid, and you have the right reviewers and controls on the sensitive bits of the codebase.

These are processes; you can also make this kind of error less likely at the architectural level. A general approach for larger applications is to split Auth out into its own service with its own repository, release cycle, restricted set of contributors, etc; while microservices are probably somewhere between Peak of Inflated Expectations and Trough of Disillusionment, splitting out services that require a strong security boundary is a good use-case. You can even use an OSS off-the-shelf system like https://www.keycloak.org/ to avoid having to build your own LDAP/SAML/JWT authority. (Or outsource this to Okta, though that seems a bit more dubious these days).

This is harder for GL because they have the FOSS monorepo and splitting out auth code would be awkward. But Gitlab currently deploys a bunch of components from the monorepo (https://docs.gitlab.com/ee/development/architecture.html) and if there was a separate auth component, a paranoid security team could reasonably deploy their auth service from a fork that they update in a more controlled fashion, instead of just auto-deploying all changes to security-sensitive code as appears to have been the strategy here (if security engineers actually looked at this change and OK'd it, there are bigger problems).

link

bloomburger 1581 days ago

Is the vulnerability in the code change glaringly obvious to you?

Looks like two security engineers actually took a look at it: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/76318#...

link

theptip 1581 days ago

It’s easy to spot in hindsight. I can see how it slipped through if they didn’t actually closely read the changes in each file, it’s easy to let your guard down with many-file changesets like this. (That’s a cognitive bias that should be kept in mind when doing security review, particularly if that’s your only control in place to prevent issues like this).

Fundamentally, “fix tests” changesets like this should not touch non-test code and anything touching /var/lib/gitlab would be a big code smell to me if I was reviewing.

The advantage of the code owners approach is it highlights that there were changes to the prod auth code (shows the rule that triggers in the MR I believe), which gives you an additional chance to spot issues to focus on. Seems they also have bots running checks and so they could have the bot ask for an extra review if …/auth/ was touched. Test helpers shouldn’t live next to prod code, for obvious reasons, so the “prod auth code owner” rule being hit would have been an alarm bell on this MR.

I do think it’s concerning that security review was given here, so I’m keen to see the post-mortem and see what other gaps they identify.

link