by ibejoeb 1591 days ago
In general, what are the strategies for large public codebases like this to mitigate supply chain attacks or other source-level attacks?

For clarity, I'm hoping to open up discussion about how we're dealing with massive changesets like this, which are difficult to review chiefly because of their breadth.

3 comments

For a purely mechanical change like this, someone could run black against the same revision of Django and verify the changes they see locally match the changes in this PR.
That's true as long as the results are predictable and reproducible. I don't happen to know if Black is, and it's not apparent from the documentation.

Update: Found it:

> How stable is Black’s style?

> Starting in 2022, the formatting output will be stable for the releases made in the same year

https://black.readthedocs.io/en/stable/faq.html

The same version of black, with the same settings, will always produce the same results from the same input code. Definitely reproducible. That FAQ question is about how stable the formatting is from version to version. It is now more stable, which is why Django finally made the move.
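
To make the "run it locally and compare" idea concrete, here is a rough sketch, not anything Django actually uses. It assumes you have two directories on disk: one where you checked out the parent revision and ran the same pinned version of black with the project's settings, and one with the PR's tree checked out. The function names and directory layout are made up for illustration.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each .py file's path (relative to root) to the SHA-256 of its bytes."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*.py"))
    }


def differing_files(local_root, pr_root):
    """Return sorted relative paths that are missing from one tree or differ in content."""
    local = tree_digests(local_root)
    pr = tree_digests(pr_root)
    return sorted(p for p in local.keys() | pr.keys() if local.get(p) != pr.get(p))
```

An empty list means the PR's Python files are byte-for-byte what your own run produced. Anything the function returns, plus any non-.py files the PR touches, still needs a normal human review.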
Interesting! Can you help me imagine attack scenarios? All I can think of is:

- The changeset is authored by a trusted committer but the committer's tools have been locally compromised.

- The public tool itself (e.g. black) has been compromised to automatically create vulnerabilities in difficult-to-review bits of code (a Ken Thompson hack).

As a reformatting tool should only change the formatting, you could check that the Abstract Syntax Tree is unchanged. The ast module in the standard library gives access to the AST [1].

[1]: https://docs.python.org/3/library/ast.html
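
A minimal sketch of that AST check, using only the standard library. The helper name `same_ast` is made up for illustration. `ast.dump` leaves out line and column positions by default, so whitespace, line wrapping, redundant parentheses, and quote style do not affect the comparison.

```python
import ast


def same_ast(before_source, after_source):
    """Return True if both sources parse to the same AST, ignoring positions."""
    before = ast.dump(ast.parse(before_source))
    after = ast.dump(ast.parse(after_source))
    return before == after


# Formatting-only change: same tree.
print(same_ast("x = {  'a':1 }\n", 'x = {"a": 1}\n'))

# Behavioural change hidden in a reformat: different tree.
print(same_ast("limit = 10\n", "limit = 100\n"))
```

Two caveats. Comments are not part of the AST, so this says nothing about them. Docstrings are ordinary string constants in the tree, so a formatter that adjusts docstring whitespace would be flagged by a strict comparison like this one even though the change is harmless.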