by mananaysiempre 1837 days ago
Trust extended by the root program maintainers (who serve as a proxy for you, the user, and should make decisions in your interests) to the CA operators (who all too often make decisions that are good, in a prisoner’s dilemma sort of way, for the certificate holders but are terrible for you). This is meant to correct the broken incentive structure of the CA model where the people who pay the CAs are not the people who consume the results of the CA’s work.

(How much the root program maintainers can be relied on to represent your interests varies, but all other alternatives so far seem even more terrible.)

Nobody is seriously suggesting that revoking every LE certificate is the proportionate response here. But in the background for all of this is the fact that CAs historically have resisted revocation as a remedy for practically every misissuance or security incident, no matter their severity. They also frequently talked as though revocation was not only not considered, but did not even come to mind when the incident was recognized.

Arguments from lack of security impact or the number of “affected customers” (who are not, in fact, the people who the CAs exist to provide a service to, to reiterate the incentive problem) were used to argue against almost everything, including eliminating shady sub-CAs, reducing astronomical maximum validity periods, and prohibiting broken security schemes, even after the corresponding decisions were confirmed by a vote of the CA/Browser Forum and publicly announced by the programs. (In fact, there’s an open Mozilla bug right now where Google’s Ryan Sleevi is lambasting Google Trust Services for cross-signing a historical SHA-1 CA using a currently trusted root.)

This is why Mozilla’s policy is absolutely merciless regarding revocation and incident reports. In an issue like this, where there is, in fact, no actual security impact, it is expected that the CA will suck it up and file an additional report saying that yes, on balance we aren’t going to revoke, and here’s our reasoning (because again, bare statements or restatements of conclusions masquerading as justification are the norm for CA communications, as can be seen from the current compliance bugs on the Mozilla tracker). For every decision not to revoke, there needs to exist at least perceived harm to the CA(’s reputation with the root program), because history shows that otherwise nobody revokes anything.

(Browse the relevant category of the Mozilla tracker, mozilla.dev.security.policy, or the CA/B Forum mailing lists if you want a chilling read. This is what finally turned me off DNSSEC+DANE, because if people for whom PKI and key management is literally the only job are that bad at it, I don’t even want to imagine how bad domain resellers are going to be, and unlike CAs you can’t just toss your DNS delegation out and get a new one in a couple of minutes.)

Thus the revocation part of this issue is posturing, but it’s posturing that both sides recognize and have decided to accept as the only way of ensuring the WebPKI remains somewhat functional.

The operations part that’s going to be discussed in the linked bug itself (and not the so far nonexistent no-revocation report), on the other hand, has immediate importance to LE’s operations, because it means that there were no additional controls beyond the issuing software itself that were enforcing LE’s declared policies, and that’s just too brittle a design to work with. The policies as declared in the Certification Practice Statement are (intentionally meant to be) rigid and hard to change, whereas people will routinely reconfigure or even modify the issuing software, and somebody, sometime, will make a fumble in the certificate template or check the wrong verification box or even mistakenly enter a command to issue a sub-CA from prod. The fumble itself is not the problem; failing to prevent it from causing a misissuance is.

The accepted remedy is to run independent linters (plural, because bespoke tooling integrating them into the pipeline has mistakes as well) configured to enforce both the common Baseline Requirements and the particular CA’s CPS and left untouched until there’s a (carefully reviewed) change to those. It seems that LE’s setup failed in this regard, because while of course Boulder can and will have the occasional off-by-one bug, it’s highly unlikely that ZLint or Cablint or whatever has the same one, so somebody configured them wrong.
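
To make the shape of that control concrete, here is a rough sketch of a pre-issuance gate, not anything taken from Boulder, ZLint or Cablint. Everything in it is made up for illustration: the `TbsCert` type, the linter functions, the `gate` and `issue` helpers, and the 90-day limit standing in for a CPS rule. The one fact it leans on is that RFC 5280 treats notBefore and notAfter as inclusive, which is exactly where an off-by-one-second bug can hide.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, List, NamedTuple


class TbsCert(NamedTuple):
    """Hypothetical stand-in for a to-be-signed certificate."""
    not_before: datetime
    not_after: datetime
    is_ca: bool


# Assumed CPS limit, for illustration only.
MAX_VALIDITY = timedelta(days=90)

Linter = Callable[[TbsCert], List[str]]


def lint_validity(cert: TbsCert) -> List[str]:
    # Both endpoints are inclusive (RFC 5280), so the validity period is
    # one second longer than the plain difference of the two timestamps.
    period = cert.not_after - cert.not_before + timedelta(seconds=1)
    if period > MAX_VALIDITY:
        return [f"validity {period} exceeds {MAX_VALIDITY}"]
    return []


def lint_no_ca(cert: TbsCert) -> List[str]:
    # The end-entity pipeline should never be able to mint a CA certificate.
    if cert.is_ca:
        return ["CA certificate requested from end-entity pipeline"]
    return []


def gate(cert: TbsCert, linters: List[Linter]) -> List[str]:
    # Run every linter and collect all findings; one finding blocks issuance.
    findings: List[str] = []
    for lint in linters:
        findings.extend(lint(cert))
    return findings


def issue(cert: TbsCert, linters: List[Linter]) -> str:
    findings = gate(cert, linters)
    if findings:
        raise ValueError("refusing to sign: " + "; ".join(findings))
    return "signed"  # placeholder for the real signing step


if __name__ == "__main__":
    start = datetime(2024, 1, 1, tzinfo=timezone.utc)
    # An issuer that computes notAfter = notBefore + 90 days is one second over.
    off_by_one = TbsCert(start, start + MAX_VALIDITY, is_ca=False)
    print(gate(off_by_one, [lint_validity, lint_no_ca]))
```

The point of the design is that the linters live outside the issuing code path and encode the policy directly, so a change to the issuer’s template or date arithmetic cannot silently change what gets signed. In a real deployment each linter would be an independent tool maintained by someone else rather than a function in the same file.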