by matthewdgreen 1204 days ago
I am currently reading through all of the legislation to try to figure out what the latest version of this actually requires from providers. Early versions talked about scanning for "grooming behavior" in textual conversations, but then also required that there would be no mass-scanning of communications: this seems obviously contradictory. What's frustrating about these laws is how vague everything is: I wish there were more high-quality summaries of the current legislation's terms. (If they exist, please post them here.)
2 comments

The legislation [1] is amazingly vague about the impact on end-to-end encrypted systems. In fact the string "encryption" appears twice, and only once in the body of the text (in Paragraph 26 on page 27.) This paragraph basically says "it's ok to use end-to-end encryption" but does not actually stipulate an exemption from the scanning requirements. Presumably this means you would need to somehow build an effective scanner into your end-to-end encrypted system. The legislation gives no other guidance about how to do this or how secure it will be, and offers not even a mild discussion of the tradeoffs.

This is alarming because the bill also makes clear that the goal is to detect not only known and unknown CSAM using some technological measure, but also to detect textual content that represents "grooming behavior." Only the "known" CSAM detection approach has ever even been attempted in a production system (with significant limitations) and that system was not ultimately deployed due to technical and customer concerns. But as much as CSAM media scanning worries me, the idea of automated ML-based text analysis for something as vague as "grooming behavior" is frankly terrifying. And I haven't even considered the slippery slope that becomes visible the second you build text-analysis and reporting systems into encrypted communications.

What is much more concerning than the legislation is the Impact Assessment [2], which is cited in the legislation to justify its reasoning. Specifically, the Impact Assessment recommends Option E, which is "mandatory scanning of all known and unknown CSAM, as well as textual detection of 'grooming behavior'" even in systems that deploy E2E encryption.

Where the legislation is vague about E2E encryption, the impact assessment [2] leaves no scrap of ambiguity: it makes clear that the need for these mandatory scanning mechanisms is almost entirely a response to the increasing deployment of E2EE, and specifically cites Facebook's (still un-deployed) 2019 encryption announcement to support its argument for a mandatory scanning requirement. It uncritically cites Apple's (since withdrawn) CSAM scanner (p. 39) as an example of a balanced privacy solution. It refers vaguely to the existence of scanners capable of detecting unknown CSAM, barely acknowledging that such techniques are entirely at the hypothetical/research stage and may not be safe at all. Finally, it provides a privacy analysis that somehow concludes that the privacy benefits "in protecting victims" naturally outweigh all other concerns that might pop up around the deployment of what will be the world's most powerful ML-based text and media mass-surveillance system for encrypted and unencrypted private messages.

Take note: while the authors don't use that terminology, readers should have no doubt that this is what the EU is proposing to build with this legislation.

[1] https://eur-lex.europa.eu/resource.html?uri=cellar:13e33abf-...

[2] https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELE...

The ambiguity is by design.

It puts the onus on the implementer to be overzealous for fear of being held criminally liable. This has the added benefit (to the slimy legislators pushing this garbage) of allowing them to scapegoat any perceived excesses onto corporations and developers.