by PeterisP 916 days ago
No, there definitely is CSAM on the clearweb; it's just usually ephemeral, because it gets removed about as fast as it's added. Having a process to fight CSAM is (rather expensive) table stakes for anyone who wants to run a public service that includes user-generated content, because you will get "this kind of content" posted on your platform.

E.g. one of the issues with the changes in Twitter moderation after the management change was that reducing the moderation manpower meant CSAM on Twitter suddenly became more prevalent.

The same applies to other media, e.g. an analysis of Mastodon (https://www.theverge.com/2023/7/24/23806093/mastodon-csam-st...) found a pretty similar "CSAM rate" to the one in this LAION dataset.

1 comments

Hm, I guess that makes sense: if there is a CSAM "fill rate" on the clearweb of X images per unit time, and a "remove rate" that is also approximately X, then (assuming each image stays up for roughly one unit of time before removal) there will always be a "rolling buffer" of around X images. Its content would change every unit of time, but the quantity would stay about the same, and that's probably what the algorithm picked up.
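
To make the "rolling buffer" idea concrete, here is a minimal sketch of that inflow/outflow model. Everything in it is an assumption for illustration: the function name `simulate_rolling_buffer`, the fixed `lifetime` (how many time steps an item stays up before removal), and the numbers are made up, not measurements of anything. It only shows that when items arrive at a constant rate and are removed after a fixed delay, the amount online at any moment settles at roughly fill rate times lifetime.

```python
from collections import deque


def simulate_rolling_buffer(fill_rate, lifetime, steps):
    """Return how many items are online after each time step.

    fill_rate: items added per step (hypothetical, illustrative value)
    lifetime:  number of steps an item stays up before it is removed
    steps:     how many time steps to simulate
    """
    online = deque()  # one entry per step: the batch size added in that step
    history = []
    for _ in range(steps):
        online.append(fill_rate)      # new items posted during this step
        if len(online) > lifetime:    # the oldest batch has outlived `lifetime`
            online.popleft()          # ...so it gets removed
        history.append(sum(online))   # items currently online
    return history


# Illustrative numbers only: 100 items per step, each staying up for 3 steps.
history = simulate_rolling_buffer(fill_rate=100, lifetime=3, steps=10)
print(history)
```

With these made-up inputs the count climbs for the first few steps and then stays flat at `fill_rate * lifetime`, even though the individual items are replaced continuously. With `lifetime=1` the plateau is just `fill_rate`, which is the "around X images" case described above.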