That assumes all thumbnails or pictures hash to the same value. Why would that be? There can be different algorithms, resolutions, crops... And there can be millions of thumbnails to check.
Yeah, it's a gnarly problem for sure. I'm sure some sort of hash-based blocking would knock out 80% of the content related to a takedown request, but I could also see that last 20% being difficult af to purge.
The internet definitely remembers forever, unfortunately, especially when it comes to porn.
The same set of values, yes.
> Why would that be? There can be different algorithms, resolutions, crops...
Yes, so you do normalization and reduction before computing the hashes.
And ‘hashes’ doesn’t have to mean literal SHA. Any non-reversible fingerprinting algorithm will do.
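To make "normalization and reduction" concrete, here is a rough sketch of an average-hash style fingerprint. Everything in it is an assumption for illustration (the function names, the 8x8 grid, grayscale images as plain lists of rows); it is not how any particular site does it. It shrinks every image to the same tiny grid, then records which cells are brighter than the mean, so rescaled or uniformly brightened copies should land on the same or a nearby fingerprint. Crops and heavier edits would need more than this.

```python
def shrink(pixels, size=8):
    """Normalize a grayscale image (list of rows of 0-255 ints) to a
    size x size grid by averaging blocks of pixels."""
    h, w = len(pixels), len(pixels[0])
    grid = []
    for by in range(size):
        y0 = by * h // size
        y1 = max((by + 1) * h // size, y0 + 1)  # at least one source row
        row = []
        for bx in range(size):
            x0 = bx * w // size
            x1 = max((bx + 1) * w // size, x0 + 1)  # at least one source column
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        grid.append(row)
    return grid


def average_hash(pixels, size=8):
    """Fingerprint: one bit per grid cell, set when the cell is brighter
    than the mean of the whole grid. Returns the bits packed into an int."""
    cells = [v for row in shrink(pixels, size) for v in row]
    mean = sum(cells) / len(cells)
    bits = 0
    for v in cells:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def probably_same(a, b, max_distance=5):
    """Treat fingerprints within a small Hamming distance as a match.
    The threshold of 5 is an arbitrary, hypothetical choice."""
    return hamming(a, b) <= max_distance
```

You would store the fingerprints from the takedown request and compare each candidate thumbnail with `probably_same(average_hash(candidate), stored)`. Matching on a distance threshold instead of exact equality is what lets a "set of values" cover different resolutions and re-encodes.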