A hash doesn't work without solving all of the same problems their provenance classifier solves: you need a way to be able to detect the steganogram even after the image has been modified.
> you need a way to be able to detect the steganogram even after the image has been modified.
True, but not really. If you look at any suggested method to prevent the spread of CSAM "at scale", it's based on large "hash databases" and not fancy classifiers (not yet, at least, if you can't develop one; and yes, the databases have "problems", because they don't only contain hashes for "bad" files, "bad" being "locale" dependent).
Also, you need to consider that at FAANG scale (where election interference matters), "just send TBs of data a day to our API" doesn't actually "scale": nobody's ever paying those API/bandwidth fees (as much as we'd like them to).
It would be much cheaper for OpenAI to run this AI on all the images they produced and publish hashes for the ones the model thought were "bad". But they never would, because then they'd become "censors". It's a complicated world where nobody wants any blame.
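To make the "publish hashes" idea concrete, here's a rough sketch of what a lookup could look like. Everything in it is made up for illustration: the 4x4 "image", the toy `average_hash`, and the distance threshold of 2 are assumptions, not anything OpenAI or a real hash database is known to do. It does show the parent's point: an exact hash stops matching after a trivial edit, so a published list only holds up if the hash is a perceptual one.

```python
import hashlib

def exact_hash(pixels):
    """SHA-256 over the raw pixel bytes: any change gives a different digest."""
    return hashlib.sha256(bytes(p for row in pixels for p in row)).hexdigest()

def average_hash(pixels):
    """Toy perceptual hash: one bit per pixel, set when the pixel is above the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")

# Hypothetical 4x4 grayscale "image" (values 0-255), purely illustrative.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 198, 208],
    [11, 21, 202, 212],
]
# A trivially modified copy: every pixel brightened by 3.
modified = [[min(255, p + 3) for p in row] for row in original]

# What a publisher might put out: a set of exact hashes and a list of perceptual hashes.
published_exact = {exact_hash(original)}
published_perceptual = [average_hash(original)]

# Exact lookup is a cheap set membership test, but it misses the modified copy.
exact_match = exact_hash(modified) in published_exact
# Perceptual lookup tolerates small edits (threshold of 2 bits is an arbitrary assumption).
near_match = any(hamming(average_hash(modified), h) <= 2 for h in published_perceptual)

print("exact match:", exact_match)
print("perceptual match:", near_match)
```

The lookup side is cheap either way, since the checker only compares small hashes locally instead of shipping images to someone's API. The toy hash here only survives a uniform brightness change; cropping, rescaling or re-encoding would need something far sturdier.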