by nsplayer 1069 days ago
>Ensuring that consumers are able to spot A.I.-generated material by implementing watermarks or other means of identifying generated content.

This will either be impossible or something we find out the NSA invented 10 years ago.

Would end up being amazing if we could use these to de-enshittify the internet by automatically removing/filtering existing content which triggers


as long as it's about content they generated which wasn't modified (at most cropped or scaled), this is quite viable

e.g. you can encode a subtle pattern in the generated image which survives compression and isn't really human visible

then you make a browser extension to spot that pattern and indicate it to the users "in some way"

given that there is an overlap between AI company owners and the biggest browser producers and mobile OS vendors, this doesn't even need to be an extension but can be built in

obviously any bad actor is likely able to remove it or otherwise still trick users

> encode a subtle pattern in the generated image which survives compression and isn't really human visible

This is basically a contradiction in terms. Compression attempts to throw away any and all data that "isn't really human visible," that's how it works. There isn't space for invisible watermarks by design. You can kind of get away with something "at the edge" that survives an initial JPEG encoding, but there's no way it's going to reliably survive e.g. resizing, cropping, and recompressing and still remain invisible.
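A minimal sketch of that failure mode, assuming the most naive scheme possible: the watermark lives in the least significant bit of each pixel, and "compression" is modeled as snapping values to multiples of 4. The function names and the pixel values are made up for illustration, and real codecs quantize frequency coefficients rather than raw pixels, so this only shows why data hidden in the imperceptible bits is fragile, not that every watermark is.

```python
def embed_lsb(pixels, bits):
    """Overwrite the least significant bit of each pixel with a watermark bit."""
    return [(p & ~1) | b for p, b in zip(pixels, bits)]


def extract_lsb(pixels):
    """Read the watermark back from the least significant bits."""
    return [p & 1 for p in pixels]


def requantize(pixels, step=4):
    """Crude stand-in for lossy compression: snap each value down to a multiple of `step`."""
    return [(p // step) * step for p in pixels]


pixels = [52, 55, 61, 66, 70, 61, 64, 73]   # illustrative grayscale values
bits = [1, 0, 1, 1, 0, 0, 1, 0]             # illustrative watermark

marked = embed_lsb(pixels, bits)
print(extract_lsb(marked))               # the watermark reads back intact
print(extract_lsb(requantize(marked)))   # all zeros: the watermark is gone
```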

Also, most AI-generated content is presumably going to be text, not images. Good luck watermarking text that's a paragraph long. (There are potential tools that can operate on text the size of a news article, but they are also trivially defeated by swapping a few prepositions and synonyms.)

> watermarking text that's a paragraph long

unicode has a ton of room for that
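One way to read that claim, as a sketch only: hide bits in zero-width code points, which most renderers draw as nothing. The choice of U+200B and U+200C, the placement after the first word, and all function names are assumptions made for this example. The last function also shows how little it takes to remove such a mark.

```python
ZERO = "\u200b"  # ZERO WIDTH SPACE, used here to encode bit 0
ONE = "\u200c"   # ZERO WIDTH NON-JOINER, used here to encode bit 1


def embed(text, bits):
    """Insert one invisible character per bit right after the first word."""
    head, sep, tail = text.partition(" ")
    mark = "".join(ONE if b else ZERO for b in bits)
    return head + mark + sep + tail


def extract(text):
    """Collect the hidden bits in order, ignoring every visible character."""
    return [1 if ch == ONE else 0 for ch in text if ch in (ZERO, ONE)]


def strip_marks(text):
    """What any copy-paste sanitizer could do: drop the zero-width characters."""
    return text.replace(ZERO, "").replace(ONE, "")


original = "Good luck watermarking text"
marked = embed(original, [1, 0, 1, 1])
print(extract(marked))               # the hidden bits
print(strip_marks(marked) == original)
```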

> There isn't space for invisible watermarks by design.

if it is impossible, why does it exist?

> There isn't space for invisible watermarks by design.

Very incorrect. Steghide [1] supports JPEG. JPEG and other lossy image formats are ultimately just fancy file formats; there's nothing preventing you from encoding arbitrary messages in a compressed image.

> You can kind of get away with something "at the edge" that survives an initial JPEG encoding, but there's no way it's going to reliably survive e.g. resizing, cropping, and recompressing and still remain invisible.

I am pretty sure that I can design a steganography algorithm that disperses a small message across a JPEG in a way that is:

1. invariant to resizing (absolutely certain this is possible),

2. robust to cropping (invariant to cropping up to some limit is definitely possible; eg if you crop 100% of the image then obviously everything goes out the window),

3. robust or even invariant to recompression. This seems a lot harder but I'm pretty sure it's possible.
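For the cropping point, one common idea is redundancy plus a sync marker: repeat the message many times so that any sufficiently large crop still contains one whole copy. The sketch below works on an abstract bit stream, not on a real JPEG, and assumes the crop removes bits without corrupting the ones that remain. `PREAMBLE`, `tile`, and `recover` are hypothetical names, and the payload is assumed not to contain the preamble pattern.

```python
PREAMBLE = [1, 1, 1, 1, 0]  # hypothetical sync marker; must not occur inside the payload


def tile(payload, copies):
    """Repeat preamble + payload so every region of the carrier holds a copy."""
    return (PREAMBLE + payload) * copies


def recover(cropped, payload_len):
    """Find the first complete preamble + payload frame in a cropped stream."""
    n = len(PREAMBLE)
    for i in range(len(cropped) - n - payload_len + 1):
        if cropped[i:i + n] == PREAMBLE:
            return cropped[i + n:i + n + payload_len]
    return None  # cropped too far: no complete frame left


payload = [0, 1, 0, 0, 1, 0]
carrier = tile(payload, copies=4)

print(recover(carrier[7:35], len(payload)))  # crop at an arbitrary offset
print(recover(carrier[:8], len(payload)))    # crop that leaves less than one frame
```

The limit mentioned in point 2 shows up directly: recovery needs at least one full frame to survive the crop.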

> Also, most AI-generated content is presumably going to be text, not images. Good luck watermarking text that's a paragraph long. (There are potential tools that can operate on text the size of a news article, but they are also trivially defeated by swapping a few prepositions and synonyms.)

Yeah, text seems more difficult. Images are also difficult/impossible if you assume the model user is adversarial and competent, which I'm not sure why you wouldn't assume.

For any particular model you can probably do detection with a fair bit of inaccuracy. But I would definitely put detection in the "doomed" category.

I also think the threat is real but wildly over-stated relative to the non-AI status quo. We're slightly democratizing Photoshop and copywriting skills, which weren't exactly scarce to begin with. It's not an AI problem, and it's barely a technology problem. It's primarily a political problem.

[1] https://github.com/StefanoDeVuono/steghide

> 3. robust or even invariant to recompression. This seems a lot harder but I'm pretty sure it's possible.

No, that's my main point. By definition, "perfect" compression will discard everything not human-noticeable, which leaves no room for watermarks/steganography. So the only room for watermarks is in the margin where compression is currently imperfect, i.e. encoding more detail than needed.

But that's relying on artifacts that vary dramatically with compression technique (JPG vs PNG vs WebP etc.), with basic image manipulation (adjusting brightness, contrast, color, etc.), and other basic operations like resizing. So as soon as you chain any of these together, watermarking falls apart.

> Steghide [1] supports JPEG.

Yes, I already said in my comment 'You can kind of get away with something "at the edge" that survives an initial JPEG encoding'. But as I'm saying, it's not robust or reliable as images get reused. The whole point of a watermark is that it survives copying -- e.g. they would show up as dark text if you xeroxed a watermarked document. That type of robustness or reliability is just not possible here as users download and re-upload images that get re-encoded, because the entire point of image compression is to try to throw away anything and everything the human eye doesn't care about.

> perfect

I think you're being distracted and confused by imprecise and inaccurate descriptions of the intent of various algorithms, instead of considering what actual algorithms actually do.

No existing compression algorithm was designed to be "perfect" in your sense of the word, and none are. They were designed to be good enough, under a lot of different constraints (ease of implementation, extant mathematical tools/knowledge at the time, computation time, etc.)

Instead of talking in hand-wavy terms about hypothetical objects in a wishy-washy way, let's remember that lossy image formats and compression schemes are just pieces of mathematics. E.g., the basic JPEG algorithm is a fairly simple procedure that can be explained to any college student and even moderately above average middle schoolers. Is it perfect? No. Is it what actually gets used in reality? Yes.

> That type of robustness or reliability is just not possible here as users download and re-upload images that get re-encoded, because the entire point of image compression is to try to throw away anything and everything the human eye doesn't care about.

Let F : IMG -> IMG be a set of functions that most/all users and platforms use for compressing images. The question is whether there exist a pair of functions s,t such that for an image i \in IMG:

1. s(i) is roughly the same as i to the bare human eye but contains a message m. (Or not? Depends on the use-case.)

2. t(s(i)) ~= m for some notion of similarity ~= which is sufficient for watermarking.

3. for any f1, ..., fn \in F, t(fn(...f1(s(i))...)) ~= m.

We can relax constraint 1 because we probably only care about a subset of IMG, etc. etc.

Your impossibility conjecture about the existence of s,t for common extant F's doesn't seem nearly as obvious as you're claiming. And there are CERTAINLY choices of F for which s,t do exist. Eg for the basic JPEG algorithm I'm pretty darn confident I can design s,t that are robust to various parameters and also where you only need at least k pixels uncropped to recover a message ~= m, for example. And not just design it, but write a fairly short and intuitive mathematical proof explaining precisely why it works.

In fact, if you know how JPEG works and other applications of Fourier transforms (eg in acoustics and perhaps also crypto), you might see why it would be more surprising if doing this were NOT possible at least for various JPEG implementations/parameterizations!
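To make this concrete, here is a toy sketch of one standard technique, quantization index modulation on a single DCT coefficient. It is not necessarily the design I have in mind above, and it is 1-D (one 8-sample block) rather than a real JPEG pipeline. `DELTA`, `K`, the quantizer step `q`, and the sample block are all assumed values; `embed` plays the role of s, `extract` the role of t, and `recompress` is one member of F.

```python
import math

N = 8          # block length (JPEG uses 8x8 blocks; this sketch is 1-D)
DELTA = 24.0   # embedding step (assumed): larger is more robust but more visible
K = 3          # index of the coefficient that carries the bit (assumed)


def _alpha(k):
    return math.sqrt((1.0 if k == 0 else 2.0) / N)


def dct(x):
    """Orthonormal DCT-II of one block."""
    return [_alpha(k) * sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                            for n in range(N)) for k in range(N)]


def idct(X):
    """Inverse of dct()."""
    return [sum(_alpha(k) * X[k] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for k in range(N)) for n in range(N)]


def to_pixels(x):
    """Round to integers and clip to the 8-bit range."""
    return [min(255, max(0, round(v))) for v in x]


def embed(block, bit):
    """s: move coefficient K to the nearest multiple of DELTA whose parity is `bit`."""
    X = dct(block)
    m = round((X[K] / DELTA - bit) / 2)
    X[K] = DELTA * (2 * m + bit)
    return to_pixels(idct(X))


def extract(block):
    """t: read the parity of the nearest multiple of DELTA."""
    return round(dct(block)[K] / DELTA) % 2


def recompress(block, q=10.0):
    """An f in F: quantize every coefficient with step q, as a JPEG-like codec would."""
    return to_pixels(idct([round(c / q) * q for c in dct(block)]))


block = [52, 55, 61, 66, 70, 61, 64, 73]   # illustrative grayscale values
for bit in (0, 1):
    marked = embed(block, bit)
    print(bit, extract(marked), extract(recompress(marked)))
```

The short proof is the error budget: as long as no pixel is clipped, quantization moves the carrier coefficient by at most q/2 = 5, each pixel-rounding pass moves it by at most sqrt(8)/2 ≈ 1.41 (the transform is orthonormal), and the total stays below DELTA/2 = 12, so the parity is preserved. The price is visibility, since the embedding can shift the coefficient by up to DELTA.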

Stepping aside from JPEG in particular, you might need to know something about how each function in F works, perhaps intimately, and there is probably some clever mathematics involved for many choices of F.

You could also view this as an optimization problem and use various tools that got really popular in 2016 or so to build quite robust solutions that don't depend on the particularities of your choice of F. I'm less certain this would give absolute guarantees but I bet you'd end up with stuff that works well in practice.

But in any case, it's unclear why you are so convinced this is impossible.

Are you describing DCT in particular?

I would like to know the same thing. In the case of JPEG and similar schemes in particular, an impossibility result that isn't unrealistically narrowly scoped (again assuming non-adversarial user) would be highly surprising.

> then you make a browser extension to spot that pattern and indicate it to the users "in some way"

If that's an Open standard, and if that browser extension is Open Source, then anyone who wants to avoid that can mess with the final image until the Open Source free standard that everyone is using no longer detects the image.

The only way this is viable is with DRM or a closed service; if it's a standard everyone follows, then circumventing it is trivial. The only way it would work is if it's shrouded in secrecy and attackers can't freely red-team the tool. These kinds of watermarks work when there's a very limited pool of people checking for the watermarks, they don't tell people how the watermarks are generated, and they don't tell people how to check for the watermarks.

But that's not really useful for the current situation -- we don't want to further entrench these companies and we don't want it to be costly to check if an image is AI-generated.

I don't think it's viable to do this without significantly curtailing user agency or designing a system that is fully opaque and inaccessible to most people.

Yeah, this is my general feeling about why this area is doomed. It's why I haven't bothered to write up a patent even though I had some good ideas a few years ago. Maybe that was a mistake since governments and corporate politicians are stupider than I assumed.

If you have a central source of authority then the problem is totally trivial and the fact that the images are AI-generated (or not) is a complete red herring.

If you don't have a central source of authority then any reasonable adversarial model makes the watermarking problem somewhere between very difficult and impossible.

Detection from known models is still possible, at least for images. But that's not really watermarking per se.

We need regulations requiring mandatory watermarks on the outputs of troll bots