| HN Mirror



	by lwhi 1213 days ago
	Perhaps AI generated text should be created with a specific signature in mind _specifically_ to be identifiable?

2 comments

wongarsu 1213 days ago

There's a large body of research into invisible text watermarking, so this would certainly be possible. Maybe the simplest to implement in LLMs would be to bias the token generation slightly, for example by making tokens that include the letter i slightly more likely. In a long enough text you could then see the deviation from normal human text characteristics.
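A minimal sketch of the biasing idea above, under loud assumptions: the 16-word vocabulary, the uniform toy "model", the bias value, and the function names are all hypothetical stand-ins, not how any real LLM or watermarking scheme works. It nudges the logit of every token containing the letter "i", then checks with a one-proportion z-test whether a sample's "i"-token rate deviates from the unbiased baseline.

```python
import math
import random

# Hypothetical toy vocabulary; a real LLM has tens of thousands of tokens.
VOCAB = ["the", "cat", "sat", "on", "mat", "with", "its", "kitten",
         "in", "rain", "dog", "ran", "far", "big", "red", "sun"]


def sample_tokens(n, bias=0.0, seed=0):
    """Sample n tokens from a uniform toy 'model'.

    `bias` is added to the logit of every token containing the letter 'i',
    which makes those tokens slightly more likely when bias > 0.
    """
    rng = random.Random(seed)
    logits = [bias if "i" in tok else 0.0 for tok in VOCAB]
    weights = [math.exp(logit) for logit in logits]
    return rng.choices(VOCAB, weights=weights, k=n)


def i_fraction(tokens):
    """Fraction of tokens that contain the letter 'i'."""
    return sum("i" in tok for tok in tokens) / len(tokens)


def z_score(tokens, baseline):
    """One-proportion z-score of the observed 'i'-token rate vs. the baseline rate."""
    n = len(tokens)
    std_err = math.sqrt(baseline * (1 - baseline) / n)
    return (i_fraction(tokens) - baseline) / std_err


if __name__ == "__main__":
    # Baseline rate for the unbiased toy model: share of vocabulary containing 'i'.
    baseline = i_fraction(VOCAB)
    marked = sample_tokens(5000, bias=0.3, seed=1)
    plain = sample_tokens(5000, bias=0.0, seed=2)
    print("marked z:", round(z_score(marked, baseline), 2))
    print("plain  z:", round(z_score(plain, baseline), 2))
```

The longer the text, the smaller the standard error, so even a slight bias eventually stands out. In practice the baseline would have to be estimated from real human text rather than read off the vocabulary.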


PeterisP 1213 days ago

All the main scenarios for detecting generated text that I can imagine have to assume that the LLM isn't "cooperative" but is instead specifically designed (or fine-tuned) to avoid detection.


lwhi 1213 days ago

Yep, I think it's a good idea.


mattnewton 1213 days ago

Isn’t this essentially asking anyone who runs a model to flip the evil bit[0]? People who want to misrepresent model output as human-written output will trivially be able to beat this protection by removing the signature or using a version of the model that simply doesn’t add it.

[0] https://en.m.wikipedia.org/wiki/Evil_bit


lwhi 1213 days ago

I think you're correct, but it would promote the idea that what's produced is a basis or source for further work, and it would mean that some effort is required to remove the signature.

Feels like a good basis tech for something like ChatGPT.
