by wgd
962 days ago
The approach proposed in this paper is to watermark LLM generated text using character-substitution from various simple characters (normal whitespace, normal letters, etc) to semantically equivalent Unicode code points (such as U+2004 THREE-PER-EM SPACE instead of normal spaces, or replacing specific character sequences with equivalent ligatures).

The authors appear to be entirely aware that this sort of substitution can be trivially stripped out by normalizing down to a simplified character set ("The critical limitation of Whitemark is that it can be bypassed by replacing all whitespaces with the basic whitespace U+0020, then the validator can no longer detect the watermark"), but believe that it still has value because the typical student using an LLM to write their essay won't know anything about Unicode.

This seems a bit naive to me. Implementing the necessary "watermark remover" normalization as a simple webapp would be an easy afternoon project for most of us here, and if this approach reached any sort of widespread use there would be many such sites. Students who intend to cheat by using an LLM to write their essays are entirely capable of learning "there's some secret data hidden in the text so copy-paste it through this other site to strip that out before turning it in". Even without access to such a tool they could simply...retype the text themselves?

Arguably this still has some value. In most contexts there is minimal downside to watermarking the generated text in this way, and a slight possibility of catching some cases in which people lazily present LLM generated text as human written. However, this might give people a misplaced belief that the absence of such a watermark means the text is authentically human authored, which might outweigh the benefits of catching the occasional lazy or ignorant user.
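To give a sense of how small that "watermark remover" would be, here is a rough Python sketch. It is not the paper's validator or anyone's actual tool: the function names `strip_watermark` and `looks_watermarked` are made up for illustration, and it assumes the watermark lives only in non-ASCII whitespace and compatibility characters such as ligatures.

```python
import unicodedata


def strip_watermark(text: str) -> str:
    """Fold text down to a plain character set (hypothetical helper).

    Assumption: the watermark is carried by alternative whitespace
    code points and compatibility characters like ligatures.
    """
    # NFKC replaces compatibility characters with their plain equivalents,
    # e.g. U+2004 THREE-PER-EM SPACE -> U+0020, U+FB01 (fi ligature) -> "fi".
    normalized = unicodedata.normalize("NFKC", text)
    # Some space separators (e.g. U+1680) have no decomposition, so map
    # anything in Unicode category Zs to a plain space as a second pass.
    return "".join(
        " " if unicodedata.category(ch) == "Zs" else ch for ch in normalized
    )


def looks_watermarked(text: str) -> bool:
    """Crude check: did normalization change anything at all?"""
    return strip_watermark(text) != text


sample = "The\u2004\ufb01rst\u2004draft"
print(strip_watermark(sample))    # The first draft
print(looks_watermarked(sample))  # True
```

Note that NFKC is lossy for legitimate text too (superscripts, full-width forms and so on get folded), so `looks_watermarked` will flag plenty of innocent input. That is fine for stripping, but it is part of why the absence or presence of odd code points is weak evidence either way.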
|
In fact there is precedent for this. When I was at school, a lot of kids would start writing an essay by copying and pasting the most relevant Wikipedia article into Microsoft Word and then editing it to sound different. This resulted in a subtle light-blue background being inserted into the printed page, which made it very obvious that they had copied from Wikipedia. They quickly learnt that they had to paste it through Notepad or similar first to get rid of the background colour.