
You might be right, but my first instinct is that this probably wouldn't happen often enough to throw off the watermarking too badly.

The word the watermark favors is determined by the previous four, and the scheme only works when there is enough entropy that any of several words would fit. So it's not a simple matter of humans picking up particular word choices. There might be some cases where three tokens in a row occur with low entropy after the first token, followed by one high-entropy token at the end. That would cause a particular five-word phrase to recur. Otherwise, the word choice would appear pretty random. I don't think humans pick up on stuff like that even subconsciously, but I could be wrong.
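To make that concrete, here is a rough Python sketch of the kind of scheme I have in mind. It is a toy under my own assumptions, not the actual implementation of any real watermark: `SECRET_KEY`, `MIN_ENTROPY_BITS`, and the function names are all made up for illustration, and the "score to the power 1/p" selection rule is just one way to bias a choice with a keyed hash.

```python
import hashlib
import math

SECRET_KEY = b"demo-key"   # hypothetical; a real scheme would keep its key private
CONTEXT_LEN = 4            # "the previous four" tokens
MIN_ENTROPY_BITS = 1.0     # hypothetical threshold for "enough entropy"


def entropy_bits(probs):
    """Shannon entropy of a next-token distribution, in bits."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)


def keyed_score(context, token):
    """Pseudorandom score in [0, 1) from the key, the last four tokens, and the candidate."""
    parts = list(context[-CONTEXT_LEN:]) + [token]
    digest = hashlib.sha256(SECRET_KEY + "\x00".join(parts).encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def pick_token(context, probs):
    """Pick the next token, biasing the choice only when several tokens are plausible."""
    if entropy_bits(probs) < MIN_ENTROPY_BITS:
        # Low entropy: effectively one option, so the watermark has nothing to work with.
        return max(probs, key=probs.get)
    # High entropy: the keyed score decides, deterministically for a given context.
    candidates = [t for t, p in probs.items() if p > 0]
    return max(candidates, key=lambda t: keyed_score(context, t) ** (1.0 / probs[t]))
```

In this toy, the same four preceding tokens followed by the same high-entropy slot always produce the same pick, which is the recurring five-word phrase case. Anywhere else the pick depends on a hash of the context, so it should look random to a reader.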

I would be interested to see whether LLMs pick up the watermarks when trained on watermarked data, though. Evidently ChatGPT can decode base64 [0], so it seems like these things can pick up on some pretty subtle patterns.

[0] https://www.reddit.com/r/ChatGPT/comments/1645n6i/i_noticed_...