by julianh65 1243 days ago

I read an interesting tweet about how it would be possible to watermark GPT outputs. Essentially, before each token is generated, the previous token is used to seed an RNG. The RNG then splits the possible next tokens into a whitelist and a blacklist, and the model can only select tokens from the whitelist. Later on, it's possible to "check" for the watermark by counting how many tokens fall on their whitelist and testing statistically whether that count is higher than chance would give.

Apparently they can preserve performance by skipping this for very low-entropy tokens, where only one token is extremely likely.
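
Here is a rough Python sketch of how I understand the idea. Everything in it is my own toy assumption, not the actual scheme from the tweet: the vocabulary is just the integers 0..999, the whitelist is half the vocabulary, the "model" picks tokens uniformly at random, and the names `whitelist`, `generate` and `z_score` are made up for illustration. It also leaves out the low-entropy exception.

```python
import math
import random

VOCAB_SIZE = 1000   # hypothetical toy vocabulary: token ids 0..999
FRACTION = 0.5      # assumed share of the vocabulary put on the whitelist


def whitelist(prev_token):
    """Seed an RNG with the previous token and pick the allowed next tokens."""
    rng = random.Random(prev_token)
    return set(rng.sample(range(VOCAB_SIZE), int(VOCAB_SIZE * FRACTION)))


def generate(length, seed=0, watermark=True):
    """Stand-in for a language model: it samples uniformly at random."""
    model_rng = random.Random(seed)
    tokens = [model_rng.randrange(VOCAB_SIZE)]
    while len(tokens) < length:
        if watermark:
            # Only whitelist tokens are allowed as the next token.
            candidates = sorted(whitelist(tokens[-1]))
        else:
            candidates = range(VOCAB_SIZE)
        tokens.append(model_rng.choice(candidates))
    return tokens


def z_score(tokens):
    """Count whitelist hits and compare with what chance alone would give."""
    n = len(tokens) - 1  # the first token has no predecessor to seed from
    hits = sum(
        tokens[i] in whitelist(tokens[i - 1]) for i in range(1, len(tokens))
    )
    expected = FRACTION * n
    std = math.sqrt(n * FRACTION * (1 - FRACTION))
    return (hits - expected) / std


if __name__ == "__main__":
    print("watermarked:  ", z_score(generate(201, watermark=True)))
    print("unwatermarked:", z_score(generate(201, watermark=False)))
```

In this toy setup every watermarked token lands on its whitelist, so the z-score comes out large, while unwatermarked text hits the whitelist only about half the time and scores near zero. A real model would presumably have to trade this off against output quality, which is where the low-entropy exception would come in.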

Saw it here: https://twitter.com/tomgoldsteincs/status/161828766500640358...