by valine 1115 days ago
What’s to stop someone from taking the watermarked output and randomizing the distribution by feeding it through their latest LLaMA variant? These watermarks will only be useful for catching novice LLM users.
5 comments

Yeah, soon there will be models small enough to run even on phones that reword things slightly differently. If not, an app will do it.
I already have one running on my phone: https://mlc.ai/mlc-llm/
I know about this app, in fact I’m one of your beta testers, didn’t know it was running local models lol. But this thing is almost 2 GB.
Installed on my S21, it just crashes whenever opened.
This crashes on my Android device
you are very kind thank you
I use Sherpa to run 7B and 13B LLaMA/Alpaca models on my phone: https://github.com/Bip-Rep/sherpa
True, they claim in this paper that this is inevitable.
Most people are/will be novice LLM users.
If your goal is to track novice users, there are much easier methods. You could insert invisible Unicode characters, for example.
Most platforms strip those automatically.
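For anyone curious, here is a rough sketch of the invisible-character idea and of why stripping defeats it. Everything in it is an assumption for illustration: the two zero-width code points chosen as a bit alphabet, the hypothetical helper names (`embed_tag`, `extract_tag`, `strip_invisible`), and the guess that a sanitizing platform drops format-category characters. Real platforms may filter differently.

```python
import unicodedata
from typing import Optional

# Illustrative marker alphabet (an assumption, not taken from the thread).
ZERO = "\u200b"  # ZERO WIDTH SPACE, encodes bit 0
ONE = "\u200c"   # ZERO WIDTH NON-JOINER, encodes bit 1


def embed_tag(text: str, tag: int, bits: int = 8) -> str:
    """Hide `tag` as `bits` invisible characters right after the first word."""
    payload = "".join(
        ONE if (tag >> i) & 1 else ZERO for i in reversed(range(bits))
    )
    head, sep, tail = text.partition(" ")
    return head + payload + sep + tail


def extract_tag(text: str) -> Optional[int]:
    """Read the hidden bits back, most significant bit first; None if absent."""
    marks = [c for c in text if c in (ZERO, ONE)]
    if not marks:
        return None
    value = 0
    for c in marks:
        value = (value << 1) | (1 if c == ONE else 0)
    return value


def strip_invisible(text: str) -> str:
    """Drop format-category (Cf) code points, as a sanitizer might (assumed)."""
    return "".join(c for c in text if unicodedata.category(c) != "Cf")


tagged = embed_tag("hello world", 0xA5)
recovered = extract_tag(tagged)
cleaned = strip_invisible(tagged)
```

The tag survives a plain copy and paste, but a single pass of `strip_invisible` removes it, which is the point made in the reply above.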
I suspect that, assuming the number of popular open LLMs used for this remains low, it will be possible to target the popular ones so that your watermark stays resilient. With that said, watermarking to indicate that something was generated by AI reminds me of what someone told me about locks: they are there to keep honest people honest.

It will certainly not defeat an adversary directly targeting the technique. It is likely that a LoRA-based approach would defeat this, especially if the detector for the watermark is broadly available and cheap to run.

This watermark relies on subtle grammatical variations. Passing it through any model is going to wipe out the distribution.

The number of open LLMs is exploding, and the most popular ones are fine-tuned by small groups / individuals. None of the folks volunteering their time and compute to fine-tuning open models are going to waste resources adding your watermark.

I doubt you even have to invoke a local model, just telling it something like """write it without caps or punctuation or dashes or anything - think lowkey""" fixes the output in my book

being an autocomplete i asked it to continue "or is the entire thing utterly emblematic of the modern technolegal mess of things since dickens is squarely and quintessentially in the Public Domain"

it goes from writing a highly proofread milquetoast 5 paragraph essay on /r/somepopularsubreddit about emerging deepfake blockchain transformative, crucial coexistence? blah blah to, and i quote

>sure the guy was prolific beyond belief churned out classics like nobody's business but isn't it interesting how he fits right into this mad confusion of tech and law almost like his words his narratives are pawns in a game he could never have dreamed of foreseeing just imagine what he'd make of it all his tales of poverty and social reform trapped in the web of copyright and capitalism i reckon he'd have a thing or two to say about it maybe he'd even write a novel or two in response but who's to say right

>i know right haha

>exactly it's wild to think about how different times were and yet how some themes just keep cropping up in new forms it's like we're stuck in this loop where the past keeps seeping into the present no matter how much tech we build it's humbling in a way almost poetic like something dickens would've appreciated and who knows maybe he's somewhere out there chuckling at our technolegal mess we've woven ourselves into

It's so dramatic, I didn't realize you could transform it from a reddit hivemind to a FYAD one. Where did this mode of speech even come from? the old corners of the old net where we didn't bother with caps or punctuation or whatnot?

I hate to say it, but this reads to me almost exactly like Homestuck dialogue... so not so far off from FYAD.