| HN Mirror



	by viraptor 700 days ago
	You can't detect LLM output with any reasonable accuracy. You'd have both false positives and false negatives all over the place. If you solve that part on its own, that will be a SOTA method.


benreesman 700 days ago

This is a dangerous falsehood. OpenAI's since-cancelled polygraph had a 9% false-positive rate and a 26% true-positive rate. If I can lose a quarter of toxic bytes and need to enable JavaScript on one site in ten? Count me in!

I want more false positives.

https://openai.com/index/new-ai-classifier-for-indicating-ai...
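
To put the two quoted rates in context, here is a rough sketch of how often a flag would be correct at different shares of LLM-written text. The 26% and 9% figures come from the comment above; the base rates and the `precision` helper are assumptions chosen purely for illustration, not measurements.

```python
def precision(tpr, fpr, base_rate):
    """Share of flagged items that really are LLM-written (Bayes' rule)."""
    flagged_llm = tpr * base_rate              # LLM text that gets flagged
    flagged_human = fpr * (1.0 - base_rate)    # human text that gets flagged
    return flagged_llm / (flagged_llm + flagged_human)


TPR = 0.26  # true-positive rate quoted in the comment
FPR = 0.09  # false-positive rate quoted in the comment

# Hypothetical shares of LLM-written text in the stream being filtered.
for base_rate in (0.05, 0.25, 0.50):
    share_correct = precision(TPR, FPR, base_rate)
    print(f"{base_rate:.0%} LLM text: {share_correct:.0%} of flags are correct")
```

The same pair of rates looks very different depending on how much of the input is LLM-written, so whether the trade is worth it depends on the site being filtered.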


viraptor 700 days ago

Then don't use any website - 100% false positives. But seriously, it's a 9% rate for specific models at the time. It's a cat and mouse game and any fine-tuning or new release will throw it off. Also they don't say which 9% was misclassified, but I suspect it's the most important ones - the well-written articles. If I see a dumb tweet with a typo it's unlikely to come from an LLM (and if it does, who cares), but a well-written long-form article may have been slightly edited with an LLM and get caught. The 9% is not evenly distributed.
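
A small sketch of the "not evenly distributed" point: an overall 9% false-positive rate is a weighted average, so it can hide a much higher rate on one kind of text. The two categories, their shares, and their per-category rates below are made-up numbers picked only so that the average comes out to 9%; nothing here is taken from OpenAI's evaluation.

```python
def aggregate_fpr(strata):
    """Weighted-average false-positive rate over (share, fpr) pairs."""
    strata = list(strata)
    total_share = sum(share for share, _ in strata)
    return sum(share * fpr for share, fpr in strata) / total_share


# Hypothetical split of human-written text: (share of all text, FPR on it).
strata = {
    "short informal posts": (0.80, 0.02),
    "long-form articles": (0.20, 0.37),
}

overall = aggregate_fpr(strata.values())
print(f"overall false-positive rate: {overall:.0%}")
for name, (share, fpr) in strata.items():
    print(f"{name}: {fpr:.0%} of human-written items flagged")
```

Under these assumed numbers the headline rate is still 9%, yet more than a third of the long-form articles would be flagged.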


benreesman 700 days ago

It was a cat and mouse game before; spam always is. The inevitable reality that spam is a slog of a war isn’t a good argument for giving up.

I don’t know the current meta on LLM vs LLM detector, but if I had to pick one job or the other, I’d rather train a binary classifier to detect a giant randomized nucleus sampling decoder thing than fool a binary classifier with said Markov process thing.

Please don’t advocate for giving up on spam, that affects us all.
