by eternalban 1131 days ago
> What exactly makes anyone think that they can detect an LLM that is outputting text? The notion seems absurd yet it keeps coming up.

My sense of the general idea (non-authoritative): since the sequence emitted by an LLM is a probabilistic completion, i.e. predict the next word, the examiner can do the same by progressively processing the text. Given the assumption that the semantic relations extracted from the training corpus should be fairly universal for a given domain at the output level (even though distinct LLMs will likely have distinct embedding spaces), the examiner LLM should be able to assign probabilities to the predicted words. The idea is that genuinely human-produced text will have idiosyncrasies that are -not- probabilistically optimal, and the examiner can establish a sort of 'distance from probable mean' measure, with the expectation that LLM-produced text should be 'closer' to the examiner's predictions of 'the next word'.
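A rough sketch of that 'distance' measure, to make it concrete. Everything here is an assumption for illustration: a toy bigram model with add-one smoothing stands in for the examiner LLM, and the names `train_bigram` and `mean_surprisal` are made up. A real detector would score text with an actual LLM's next-token probabilities, but the arithmetic would look roughly the same: average the negative log-probability the examiner assigns to each word that actually appears.

```python
import math
from collections import Counter

def train_bigram(tokens):
    """Build a toy 'examiner': counts of contexts and of (context, next) pairs."""
    contexts = Counter(tokens[:-1])            # how often each word precedes another
    bigrams = Counter(zip(tokens, tokens[1:]))
    return contexts, bigrams, set(tokens)

def mean_surprisal(tokens, model):
    """Average bits of surprise per predicted word (lower = more predictable)."""
    contexts, bigrams, vocab = model
    v = len(vocab) + 1                          # one extra slot for unseen words
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        # Add-one smoothed estimate of P(cur | prev)
        p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + v)
        total += -math.log2(p)
    return total / (len(tokens) - 1)

# Hypothetical training text for the toy examiner
corpus = "the cat sat on the mat and the dog sat on the rug".split()
model = train_bigram(corpus)

predictable = "the cat sat on the rug".split()    # follows the corpus patterns
idiosyncratic = "rug the on dog mat cat".split()  # same words, unlikely order

print(mean_surprisal(predictable, model))
print(mean_surprisal(idiosyncratic, model))
```

The first score comes out lower than the second, which is the 'closer to the examiner's predictions' signal. Where to draw the line between "human" and "LLM" is the hard part and is not addressed by this sketch.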

The problem (if the above is correct) then is the missing 'prompt' and the meta-instructions embedded therein. Those should ("engineering") affect the output, possibly skewing the distance measure, thus defeating the examiner. But of course, say in the context of academia, the examiner can 'guess' some aspects of the prompt as well. For example, if you are examining papers for a specific assignment, the examiner can self-prompt too: "An essay on Hume's position on the knowledge of the self".


That only works if the temperature setting is low. If you set it to 1 then it will pick something that’s 1% likely 1% of the time, for example, which should match human text.
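To illustrate the temperature point, here is a small sketch. The three-word distribution and the name `apply_temperature` are invented for the example. It assumes the usual formulation where logits are divided by the temperature T before the softmax, which is the same as raising each probability to the power 1/T and renormalizing.

```python
import math

def apply_temperature(probs, temperature):
    """Rescale a next-token distribution: p_i ** (1/T), then renormalize.

    Assumes every probability is > 0 and temperature > 0.
    """
    scaled = [math.exp(math.log(p) / temperature) for p in probs]
    total = sum(scaled)
    return [s / total for s in scaled]

# Hypothetical next-word distribution: one likely word, one rare, one very rare
probs = [0.90, 0.09, 0.01]

for t in (0.2, 1.0, 2.0):
    print(t, [round(p, 4) for p in apply_temperature(probs, t)])
```

At T = 1 the distribution is unchanged, so the 1% word is still sampled about 1% of the time. At a low T the rare words all but vanish, which is the case where the examiner's 'most probable word' guess lines up best. Above 1 the rare words get picked more often than the model originally rated them.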
Definitely, should have mentioned that. However it is interesting that using temperature may lower the quality of the output and you may get a C and not the hoped for A+.
Could you run it once to get the A+ version, then feed it back that version saying "change x% of these words to be unlikely choices in a way that keeps all the meanings of the essay" to avoid letting the low-probability words hamper the main contents, just the way it's described?
As far as I know, lower temperature doesn’t necessarily mean higher quality, just more determinism and less creativity, so whether that’s desirable depends on what you’re trying to generate.

Temperature above 1 often results in nonsense though

I completely hedged that with two "may"s. Agreed re it depends on task at hand.