Hacker News
by Eisenstein 3 days ago
Please explain why the mechanism by which an LLM generates output precludes it from being fooled, without using tautologies or reducing the explanation to the substrate.
1 comments

'Fooling', essentially to deceive or trick, is defined as causing someone to believe an untruth, at least in British English.

LLMs don't hold beliefs (neither do mechanistic processes), and they aren't a someone.

You can widen the definitions of words, but that generally makes language weaker. Interestingly, semantic drift is a big issue for LLMs.