|
|
|
|
|
by reshlo
605 days ago
|
|
You’re missing my point. Take one of the articles described in that comment, titled “The Internal State of an LLM Knows When It's Lying”. It states “In this paper, we provide evidence that the LLM's internal state can be used to reveal the truthfulness of statements.” Both of these are untrue, for a number of reasons. - An LLM knowing when it is lying is not the same thing as its internal state being able to “reveal the truthfulness of statements”. The LLM does not know when it is lying, because LLMs do not know things. - It is incapable of lying, because lying requires possessing intent to lie. Stating untrue things is not the same as lying. - As the paper states shortly afterwards, what it actually shows is “given a set of test sentences, of which half are true and half false, our trained classifier achieves an average of 71% to 83% accuracy”. That’s not the same thing as it being able to “reveal the truthfulness of statements”. No intellectually honest person would claim that this finding means an LLM “knows when it is lying”. |
|
You keep saying the same nonsense over and over again. A LLM does not know things so... What kind of argument is that ? You're working backwards from a conclusion that is nothing but your own erroneous convictions on what a "statistical model" is and are undertaking a whole lot of mental gymnastics to stay there.
There are a lot of papers there that all try to approach this in different ways. You should read them and try to make an honest argument and that doesn't involve "This doesn't count because - claim that is in no way empirically or theoretically validated."