by GodelNumbering
14 days ago
The more interesting part is probably worth highlighting: the SAME model won't always return the same output when prompted with the same fact check. Ask a human a fact-check question 1000 times and they give the same answer 1000 times. Ask an LLM the same question 1000 times and the results can vary significantly. Humans rely on metamemory (knowing what they know), while LLMs sample from a probability distribution over possible outputs.
I have labeled datasets with a human team and shown the same task to the same user on a different day, and they answered differently. Of course, they are consistent with themselves most of the time, but not always.
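
If you want to put a number on "consistent most of the time", one rough option is to ask the same question repeatedly and report the share of answers that match the most common one. The sketch below is a toy simulation, not a real model call: the answer labels, the `logits` scores, and the helper names (`softmax`, `sample_answers`, `self_consistency`) are all made up for illustration, and temperature-scaled sampling is assumed as the source of variation. The same `self_consistency` function would work on repeated human labels too.

```python
import math
import random
from collections import Counter


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_answers(answers, logits, n, temperature=1.0, seed=0):
    """Draw n answers to the same question from the toy distribution."""
    rng = random.Random(seed)
    probs = softmax(logits, temperature)
    return rng.choices(answers, weights=probs, k=n)


def self_consistency(samples):
    """Fraction of samples that agree with the most common answer."""
    counts = Counter(samples)
    return counts.most_common(1)[0][1] / len(samples)


# Hypothetical fact-check verdicts and made-up scores for one question.
answers = ["true", "false", "unverifiable"]
logits = [2.0, 1.0, 0.5]

for temperature in (0.1, 1.0):
    samples = sample_answers(answers, logits, n=1000, temperature=temperature)
    print(f"temperature={temperature}: consistency={self_consistency(samples):.3f}")
```

With these made-up scores, the low-temperature run agrees with itself almost every time, while the run at temperature 1.0 lands on the top answer only a bit more than half the time.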