by ninetyninenine
495 days ago
You'll have to drop a bit of rigor here. I ask the question, what is 2 * 2, which is an obviously loaded question that's pattern matched to death. The LLM can answer "4" or "The answer is 4" or "looks like the answer is 4". All valid answers, but all the same. We count all 3 of those answers as just 4 out of the set of numbers. But we have to use our own language faculties to cut through the noise of the language itself.
Yeah, that was my point. Small codomain -> easy to validate. Large codomain -> open to interpretation. You implied that to prove reasoning, you pick a prompt with a large codomain, and if the LLM answers precisely, then voilà, reasoning.
So my question was: can you give an example of a prompt with a large codomain that isn't subject to wide interpretation? It seems the wider the codomain, the easier it is to say, "look! reasoning!"