|
|
|
|
|
by majormajor
1111 days ago
|
|
> Not if I was trying to win a race, but I can see how this particular example is a useful way to gauge how an LLM handles a task that looks at first like a simple math problem but requires some deeper insight to answer correctly. It's not just testing reasoning, though, it's also testing fairly niche knowledge. I think a better test of pure reasoning would include all the rules and tips like "it's good to have some buffer" in the prompt. |
|