|
|
|
|
|
by zamadatix
462 days ago
|
|
Whether you train the model how to do math internally or tell it to call an external model which only does math the root problem still exists. It's not as if a model which only does math won't hallucinate how to solve math problems just because it doesn't know about history, for the same number of parameters it's probably better to not have to duplicate the parts needed to understand the basis of things multiple times. The root problem is training models to be uncertain of their answers results in lower benchmarks in every area except hallucinations. It's like you were in a multiple choice test and instead of picking which of answers A-D you think made more sense you picked E "I don't know". Helpful for the test grader, a bad bet for the model trying to claim it gets the most answers right compared to other models. |
|
This is a problem for testing humans too and the solution is simply to mark a wrong answer more harshly than a non-answer.