by eternityforest
478 days ago
I still don't understand why all the datasets have so many general knowledge questions and so much math, when so few people can do any of that stuff. It makes sense for ASI research, I suppose, but why are we trying to teach small models to do stuff almost no humans even try to do? What happens if you train them with RAG context in the prompts and calculator calls in the CoT?

I agree with your meta-point that better benchmarks testing more types of task would be good!