|
|
|
|
|
by Jensson
810 days ago
|
|
I know why it is hard for LLMs to learn this, that was the whole point. The way we make LLMs today means they can't identify such structures, and that is strong evidence they wont become smart just by scaling since all the things you brought up will still be true as we scale up. To solve this you would need some sub networks that are pretrained to handle numbers and math and other domains, and then you start training the giant LLM it can find and connect those things. But we don't know how to do that well yet afaik, and I bet all the big players has already tested things like that. As you say adding capabilities to the same model is hard. |
|
If you want the LLM to be better than a human at math, then give it a calculator, or access to something like Wolfram Alpha for harder problems. Your proposed solution of "give it a specialized NN for math" is basically the same, but if you are going to give it a tool, they why not give it a more powerful one like a calculator ?!