| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by Jensson 857 days ago
	I know why it is hard for LLMs to learn this, that was the whole point. The way we make LLMs today means they can't identify such structures, and that is strong evidence they wont become smart just by scaling since all the things you brought up will still be true as we scale up. To solve this you would need some sub networks that are pretrained to handle numbers and math and other domains, and then you start training the giant LLM it can find and connect those things. But we don't know how to do that well yet afaik, and I bet all the big players has already tested things like that. As you say adding capabilities to the same model is hard.

1 comments

HarHarVeryFunny 856 days ago

An LLM can learn to identify math easily enough, it's just that performing calculations just using language isn't very efficient, even if it's basically what we do ourselves. If you want an LLM to do it like us, then give it a pencil and paper ("think step by step").

If you want the LLM to be better than a human at math, then give it a calculator, or access to something like Wolfram Alpha for harder problems. Your proposed solution of "give it a specialized NN for math" is basically the same, but if you are going to give it a tool, they why not give it a more powerful one like a calculator ?!