| HN Mirror



	by SkyPuncher 1058 days ago
	I don’t think this is a particularly useful benchmark. It’s well known that LLMs are bad at math. Token-based weighting can’t properly account for numbers, which can vary wildly. Numbers are effectively wildcards in the LLM world.
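	To make the tokenization point concrete, here is a rough sketch of how a subword tokenizer might carve up numbers. The vocabulary and the `toy_tokenize` helper are made up for illustration and do not correspond to any real model's tokenizer. It only shows that numbers that are close in value can end up as quite different token sequences.

```python
# Hypothetical toy vocabulary: a few multi-digit chunks plus single digits.
# Real tokenizers learn their vocabulary from data; this one is invented.
TOY_VOCAB = {"100", "12", "99"} | set("0123456789")


def toy_tokenize(text, vocab=TOY_VOCAB):
    """Greedy longest-match split of `text` into vocabulary pieces."""
    max_len = max(len(piece) for piece in vocab)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in vocab:
                tokens.append(piece)
                i += size
                break
        else:
            # Character not in the vocabulary: emit it as its own token.
            tokens.append(text[i])
            i += 1
    return tokens


for number in ["100", "1000", "1001", "12345"]:
    print(number, toy_tokenize(number))
```

	In this toy setup "100" is one token while "1000" becomes two, and "12345" becomes four, so the model never sees a number as a single quantity.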

1 comment

Surely this is a "didn't read the question properly" problem rather than a "didn't maths right" problem?

And that (understanding a natural language question) is the USP for LLMs.