by energy123 309 days ago
Basic prose is a saturated benchmark. You can't score above 100%, so by definition progress will stall on such benchmarks.

All the same, they choose to highlight basic prose (and internal knowledge, for that matter) in their marketing material.

They’ve done a lot to make recent models more reliable as building blocks and more capable at things like math, but for LLMs, saturating prose is to a degree equivalent to saturating usefulness.