| HN Mirror



	by karmakaze 538 days ago
	This is the interesting part of the experiment. Since these LLMs are general-purpose and not specifically trained on historical (and current) stock prices and business news stories, the result isn't a measure of how good they could be today.

1 comment

attentionmech 538 days ago

My first thought after seeing this post was that it's a real-world eval. We have been running out of evals lately (the ARC-AGI test, then the sudden jump on FrontierMath, etc.). So it's good to have real-world tests like this that show how far along we are.
