| HN Mirror



	by adverbly 483 days ago
	This does look like a large relative increase in score, but it seems to come from going from zero correct out of 6 to 1 and 1/2 correct. I think it's fair to say the sample size here is relatively small. Still, a record is a record! Congrats to the team for a new record!

1 comment

onlyrealcuzzo 483 days ago

From my small sample size (tens of queries per day), Gemini 2.5 seems like a noticeable improvement in (almost) every way compared to previous Gemini models.

Answers do seem to take longer to generate, but they are well worth the cost.
