| HN Mirror



	by tikotus 185 days ago
	Here's someone else testing models on a daily logic puzzle (Clues by Sam): https://www.nicksypteras.com/blog/cbs-benchmark.html GPT 5 Pro was already the winner in that test.

2 comments

thanhhaimai 185 days ago

This link doesn't have Gemini 3 performance on it. Do you have an updated link with the new models?


dezgeg 185 days ago

I've also tried Gemini 3 on Clues by Sam and it does really well; I have not seen it make a single mistake, even on the Hard and Tricky ones. I haven't run it on too many puzzles, though.


crapple8430 185 days ago

GPT 5 Pro is a good 10x more expensive, so it's an apples-to-oranges comparison.
