Hacker News
by
dezmou
177 days ago
I love it, just purchased a pack. I've also found that it's a great tool for testing LLMs: take a screenshot of a half-solved game, feed it to ChatGPT along with the rules, and ask it to select the next target.
tikotus
177 days ago
Thank you so much! Also, you might find this interesting regarding testing LLMs:
https://www.nicksypteras.com/blog/cbs-benchmark.html
dezmou
177 days ago
Turns out Claude Sonnet 4.5 is far better at solving those than ChatGPT 5.2.