by stevepike
376 days ago
This seems to show the advantage of reasoning models over interacting directly with a prompted, chat-tuned LLM. If I navigate backwards on your link, Sonnet 4 gets it right. I've used a similar prompt: "How can you make 1000 with exactly nine 8s using only addition?" Here's GPT 4.5 getting it wrong: https://chatgpt.com/share/683f3aca-8fbc-8000-91e4-717f5d81bc... The prompt trips it up because it's a slight variation of an existing puzzle (making 1000 with eight 8s and addition only). The reasoning models seem to figure it out reliably, though. Some of them even come up with a proof that it's impossible with nine 8s. Here's o4 getting it right: https://chatgpt.com/share/683f3bc2-70b8-8000-9675-4d96e72b58...
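For anyone who wants to check the impossibility claim themselves, here is a rough brute-force sketch. It is not taken from either chat transcript; the function name `eight_counts` is made up for illustration, and it assumes the only usable terms are 8, 88 and 888 (8888 already exceeds 1000).

```python
def eight_counts(target: int = 1000) -> dict[int, tuple[int, int, int]]:
    """Map each achievable number of 8s to one (n8, n88, n888) split.

    Assumes the only allowed terms are 8, 88 and 888, which holds for
    target = 1000 because 8888 is already too large.
    """
    found: dict[int, tuple[int, int, int]] = {}
    for n888 in range(target // 888 + 1):
        for n88 in range((target - 888 * n888) // 88 + 1):
            rest = target - 888 * n888 - 88 * n88
            if rest % 8 == 0:  # the remainder must be made of single 8s
                n8 = rest // 8
                digits = n8 + 2 * n88 + 3 * n888  # total 8s written down
                found.setdefault(digits, (n8, n88, n888))
    return found


counts = eight_counts()
print(counts.get(8))   # (3, 1, 1), i.e. 8 + 8 + 8 + 88 + 888
print(9 in counts)     # False
print(sorted(counts))  # every achievable count of 8s
```

The counts that come out all leave remainder 8 when divided by 9, which matches a short modular argument: a term written with k 8s is congruent to 8k mod 9, so a sum using n 8s is congruent to 8n mod 9. Since 1000 is congruent to 1 mod 9, n must be congruent to 8 mod 9, and nine 8s can never work.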