| HN Mirror


by afro88 351 days ago

I have a real-world problem that I gave to o1 when it came out, and it got it quite wrong. It's a scheduling problem with 4 different constraints that vary each day, plus success criteria that need to be fulfilled over the whole week.

GPT-5 Thinking (Think Longer) and Opus 4.1 Extended Thinking both get it right.

Maybe this unique problem is somehow a part of synthetic training data? Or maybe it's not and the paper is wrong? Either way, we have models that are much more capable at solving unique problems today.

1 comment

sachin_rcz 351 days ago

Models today also have access to certain tooling, or have been reinforced to use that tooling in complicated situations. For example, questions about counting the letters in a word are now answered by running Python code in the background.
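To illustrate the letter-counting case, here is a minimal sketch of the kind of snippet a model might run behind the scenes. The function name `count_letter` and the case-insensitive matching are my own assumptions for illustration, not anything a specific model is known to emit.

```python
def count_letter(word: str, letter: str) -> int:
    """Count how many times a single letter occurs in a word.

    Hypothetical helper: matching is case-insensitive by assumption.
    """
    return word.lower().count(letter.lower())


# The classic example: how many r's are in "strawberry"?
print(count_letter("strawberry", "r"))  # 3
```

The point is that the counting is done by deterministic string code rather than by the model reasoning over tokens, which is why tool use sidesteps this class of mistake.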
