Seriously. I have never seen this, even once. I had been wondering if it was impossible. If a model can really say “I don’t know” when it doesn’t know, that could change everything. How many pointless, dumb rabbit holes could be avoided?
Honestly, this feels like the biggest gain. I work on things that do a lot of tool calling, and the model hallucinating fake tools is a huge problem. Worse, sometimes the model will hallucinate a response directly without ever generating the tool call.
The new training rewards that suppress hallucinations and tool-skipping hopefully push us in the right direction.
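Until models stop doing this on their own, one thing that helps with the fake-tool case is rejecting any call that names a tool you never registered. A rough sketch, assuming the model emits calls as JSON with `name` and `arguments` fields; the registry and the tool names here are made up for illustration:

```python
import json

# Hypothetical registry: tool name -> required argument names.
KNOWN_TOOLS = {
    "get_weather": {"city"},
    "search_docs": {"query"},
}

def validate_tool_call(raw: str):
    """Return (ok, reason) for a model-emitted tool call given as a JSON string."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        return False, "not valid JSON"
    if not isinstance(call, dict):
        return False, "not a JSON object"

    name = call.get("name")
    # A hallucinated tool is simply a name that is not in the registry.
    if not isinstance(name, str) or name not in KNOWN_TOOLS:
        return False, f"unknown tool: {name!r}"

    args = call.get("arguments", {})
    if not isinstance(args, dict):
        return False, "arguments is not an object"

    missing = KNOWN_TOOLS[name] - set(args)
    if missing:
        return False, f"missing arguments: {sorted(missing)}"
    return True, "ok"
```

This only catches invented tools and malformed calls. The second failure mode, where the model answers without calling anything, needs a separate check, for example flagging answers that should have required a tool result but arrived with no call in the turn.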
I get the "good" result with phi-4 and gemma-3n in a RAG scenario, i.e. they used only the provided context to answer and, when the context lacked the answer, declined to answer instead of hallucinating.
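For anyone who wants to try the same thing, this is roughly the shape of a context-only setup. It is a sketch, not the prompt used above (that was not shared); the wording, the `REFUSAL` string, and both function names are assumptions:

```python
# Hypothetical fixed refusal string the model is asked to emit verbatim.
REFUSAL = "I don't know."

def build_prompt(context: str, question: str) -> str:
    """Build a prompt that restricts the model to the retrieved context."""
    return (
        "Answer using only the context below. "
        f"If the context does not contain the answer, reply exactly: {REFUSAL}\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )

def is_refusal(answer: str) -> bool:
    """Detect the refusal, ignoring case, surrounding whitespace, and a trailing period."""
    return answer.strip().lower().rstrip(".") == REFUSAL.lower().rstrip(".")
```

Asking for one exact refusal string makes it easy to count refusals when you test with questions the context cannot answer.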
GPT4: Collaborating with engineering, sales, marketing, finance, external partners, suppliers and customers to ensure …… etc
GPT5: I don't know.
Upon speaking these words, AI was enlightened.