This is not really true. If you give a decent model docs in the prompt and tell it to answer based on the docs and say “I don’t know” if the answer isn’t there, it does (most of the time).
$ rgd ~/repos/jj/docs "how can I write a revset to select the nearest bookmark?"
Using full corpus (length: 400,724 < 500,000)
# Answer
gemini-2.5-flash | $0.03243 | 2.94 s | Tokens: 107643 -> 56
The provided documentation does not include a direct method to select the
nearest bookmark using revset syntax. You may be able to achieve this using
a combination of ancestors(), descendants(), and latest(), but the
documentation does not explicitly detail such a method.
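For anyone who wants to try the same thing without a dedicated tool, here is a minimal sketch of that kind of prompt. It is not how rgd builds its prompt (I haven't shown that here); build_grounded_prompt and the exact wording are hypothetical, and you would still need to send the resulting string to whatever model API you use.

```python
def build_grounded_prompt(docs: str, question: str) -> str:
    """Wrap docs and a question in instructions that explicitly allow a refusal.

    Hypothetical helper for illustration; the instruction wording is an assumption.
    """
    return (
        "Answer the question using only the documentation below.\n"
        "If the documentation does not contain the answer, "
        "reply exactly: I don't know.\n\n"
        f"<docs>\n{docs}\n</docs>\n\n"
        f"Question: {question}\n"
    )


# Example: the docs string would normally be the concatenated corpus.
prompt = build_grounded_prompt(
    docs="(contents of the docs directory go here)",
    question="how can I write a revset to select the nearest bookmark?",
)
```

The point is only that the refusal is spelled out as an allowed answer and the docs are the stated source of truth; whether a given model honors that is exactly what the rest of this thread is arguing about.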
I need a big ol' citation for this claim, bud, because it's an extraordinary one. LLMs have no concept of truth or theory of mind so any time one tells you "I don't know" all it tells you is that the source document had similar questions with the answer "I don't know" already in the training data.
If the training data is full of certain statements, you'll get certain-sounding statements coming out of the model too, even for things that are only similar, and even for answers that are total bullshit.
Ok, how? The other day Opus spent 35 of my dollars by throwing itself again and again at a problem it couldn't solve. How can I get it to instead say "I can't solve this, sorry, I give up"?
That sounds slightly different from "here is a question, say I don't know if you don't know the answer" - sounds to me like that was Opus running in a loop, presumably via Claude Code?
I did have one problem (involving SQLite triggers) that I bounced off various LLMs for genuinely a full year before finally getting to an understanding that it wasn't solvable! https://github.com/simonw/sqlite-chronicle/issues/7
It wasn't in a loop really, it was more "I have this issue" "OK I know exactly why, wait" $3 later "it's still there" "OK I know exactly why, it's a different reason, wait", repeat until $35 is gone and I quit.
I would have much appreciated it if it could throw its hands up and say it doesn't know.
I was benchmarking some models the other day via OpenRouter and I got the distinct impression some of these models treat the thinking token budget as a target rather than a maximum.
I solve this in my prompt. I say: if you can’t fix it in two tries, look online for how to do it; if you still can’t fix it after two tries, pause and ask for my help. It works pretty well.
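The comment above does this purely with prompt wording. If you drive the model from your own script, the same budget could also be enforced outside the model. A rough sketch, assuming a hypothetical attempt_fix callback that runs one fix attempt (with or without web search) and reports whether the problem went away:

```python
from typing import Callable


def run_with_escalation(
    attempt_fix: Callable[[bool], bool],
    tries_per_stage: int = 2,
) -> str:
    """Try a fix a bounded number of times, then escalate.

    attempt_fix(use_search) is a hypothetical callback: it runs one attempt
    and returns True if the problem is fixed. Stage 1 runs without search,
    stage 2 with search; after both stages fail we stop and ask the human.
    """
    for use_search in (False, True):
        for _ in range(tries_per_stage):
            if attempt_fix(use_search):
                return "fixed_with_search" if use_search else "fixed"
    # Budget exhausted: stop spending and hand control back.
    return "ask_human"
```

With the default of two tries per stage this caps the run at four attempts before it stops, which is the part a prompt alone can't guarantee.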
This is doing some heavy lifting