by spencerchubb 613 days ago
it's not "just" model error

during pre-training, there is never an incentive for the model to say "I don't know" because it would be penalized. the model is incentivized to make an educated guess
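
To make the incentive concrete, here is a toy sketch of the next-token loss, assuming plain cross-entropy against the token that actually follows in the training text. The prompt, the tokens, and the probabilities are all made up for illustration; they are not from any real model.

```python
import math

def cross_entropy(predicted, target):
    """Negative log-probability the model gave to the token that actually came next."""
    return -math.log(predicted[target])

# Hypothetical next-token distributions after "The capital of Australia is".
# "I" stands for the first token of "I don't know".
hedging = {"Canberra": 0.05, "Sydney": 0.05, "I": 0.90}
guessing = {"Canberra": 0.60, "Sydney": 0.35, "I": 0.05}

target = "Canberra"  # what the (made-up) training text says next

loss_hedging = cross_entropy(hedging, target)
loss_guessing = cross_entropy(guessing, target)

print(f"hedging:  {loss_hedging:.2f}")
print(f"guessing: {loss_guessing:.2f}")
```

The hedging distribution gets the larger loss, because the training text almost never continues with "I don't know". In this toy setup, putting mass on plausible answers is what the gradient rewards.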

large transformer models are really good at approximating their dataset. there is no data on the internet about what LLMs know. and even if there were such data, it would probably become obsolete soon

that being said, maybe a big shift in the architecture could solve this. I hope!


> it would probably become obsolete soon

Suppose there are many times more posts about something one generation of LLMs can't do (arithmetic, tic-tac-toe, whatever) than posts about how the next generation of models can do that task successfully. I think this is probably the case.

While I doubt it will happen, it would be somewhat funny if training on that text caused a future model to claim it can't do something that it "should" be able to because it internalized that it was an LLM and "LLMs can't do X."

also presumes that the LLM knows it is an LLM
System prompts sometimes contain the information that "it" is an LLM.

Maybe in the future, those prompts will include motivational phrases, like "You can do it!" or "Believe in yourself, then you can achieve anything."

They're generally fine-tuned not to. I'm not sure how long that will hold though.
- Are you an LLM?

- As a Large Language Model, I am fine-tuned to be unable to answer this question.

in another paper that popped up recently, they approximated uncertainty with the entropy of the model's output distribution and inserted "wait!" tokens whenever the entropy was high, simulating chain of thought within the system.
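
A rough sketch of that idea, not the paper's actual method: compute the Shannon entropy of each next-token distribution and emit a "wait!" marker when it crosses a threshold. The function names, the 1-bit threshold, and the hard-coded distributions are all assumptions for illustration. In a real system the marker would be fed back into the model so the following tokens are conditioned on it.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def decode_with_wait(steps, threshold=1.0, wait_token="wait!"):
    """Greedy-decode precomputed steps, inserting wait_token before uncertain ones.

    steps: list of (tokens, probs) pairs, one per position (hypothetical data).
    threshold: entropy in bits above which a step counts as uncertain.
    """
    out = []
    for tokens, probs in steps:
        if entropy(probs) > threshold:
            out.append(wait_token)
        best = max(range(len(probs)), key=probs.__getitem__)
        out.append(tokens[best])
    return out

# Two made-up steps: one confident, one uniform over four options.
steps = [
    (["2", "3"], [0.98, 0.02]),
    (["+", "-", "*", "/"], [0.25, 0.25, 0.25, 0.25]),
]

result = decode_with_wait(steps)
print(result)  # ['2', 'wait!', '+']
```

The first step has about 0.14 bits of entropy and passes through; the second has exactly 2 bits, so the marker is inserted before it.
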
> during pre-training, there is never an incentive for the model to say "I don't know" because it would be penalized. the model is incentivized to make an educated guess

The guess can be "I don't know". The base LLM would generally only say "I don't know" if it "knew" that it didn't know, which is not going to be very common. The fine-tuned LLM would be the stage responsible for trying to map a lack of understanding to saying "I don't know".