by derefr
725 days ago
> But perhaps the simplest explanation is that an LLM doesn't recognize what constitutes a correct answer but is compelled to provide one
Why is it compelled to provide one, anyway?
Which is to say, why is the output of each model layer a raw softmax, thus discarding knowledge of the confidence each layer of the model had in its output? Why not instead have the output of each layer be e.g. softmax but rescaled by min(max(pre-softmax vector), 1.0)? That way, layers that would output higher than 1.0 just get softmax'ed normally, but layers that would output all "low-confidence" results (a vector all lower than 1.0) preserve the low confidence in the output, allowing later decoder layers to use that info to build I-refuse-to-answer-because-I-don't-know text.
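To make the rescaling idea concrete, here is a minimal sketch in plain Python. The function names are made up for illustration, and the lower clamp at 0 is my own assumption (the formula as written would give a negative scale when every pre-softmax value is negative). It is not how any real model works, just the arithmetic of the proposal.

```python
import math

def softmax(logits):
    # Standard softmax, shifted by the max for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def confidence_scaled_softmax(logits):
    # Proposed rescaling: multiply the softmax output by
    # min(max(pre-softmax vector), 1.0).
    # Assumption: also clamp the scale at 0 so it never goes negative.
    scale = max(0.0, min(max(logits), 1.0))
    return [scale * p for p in softmax(logits)]

high = confidence_scaled_softmax([2.0, 0.0])  # max >= 1.0: plain softmax
low = confidence_scaled_softmax([0.5, 0.1])   # max < 1.0: output shrunk
```

With these inputs, `high` still sums to 1 while `low` sums to 0.5, so a later stage could in principle read the total mass as a confidence signal. One caveat: softmax gives the same result if you add a constant to every input, so whether the raw maximum means anything as "confidence" depends on how the layer was trained.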
1. An LLM's mathematical "confidence" of having a clear best-scoring candidate for the predicted next token when given a list of tokens.
2. A not-yet-invented AI that models the idea of different entities interacting, the concept of questions and answers, the concept of logical conflicts, and its "confidence" that a proposition is compatible with other "true" propositions and incompatible with false ones.
To help illustrate the difference, suppose you trained an LLM on texts where a particular question was always answered with "I don't know, I have zero confidence in anything anymore." Later the LLM will regurgitate similarly nihilistic text, and by all objective internal measures it will be extremely "confident" as it does so.
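A toy version of that thought experiment, as a rough sketch: a bigram counter standing in for the LLM (a big simplification, and the corpus, tokenization, and function names are all invented for illustration). Trained only on the "zero confidence" answer, its next-token distribution is as peaked as it can possibly be.

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    # Count how often each token follows each previous token.
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def next_token_distribution(counts, prev):
    # Normalize the follower counts into probabilities.
    total = sum(counts[prev].values())
    return {tok: c / total for tok, c in counts[prev].items()}

# Hypothetical training text: the same question, always answered the same way.
sentence = "Q: why ? A: I don't know , I have zero confidence in anything anymore . "
corpus = (sentence * 50).split()

model = train_bigram(corpus)
dist = next_token_distribution(model, "zero")
top_token, top_prob = max(dist.items(), key=lambda kv: kv[1])
```

Here `top_token` is `"confidence"` and `top_prob` is 1.0: the model is maximally "confident" in sense 1 while the text it produces professes to have no confidence at all.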
> Why is it compelled to provide one, anyway
It's following the patterns in its training data, which probably reflects a whole lot more people trying to provide answers (sometimes even deliberately wrong ones) as opposed to admitting uncertainty.
This is especially true if developers put their thumb on the scale by injecting primer-text like "You are an intelligent computer eager to provide answers", as opposed to "behave like Socrates and help people understand that nothing is truly knowable."