by berkes
461 days ago
> Remember, LLMs are just statistical sentence completion machines. So telling it what to respond with will increase the likelihood of that happening, even if there are other options that are viable.

Obviously. When I say "tuned" I don't mean adding stuff to a prompt. I mean tuning in the way models are also tuned to be more or less professional, or tuned to defer certain tasks to other models (e.g. counting or math, something statistical models are almost unable to do), and so on.

I am almost certain that the chain of models we use on chatgpt.com is "tuned" to always give an answer, and not to answer with "I am just a model, I don't have information on this". Early models and early toolchains did this far more often, but today they are quite probably tuned to "always be of service".

"Quite probably" because I have no proof, other than that it will gladly hallucinate, invent URLs and references, etc. And knowing that all the GPT competitors are battling for users, their products are quite certainly tuned to help in this battle, e.g. to appear helpful and all-knowing rather than factually correct and therefore often admittedly ignorant.
|
The root problem is that training models to admit uncertainty about their answers results in lower benchmark scores in every area except hallucinations. It's like taking a multiple choice test and, instead of picking whichever of answers A-D you think makes the most sense, picking E, "I don't know". Helpful for the test grader, but a bad bet for a model trying to claim it gets more answers right than other models.
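
To put rough numbers on the multiple choice analogy, here is a minimal sketch. It assumes a benchmark that only counts correct answers, four options per question, and a model that really knows 60% of the answers. All of those figures and the `expected_score` helper are made up for illustration, not taken from any real benchmark.

```python
def expected_score(p_known: float, n_choices: int, abstain: bool) -> float:
    """Expected fraction of questions marked correct under accuracy-only grading.

    p_known is the (hypothetical) fraction of questions the model actually knows.
    On the remaining questions it either guesses uniformly among the options
    or answers "I don't know", which earns no credit.
    """
    p_unknown = 1.0 - p_known
    guess_credit = 0.0 if abstain else 1.0 / n_choices
    return p_known + p_unknown * guess_credit


# Same underlying knowledge, two different policies for unknown questions.
guesser = expected_score(p_known=0.6, n_choices=4, abstain=False)
abstainer = expected_score(p_known=0.6, n_choices=4, abstain=True)

print(f"always guesses:      {guesser:.2f}")
print(f"says 'I don't know': {abstainer:.2f}")
```

Under these assumptions the guessing model scores about 0.70 and the honest one 0.60, even though both know exactly the same amount. The gap only closes if the grader penalizes wrong answers or gives some credit for abstaining.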