| Low entropy is expected here, since the model is seeking a “best” answer based on reward training. But I see the same misconceptions as always around “hallucinations”. Incorrect output is just incorrect output. There is no difference in the function of the model, no malfunction. It is working exactly as it does for “correct” answers. This is what makes the issue of incorrect output intractable. Some optimisation can be achieved through introspection, but ultimately an LLM can be wrong for the same reasons a person can be wrong: incorrect conclusions, bad data, insufficient data, or faulty logic/modeling. If there were a way to be always right, we wouldn’t need LLMs or second opinions. Agentic workflows and introspection/CoT catch a lot, and flights of fancy are often not supported or replicated when the context is modified, because the fanciful answer isn’t reinforced in the training data. But we need to get rid of “hallucination”, the unfortunate term for wrong conclusions. When we say a person is hallucinating, it implies an altered state of mind. We don’t say that Bob is hallucinating when he thinks that the sky is blue because it reflects the ocean; we just know he’s wrong because he doesn’t know about, or forgot about, Rayleigh scattering. Using the term “hallucination” distracts from accurate thought and misleads people into drawing erroneous conclusions. |