by radarsat1
611 days ago
Agreed. I have a loose idea that hallucination is related to training the model to maximize the probability of each individual token while ignoring the joint probability of the whole sequence, which is along the lines of what you are saying. The model is not trained to output the most probable final sequence, so it gets stuck in the "wrong place" halfway through.
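
To make the "stuck in the wrong place" intuition concrete, here is a minimal toy sketch of the decoding side of it: picking the most probable token at each step can commit to a prefix whose best completion has lower joint probability than another sequence. Everything here is an assumption for illustration (the `COND` table, the token names, and the helper functions are made up), and it says nothing about how any real model is trained.

```python
# Hypothetical toy "model": next-token distribution conditioned on the prefix.
# All probabilities are invented for illustration.
COND = {
    (): {"A": 0.6, "B": 0.4},
    ("A",): {"x": 0.55, "y": 0.45},
    ("B",): {"x": 0.9, "y": 0.1},
}


def joint_prob(seq):
    """Joint probability of a sequence via the chain rule."""
    p = 1.0
    for i, tok in enumerate(seq):
        p *= COND[tuple(seq[:i])][tok]
    return p


def greedy_decode(length=2):
    """Pick the single most probable token at every step."""
    seq = []
    for _ in range(length):
        dist = COND[tuple(seq)]
        seq.append(max(dist, key=dist.get))
    return tuple(seq)


def best_sequence():
    """Exhaustively find the two-token sequence with the highest joint probability."""
    candidates = [(a, b) for a in COND[()] for b in COND[(a,)]]
    return max(candidates, key=joint_prob)


if __name__ == "__main__":
    g = greedy_decode()
    b = best_sequence()
    print("greedy:", g, joint_prob(g))
    print("best:  ", b, joint_prob(b))
```

In this toy table the greedy path takes "A" first because it is locally more likely, and after that no continuation can catch up with the sequence that starts with "B".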