by otabdeveloper4
68 days ago
LLMs are next-token predictors. Outputting tokens is what they do, and their natural steady state is an endless stream of generated tokens. To get them to stop the way a human would, you have to train them to emit a special "stop token" (explicitly, in post-training) or fall back on system-prompt hacks. Neither is a general solution to the problem, and likely there never will be one.