Hacker News
by
puppystench
70 days ago
I believe you're right: the issue is the model misinterpreting text that sounds like a user message as an actual user message. It's a known phenomenon:
https://arxiv.org/abs/2603.12277