by data-ottawa
345 days ago
If you shrink the context window on most models, you'll get this type of behaviour. If you go too small, you end up with basically gibberish, even on modern models like Gemini 2.5. Mercury has a 32k context window according to the paper, which could be why it does that.

Even though this failure has gotten drastically rarer and less severe, I think it's going to be one of the failure modes that's just fundamental to the technology.