by XenophileJKO 317 days ago
In my experience, Gemini is REALLY bad about context blending. It can't keep track of what I said and what it said, even in a conversation under 200K tokens. It blends concepts and statements together, then refers to some fabricated hybrid fact or comment.

Gemini has done this in ways that I haven't seen in recent or current-generation models from OpenAI or Anthropic.

It really surprised me that Gemini performs so well in multi-turn benchmarks, given that tendency.

1 comment

I haven’t tried this with the recent models, but older Gemini models were awful at it: they’d lie about what I’d said or what was in their system prompt, even in short conversations.