Hacker News new | ask | show | jobs
by whbrown 457 days ago
Those sound like the sort of issues which could be caused by your server silently truncating the middle of your prompts.

By default, Ollama uses a context window size of 2048 tokens.

1 comments

I checked this, the whole conversation was about 1000 tokens.

I suspect the Ollama version might have wrong default settings, such as conversation delimiters. The experience of Gemma 3 in AI studio is completely different.