Hacker News new | ask | show | jobs
by thot_experiment 723 days ago
Thanks! If you have a moment can you give me a quick explainer on what happens when you hit the context limit in ollama? I had assumed that ollama would just trunc the context to whatever is set in the model, but I guess this isn't the case?
1 comments

Currently when the context limit is hit, there's a halving of the context window (or a "context shift") to allow inference to continue – this is helpful for smaller (e.g. 1-2k) context windows.

However, not all models (especially newer ones) respond well to this, which makes sense. We're working on changing the behavior in Ollama's API to be more similar to OpenAI, Anthropic and similar APIs so that when the context limit is hit, the API returns a "limit" finish/done reason. Hope this is helpful!