by paradite
217 days ago
In theory, auto-regressive models should not have a limit on context: the model generates the next token conditioned on all previous tokens. In practice, a context window is chosen when the model is trained, so that at inference time you know how much GPU memory to allocate for a prompt and can reject any prompt that would exceed the memory limit. Of course, performance also degrades as the context gets longer, but I suspect the memory limit is the primary reason we have context window limits.
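To make the memory argument concrete, here is a rough sketch of how the per-prompt allocation could be estimated. It assumes a standard decoder-only transformer that caches one key and one value vector per layer per token (a KV cache), which the comment above does not spell out. The `ModelShape` numbers and the helper names are hypothetical, picked only for illustration, and the estimate ignores weights and activations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelShape:
    """Hypothetical decoder shape; the numbers are illustrative only."""
    n_layers: int = 32
    n_kv_heads: int = 32
    head_dim: int = 128
    bytes_per_value: int = 2  # fp16 / bf16


def kv_cache_bytes(n_tokens: int, shape: ModelShape, batch_size: int = 1) -> int:
    """Bytes needed to cache keys and values for n_tokens of context."""
    # Factor 2: one key vector and one value vector per layer, per KV head.
    per_token = (2 * shape.n_layers * shape.n_kv_heads
                 * shape.head_dim * shape.bytes_per_value)
    return per_token * n_tokens * batch_size


def max_context_tokens(budget_bytes: int, shape: ModelShape) -> int:
    """Largest context length whose KV cache fits in budget_bytes."""
    return budget_bytes // kv_cache_bytes(1, shape)


def accept_prompt(n_prompt_tokens: int, budget_bytes: int, shape: ModelShape) -> bool:
    """Reject a prompt up front if its KV cache would not fit in the budget."""
    return kv_cache_bytes(n_prompt_tokens, shape) <= budget_bytes


if __name__ == "__main__":
    shape = ModelShape()
    gib = 1024 ** 3
    for n in (4096, 32768, 131072):
        print(f"{n:>7} tokens -> {kv_cache_bytes(n, shape) / gib:.1f} GiB")
```

Under these assumptions the cache costs 512 KiB per token, so it grows linearly with context: 2 GiB at 4096 tokens and 16 GiB at 32768. Fixing a maximum context length fixes this worst-case number in advance, which is what lets a server decide whether to accept or reject a prompt before running it.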