by ddren 1188 days ago
The llama models were trained with a context size of 2048. By default llama.cpp limits it to 512, but you can use -c 2048 -n 2048 to get the full context window.

2048 words?
Tokens. Short or common words tend to be a single token, while less common words are split into several tokens. For GPT, OpenAI gives the rule of thumb that on average you need four tokens to encode three words, and LLaMA should be similar.
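To put rough numbers on that rule of thumb, here is a minimal sketch of the arithmetic. It assumes the 4-tokens-per-3-words ratio holds, which is only an average for English text; the function names are made up for illustration, and a real count would need the model's actual tokenizer.

```python
# Rule of thumb from the comment above: about 4 tokens per 3 words.
# This is an average for English text, not an exact tokenizer count.
TOKENS = 4
WORDS = 3

def estimate_words(tokens: int) -> int:
    """Approximate how many words fit in a given number of tokens (rounded down)."""
    return tokens * WORDS // TOKENS

def estimate_tokens(words: int) -> int:
    """Approximate how many tokens a given number of words needs (rounded up)."""
    return -(-words * TOKENS // WORDS)

if __name__ == "__main__":
    for ctx in (512, 2048):
        print(f"{ctx} tokens is roughly {estimate_words(ctx)} words")
```

So under this estimate the full 2048-token context holds about 1536 words, and the 512-token default about 384.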
Well that's for sure bigger than my context size.
2048 "tokens", where one token is roughly equivalent to ¾ of a word
Tokens