Hacker News
by wongarsu 1188 days ago
Tokens. Short or common words tend to be a single token, while less common words are split into multiple tokens. For GPT, OpenAI gives the rule of thumb that on average you need about four tokens to encode three words, and LLaMA should be similar.
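
To put a number on that, here is a rough sketch that turns the four-tokens-per-three-words rule of thumb into an estimate. It only counts whitespace-separated words and does not run a real tokenizer, so treat the result as a ballpark figure for English text. The function names `estimate_tokens` and `fits_in_context` are made up for illustration.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: about 4 tokens per 3 words (rule of thumb, not a tokenizer)."""
    words = len(text.split())
    # Integer ceiling of words * 4 / 3, avoiding floating-point rounding.
    return (4 * words + 2) // 3


def fits_in_context(text: str, context_size: int) -> bool:
    """Check whether the estimated token count fits a context window of `context_size` tokens."""
    return estimate_tokens(text) <= context_size


if __name__ == "__main__":
    sample = "Short or common words tend to be a single token"
    print(estimate_tokens(sample))
    print(fits_in_context(sample, context_size=2048))
```

For anything close to the limit, counting with the model's actual tokenizer is the safer option, since code, rare words, and non-English text usually need more tokens per word than this estimate assumes.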
1 comment

Well that's for sure bigger than my context size.