HN Mirror



	by ddren 1188 days ago
	The LLaMA models were trained with a context size of 2048 tokens. By default llama.cpp limits the context to 512, but you can pass -c 2048 -n 2048 to use the full context window (-c sets the context size, -n the number of tokens to generate).

1 comments

worldsayshi 1188 days ago

2048 words?


wongarsu 1188 days ago

Tokens. Short or common words tend to be one token, while less common words are split into multiple tokens. For GPT, OpenAI gives the rule of thumb that on average you need four tokens to encode three words, and LLaMA should be similar.
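
To put rough numbers on that, here is a small sketch that applies the four-tokens-per-three-words rule of thumb to the two context sizes mentioned in this thread. The helper names are made up for illustration, and the ratio is only an average for English text; a real tokenizer will give different counts for any particular input.

```python
# Rule of thumb from the comment above: about 4 tokens per 3 English words.
# This is an average, not something a tokenizer guarantees.

def approx_words(n_tokens: int) -> int:
    """Rough number of words that fit in n_tokens (hypothetical helper)."""
    return round(n_tokens * 3 / 4)

def approx_tokens(n_words: int) -> int:
    """Rough number of tokens needed for n_words (hypothetical helper)."""
    return round(n_words * 4 / 3)

if __name__ == "__main__":
    # 512 is the llama.cpp default mentioned above, 2048 the training context.
    for ctx in (512, 2048):
        print(f"{ctx} tokens is roughly {approx_words(ctx)} words")
```

So the full 2048-token window works out to somewhere around 1,500 words, give or take depending on the text.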


worldsayshi 1188 days ago

Well that's for sure bigger than my context size.


doctoboggan 1188 days ago

2048 "tokens", where one token is roughly equivalent to ¾ of a word.


teaearlgraycold 1188 days ago

Tokens
