The llama models were trained with a context size of 2048. By default llama.cpp limits it to 512, but you can use -c 2048 -n 2048 to get the full context window.
Tokens. Short or common words tend to be one token, while less common words are composed of multiple tokens. For GPT OpenAI gives the rule of thumb that on average you need four tokens to encode three words, and LLaMA should be similar