| HN Mirror



	by ericskiff 854 days ago
	Has anyone found the context length for these models yet? So far I haven't seen it mentioned in their write-up or the model card.

2 comments

minimaxir 854 days ago

For posterity, an easy way to find the context length of an LLM hosted on Hugging Face is to look at max_position_embeddings in its config.json, which here shows the 8192 mentioned in another comment (although in this case you need to sign the agreement first).
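A minimal sketch of that lookup, assuming you already have the contents of config.json as text. The helper name `context_length` and the sample JSON are made up for illustration; only the `max_position_embeddings` key comes from the comment above.

```python
import json

def context_length(config_text):
    """Return max_position_embeddings from config.json text, or None if the key is absent."""
    config = json.loads(config_text)
    return config.get("max_position_embeddings")

# Hypothetical config contents, not copied from any real model repository.
sample = '{"model_type": "example", "max_position_embeddings": 8192}'
print(context_length(sample))
```

For a gated model you would still need to accept the agreement before you can download the file at all.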


brucethemoose2 854 days ago

There are some exceptions, like Mistral 0.1, which is technically 32K according to the config but practically 8K because the sliding window is awful, and InternLM, which (at least initially) used auto RoPE scaling to extend the context as part of the model's architecture.
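A rough sketch of how one might flag those exceptions, assuming the config exposes `sliding_window` and `rope_scaling` keys next to `max_position_embeddings`. Those key names are an assumption about the particular model's config, the helper names are hypothetical, and the sample values are illustrative only.

```python
import json

def context_hints(config_text):
    """Collect the config fields that bear on usable context length."""
    config = json.loads(config_text)
    return {
        "max_position_embeddings": config.get("max_position_embeddings"),
        # Assumed key names; a given model's config may use different ones.
        "sliding_window": config.get("sliding_window"),
        "rope_scaling": config.get("rope_scaling"),
    }

def needs_manual_check(hints):
    """True when the nominal limit alone may not reflect the practical context."""
    return hints["sliding_window"] is not None or hints["rope_scaling"] is not None

# Illustrative values, not taken from a real repository.
sample_config = '{"max_position_embeddings": 32768, "sliding_window": 4096}'
hints = context_hints(sample_config)
print(hints, needs_manual_check(hints))
```

This only tells you when to be suspicious of the nominal number; the practical limit still has to be found by testing or from the model's documentation.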


minimaxir 854 days ago

Yes, RoPE has thrown a wrench into things a bit.


kathleenfromgdm 854 days ago

The context length for these models is 8192 tokens.
