HN Mirror

	by bitRAKE 1218 days ago
	The 1.4T tokens are the amount of text the model was trained on, not the token range of the embedding (i.e. the number of unique tokens in the vocabulary).

1 comment

dentalperson 1218 days ago

Ah, that makes more sense, thank you. Since this was mentioned in the tokenizer section, and the number of unique tokens wasn't given, I misunderstood.
