Hacker News
by
bitRAKE
1218 days ago
The 1.4T tokens figure is how much text the model was trained on, not the number of distinct tokens the embedding covers (the vocabulary size).
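To make the distinction concrete, here is a minimal sketch. The whitespace "tokenizer", the toy corpus, and the `token_stats` helper are all made up for illustration and have nothing to do with the model's actual tokenizer. The first number is the kind of quantity a "1.4T tokens" figure describes; the second is what sets the number of rows in the embedding table.

```python
from collections import Counter


def token_stats(corpus):
    """Return (total tokens seen, number of distinct tokens) for a list of texts."""
    counts = Counter()
    for text in corpus:
        # Toy whitespace tokenizer, an assumption for this example only.
        counts.update(text.split())
    total_tokens = sum(counts.values())  # grows with the amount of training text
    vocab_size = len(counts)             # distinct tokens, fixed by the tokenizer
    return total_tokens, vocab_size


corpus = ["the cat sat on the mat", "the dog sat on the log"]
total_tokens, vocab_size = token_stats(corpus)
print(total_tokens, vocab_size)  # 12 7
```

Adding more text to the corpus keeps increasing the first number, while the second levels off once every token in the vocabulary has been seen.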
dentalperson
1218 days ago
Ah, that makes more sense, thank you. Since the figure was mentioned in the tokenizer section and the number of unique tokens wasn't, I misunderstood.