by om8, 361 days ago
> 50 tokens is not really very much

Yes! Also, Llama 3.1's tokens are different from Qwen's and Llama 1's tokens. Llama 3 was the first model family where Meta started to use a very large vocab_size.
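To make the point concrete, here is a toy sketch of why the same token budget covers different amounts of text under different vocabularies. Everything in it is an assumption for illustration: `greedy_tokenize` is a hypothetical longest-match splitter, not how Llama or Qwen tokenizers actually work (those use learned BPE merges), and both vocabularies are made up.

```python
def greedy_tokenize(text, vocab):
    """Toy tokenizer: greedy longest match against vocab, single characters as fallback."""
    max_len = max(len(t) for t in vocab)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until something matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical vocabularies: the larger one adds longer merged pieces.
SMALL_VOCAB = {"to", "ken", "iz", "er", " "}
LARGE_VOCAB = SMALL_VOCAB | {"token", "tokenizer", " tokenizer"}

text = "tokenizer tokenizer"
small = greedy_tokenize(text, SMALL_VOCAB)
large = greedy_tokenize(text, LARGE_VOCAB)

print(len(small), small)
print(len(large), large)
```

With the larger vocabulary the same string takes fewer tokens, so "50 tokens" covers more text. Real token counts should be measured with each model's own tokenizer.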