by rockinghigh 353 days ago
The vocabulary size (128,256) is fairly small for a multilingual model. I would guess it doesn't take many additional parameters to support these 5 languages, since many tokens can be shared across them.
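
To put a rough number on that guess: the vocabulary-dependent part of a transformer's parameter count is the embedding table, which is vocabulary size times hidden dimension (doubled if the output projection is not tied to the input embeddings). A back-of-the-envelope sketch follows; the hidden sizes are hypothetical, picked only for illustration, and `embedding_params` is a made-up helper rather than anything from the model's code.

```python
def embedding_params(vocab_size: int, hidden_dim: int, tied: bool = True) -> int:
    """Count parameters in the token embedding table.

    If the output projection is not tied to the input embeddings,
    a second table of the same shape is counted as well.
    """
    table = vocab_size * hidden_dim
    return table if tied else 2 * table


VOCAB_SIZE = 128_256  # vocabulary size quoted in the comment

# Hypothetical hidden dimensions, not taken from the thread.
for hidden_dim in (2048, 4096):
    tied = embedding_params(VOCAB_SIZE, hidden_dim, tied=True)
    untied = embedding_params(VOCAB_SIZE, hidden_dim, tied=False)
    print(f"hidden_dim={hidden_dim}: tied={tied:,} untied={untied:,}")
```

Since this count depends only on the vocabulary size and the hidden dimension, languages that mostly reuse existing tokens add nothing to it; only genuinely new tokens would grow the table, by `hidden_dim` parameters each (or twice that if untied).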