by rockinghigh 353 days ago
The vocabulary size is fairly small (128,256) for a multilingual model. I would guess it doesn't require many additional parameters to support these 5 languages, since many tokens can be shared across them.
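
For a rough sense of scale, here is a back-of-the-envelope sketch of how vocabulary size maps to embedding parameters. The 128,256 figure is from the comment; the hidden size of 4096 and the `embedding_params` helper are assumptions made up for illustration, not anything stated about the model.

```python
def embedding_params(vocab_size: int, d_model: int, tied: bool = True) -> int:
    """Parameter count of the token embedding table.

    If the output projection is not tied to the input embeddings,
    the table is effectively counted twice.
    """
    table = vocab_size * d_model
    return table if tied else 2 * table


VOCAB_SIZE = 128_256  # vocabulary size quoted in the comment
D_MODEL = 4096        # assumption: hypothetical hidden size, not from the thread

total = embedding_params(VOCAB_SIZE, D_MODEL)
per_1k_tokens = embedding_params(1_000, D_MODEL)

print(f"embedding table: {total:,} parameters")
print(f"each extra 1,000 tokens: {per_1k_tokens:,} parameters")
```

Under those assumptions, the cost of covering more languages is only whatever new rows get added to the table, so any tokens shared between languages add nothing.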