Hacker News new | ask | show | jobs
by Vecr 1172 days ago
That might not be true. OpenAI do set a limit of the total number of tokens, and since I'm pretty sure they trained the model and the tokenizer on mostly English text, I assume there's a somewhat proportional bias toward English based on the input dataset to those models.