by yorwba 430 days ago
They train on 14 billion tokens in Slovene. Are you sure that's not enough?
1 comment

Unfortunately, yes.

We need more tokens, a wider variety of topics in the texts, and more complexity.

We need one-shot learning.

(That amount is equivalent to 50,000 books, which few Slovenes will ever have read.)
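
For a rough sanity check of that comparison, here is a minimal sketch of the arithmetic. The two inputs are the figures quoted in this thread; the per-book token count is only what those two numbers imply, not something stated by either commenter, and the function name is made up for illustration.

```python
# Figures quoted in the thread (not independently verified).
CORPUS_TOKENS = 14_000_000_000  # "14 billion tokens in Slovene"
BOOKS = 50_000                  # "equivalent to 50000 books"


def tokens_per_book(corpus_tokens: int, books: int) -> float:
    """Average tokens per book implied by the two figures (hypothetical helper)."""
    return corpus_tokens / books


implied = tokens_per_book(CORPUS_TOKENS, BOOKS)
print(f"{implied:,.0f} tokens per book")  # prints "280,000 tokens per book"
```

So the comparison assumes books of roughly 280,000 tokens each; with a shorter assumed book length, the same corpus would correspond to more books.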