by belval 1249 days ago
It's likely due to the corpus though. It's multilingual, but the dataset they trained on is representative of "the Internet", so languages written in the Latin script (English, French, Spanish, Italian, German, etc.) are overrepresented.