Hacker News
by maxloh 1484 days ago
What is the dataset used for training the model? Where did the data come from?
1 comment

All of the datasets are freely available, and most of them can be downloaded through mtdata [1]. The exact list of datasets is in the firefox-translations-training pipeline configuration file [2].

[1] https://pypi.org/project/mtdata/

[2] https://github.com/mozilla/firefox-translations-training/blo...