by authorfly 609 days ago
Yes, but you can control that.

You can use SetFit, fewer examples, an SVM, etc., depending on how much separation, recall, and other aspects matter to you for the task at hand.
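To make the "you can control that" point concrete, here's a rough sketch of the SVM option. Everything in it is an assumption for illustration: the 2-D points are made-up stand-ins for embeddings, `train_linear_svm` is a hypothetical hand-rolled helper (plain subgradient descent on hinge loss, not SetFit or any library's SVM), and the `lam` values are arbitrary. The knob is the regularization strength: the larger it is, the less tightly the classifier fits the training set.

```python
# Made-up 2-D points standing in for sentence embeddings (illustrative only).
X = [(2.0, 1.0), (1.5, 2.0), (3.0, 2.5), (-1.0, -1.5), (-2.0, -0.5), (-1.5, -2.5)]
y = [1, 1, 1, -1, -1, -1]


def train_linear_svm(X, y, lam, epochs=500, lr=0.05):
    """Hypothetical helper: full-batch subgradient descent on
    L2-regularized hinge loss (a linear soft-margin SVM)."""
    w = [0.0, 0.0]
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        # Gradient of the regularizer 0.5 * lam * ||w||^2.
        gw = [lam * w[0], lam * w[1]]
        gb = 0.0
        for (x1, x2), label in zip(X, y):
            margin = label * (w[0] * x1 + w[1] * x2 + b)
            if margin < 1.0:
                # Only points inside the margin contribute to the hinge term.
                gw[0] -= label * x1 / n
                gw[1] -= label * x2 / n
                gb -= label / n
        w = [w[0] - lr * gw[0], w[1] - lr * gw[1]]
        b -= lr * gb
    return w, b


def predict(w, b, x):
    return 1 if w[0] * x[0] + w[1] * x[1] + b >= 0 else -1


def norm(w):
    return (w[0] ** 2 + w[1] ** 2) ** 0.5


# Weak regularization: fits the training points closely.
weak_w, weak_b = train_linear_svm(X, y, lam=0.01)
# Strong regularization: smaller weights, looser fit to this dataset.
strong_w, strong_b = train_linear_svm(X, y, lam=10.0)

print("weak reg   |w| =", norm(weak_w))
print("strong reg |w| =", norm(strong_w))
```

Both settings still separate this toy data; what changes is how hard the boundary is pushed to satisfy every training point, which is the trade-off you'd tune against recall and separation on a real task.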

Sensitivity to bias in the dataset is a choice of training method, not a fixed attribute.

These days it's just not a major issue unless you fine-tune on an entirely new or unseen language.