by YeGoblynQueenne
1436 days ago
>> Even if it explains good performance on the test data, it does not explain the typically good performance on real world data never seen during the training process.

You'll have to clarify this because I'm not sure what you mean by "real world data". Do you mean e.g. data that is made available after a machine learning system is deployed "live"? As far as I can tell, nobody really does this kind of "extrinsic" evaluation systematically, first of all because it is very expensive: such "real world data" is unlabelled, and must be labelled before the evaluation. What's more, the "real world data" is very likely to change between deployments of a machine learning system so any evaluation of a model trained last month may not be valid this month.

So this is all basically very expensive in terms of both money and effort (so, money), and so nobody does it. Instead everyone relies on the approximation of real-world performance on their already labelled datasets.
|
It's widely recognized that image recognition models typically also perform well on such data. We don't need to quantify that exactly to conclude that many large (in terms of parameters) models generalize quite well to data that is in neither the training set nor the test set.
Provided that the model space is large enough to contain both models that generalize well and models that don't (while still fitting the training data), some explanation of why we tend to find generalizing models is required.
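
To make that last point concrete, here is a toy sketch (my own illustration, not taken from any paper or real system). It assumes the true rule is y = 2x and compares two hypothetical models, `linear_model` and `lookup_model`, that both fit the training data exactly; only one of them carries over to inputs it has never seen.

```python
# Toy setup (assumed for illustration): the true rule is y = 2 * x.
train = [(x, 2 * x) for x in range(5)]       # inputs 0..4, seen during training
unseen = [(x, 2 * x) for x in range(5, 10)]  # inputs 5..9, never seen

def linear_model(x):
    # Hypothetical model that happened to capture the underlying rule.
    return 2 * x

memory = dict(train)

def lookup_model(x):
    # Hypothetical model that memorized the training pairs
    # and falls back to 0 for any other input.
    return memory.get(x, 0)

def mean_abs_error(model, data):
    # Average absolute difference between prediction and target.
    return sum(abs(model(x) - y) for x, y in data) / len(data)

for name, model in [("linear", linear_model), ("lookup", lookup_model)]:
    print(name, mean_abs_error(model, train), mean_abs_error(model, unseen))
```

Both models have zero training error, so the training data alone cannot tell them apart. The open question is why, in practice, training tends to land on something closer to the first kind than the second.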