by elandau25
1771 days ago
I see what you are saying, but in that context you lose what most people intuitively mean by overfitting. Suppose I train a model on a single image, then change one random pixel and run the model on that as the eval set. By your argument this is not overfitting, because the model performs well on the eval set it was built for. My argument is that, compared to models as most people use them, micro-models are low bias and high variance, and thus overfit. That's why I draw a distinction between a Batman model and a Batman micro-model.
The way you use "overfitting" is misleading. In fact, according to the article, the model is fit just right for its purpose. If it were fit any less, given the five pictures, it might not work at all. Your confusion arises because what you are actually changing is the objective and the data-generating process (DGP) in question.
It should be clear to anyone that overfitting and underfitting are conceptually tied to the DGP under consideration. It makes no sense to speak of a model being "generally overfit" (!)
An "intuitive definition" of overfitting that does not take this crucial fact into account will always be problematic.
For instance, training a model to zero error does not imply that it is overfit. If your training set is broad enough and the production environment has exactly the same underlying DGP, then the model is simply fit well. In practice, though, the training data rarely covers everything the latent DGP will produce once the model is deployed, and in that case such a model would be overfit.
However, in this case, the model does not seem to fail on any DGP that corresponds to the task: identifying one type of Batman. It is therefore not overfit.
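To make the DGP point concrete, here is a rough sketch, not anything taken from the article. It assumes a toy 1-D problem with a made-up labelling rule (`label`) and a memorising 1-nearest-neighbour model; every name and number in it is hypothetical. The same zero-training-error model is scored against data drawn from the range it was trained on and against data drawn from a wider range.

```python
import math
import random


def label(x):
    # Hypothetical labelling rule: alternating stripes of width 0.5.
    # Restricted to [0, 1) it looks like a plain threshold at 0.5.
    return math.floor(2 * x) % 2


def sample(n, low, high, rng):
    # Draw n inputs uniformly from [low, high] and label them with the rule above.
    return [(x, label(x)) for x in (rng.uniform(low, high) for _ in range(n))]


def predict(train, x):
    # 1-nearest-neighbour: return the label of the closest memorised point.
    _, nearest_y = min(train, key=lambda p: abs(p[0] - x))
    return nearest_y


def accuracy(train, data):
    # Fraction of points in `data` that the memorising model labels correctly.
    return sum(predict(train, x) == y for x, y in data) / len(data)


rng = random.Random(0)                    # fixed seed so the run is repeatable
train = sample(200, 0.0, 1.0, rng)        # training inputs only cover [0, 1]
same_dgp = sample(2000, 0.0, 1.0, rng)    # deployment matches the training range
wider_dgp = sample(2000, -2.0, 3.0, rng)  # deployment covers a wider range

train_acc = accuracy(train, train)
same_acc = accuracy(train, same_dgp)
wider_acc = accuracy(train, wider_dgp)

print(f"train: {train_acc:.3f}  same DGP: {same_acc:.3f}  wider DGP: {wider_acc:.3f}")
```

Under these assumptions the model memorises its training set perfectly either way. Whether you would call it overfit depends only on which of the two evaluation sets stands in for production, which is the whole argument.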
I am sorry, but OP is right.