| > Besides, overfitting is defined procedurally, not by how well it performs. Huh? That's the opposite of the truth. Compare https://en.wikipedia.org/wiki/Overfitting : > In statistics, overfitting is "the production of an analysis that corresponds too closely or exactly to a particular set of data, and may therefore fail to fit additional data or predict future observations reliably". > The essence of overfitting is to have unknowingly extracted some of the residual variation (i.e. the noise) as if that variation represented underlying model structure. Procedural concerns are not part of the concept. Conceptually, overfitting means including information in your model that isn't relevant to the prediction you're making, but that is helpful, by coincidence, in the data you're fitting the model to. But since that can't be measured, instead, you measure overfitting through performance. |
An overfitted model is a statistical model that contains more parameters than can be justified by the data.
Another definition from https://www.ibm.com/cloud/learn/overfitting#:~:text=Overfitt...:
Overfitting is a concept in data science, which occurs when a statistical model fits exactly against its training data.
We can argue over the precise definition of overfitting, but when you fitting a high-capacity model exactly to the training data, that is a procedural question and I would argue falls under the overfitting umbrella.