by stingraycharles 337 days ago
Right, maybe my definition of overfitting was wrong. I always understood it as optimizing for a specific benchmark or use case, after which the model starts failing in other areas.

But the way you phrase it, it’s just “the model is not able to generalize properly”, i.e. it doesn’t understand the concept of silence, which also makes sense.

But couldn’t you then argue that any type of mistake or unknown could be explained as “overfitting”? Where do you draw the line?

I don't think so. Overfitting = the model was fit too closely to the training data and can't generalize to *unseen* data. I think it saw "silence" before, so it's not overfitting but just garbage in, garbage out.
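
To make that definition concrete, here is a minimal sketch of the usual symptom: near-zero error on the training data and noticeably worse error on unseen data. Everything in it is an assumption made for illustration (the toy ground truth y = 2x + 1, the noise level, the names `memoriser` and `line`); it is not the model discussed in this thread.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
N = 200

def noisy_line(xs):
    # Hypothetical ground truth y = 2x + 1 plus Gaussian noise (std 1).
    return [(x, 2 * x + 1 + rng.gauss(0, 1)) for x in xs]

train = noisy_line([float(i) for i in range(N)])
unseen = noisy_line([i + 0.5 for i in range(N)])  # x values never seen in training

def memoriser(x):
    # Model A: return the label of the closest training point.
    # It reproduces the training set exactly, noise included.
    return min(train, key=lambda p: abs(p[0] - x))[1]

# Model B: ordinary least-squares straight line, fitted in closed form.
mean_x = sum(x for x, _ in train) / N
mean_y = sum(y for _, y in train) / N
slope = (sum((x - mean_x) * (y - mean_y) for x, y in train)
         / sum((x - mean_x) ** 2 for x, _ in train))
intercept = mean_y - slope * mean_x

def line(x):
    return slope * x + intercept

def mse(model, data):
    # Mean squared error of a model over (x, y) pairs.
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

for name, model in (("memoriser", memoriser), ("line", line)):
    print(f"{name:9s} train MSE={mse(model, train):.3f}  "
          f"unseen MSE={mse(model, unseen):.3f}")
```

The memoriser scores perfectly on the data it was fit to and worse than the plain line on the unseen points. That gap between training and unseen error is what the definition above is pointing at; a model that is simply wrong everywhere would not show it.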
Yours is one definition, but the one the OP is using is overfitting to the training data.
That’s exactly my point: by that definition any incorrect answer can be explained by “overfitting to training data”.

Where do you draw the line between “overfitting to training data” and “incorrect data”?

> That’s exactly my point: by that definition any incorrect answer can be explained by “overfitting to training data”.

Not really. Getting 94381294*123=... wrong, but close to the actual answer, cannot be overfitting, since it wasn't in the training data.

> [By] that definition any incorrect answer can be explained by “overfitting to training data”.

No, it can't. For instance, some errors are caused by underfitting. The data could also be correct while your hyperparameters (such as the learning rate or dropout rate) cause your model to overfit.
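
A rough sketch of that point, with clean data and only a hyperparameter changing. It uses a toy k-nearest-neighbour regressor where `k` stands in for the hyperparameter (an assumption chosen for brevity; learning rate or dropout would need a real training loop). The sine ground truth, the noise level and all names here are hypothetical.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
NOISE = 0.5
N = 300

def sample(xs):
    # Hypothetical ground truth y = sin(x) plus Gaussian noise.
    return [(x, math.sin(x) + rng.gauss(0, NOISE)) for x in xs]

step = 2 * math.pi / N
train = sample([i * step for i in range(N)])
test = sample([(i + 0.5) * step for i in range(N)])  # points between the training xs

def knn_predict(x, k):
    # Average the labels of the k training points closest to x.
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def mse(data, k):
    # Mean squared error of the k-NN predictor over (x, y) pairs.
    return sum((knn_predict(x, k) - y) ** 2 for x, y in data) / len(data)

# k=1 memorises the training set, k=N always predicts the global mean.
results = {k: (mse(train, k), mse(test, k)) for k in (1, 15, N)}
for k, (tr, te) in results.items():
    print(f"k={k:3d}  train MSE={tr:.3f}  test MSE={te:.3f}")
```

With `k=1` the training error is zero but the test error is worse than with `k=15` (overfitting). With `k=N` both errors are high (underfitting). The data is identical in all three runs, so the errors have different explanations even though all of them are "wrong answers".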

> Where do you draw the line between “overfitting to training data” and “incorrect data”?

There's no need to draw a line between two explanations that aren't mutually exclusive. They can (as in this case) both be true. Overfitting is the symptom; dirty data is the cause.