|
|
|
|
|
by godelski
698 days ago
|
|
I think the miscommunication is due to the proxy nature of our modeling. From one perspective, yes you're right because it's just on your optimization function and objectives. But if we're in the context where we recognize the practical usage of our model replies on it being an inexact representation (proxy) then certainly there is too much optimization. I mean most of what we try to model in ML is intractable. In fact, that entire notion of early stopping is due to this. We use a validation set as a pseudo test set to inject information into our optimization products without leaking information from the test set (why you shouldn't choose parameters based on test results. That is spoilage. Doesn't matter if it's status quo, it's spoilage) But we also need to consider that a lack of divergence between train/val does not mean there isn't overfittng. Divergence implies overfittng but the inverse statement is not true. I state this because it's both relevant here and an extremely common mistake. |
|