by kkoncevicius 1315 days ago
My first impression is also in agreement with the parent. The blog post appears to use some terms loosely in order to make the connection between overfitting and Goodhart's law stronger. For example, calling the training sample a "proxy" and stating that it is a slightly different goal already leads towards the pre-defined conclusion.

And the reply also leaves me with a similar impression:

> your training objective can be different at train and test time

But this is not overfitting, this is concept drift, a different and well-defined thing in ML.

> the datapoints you use can be different at train and test time

Both train and test data came from the same population. They are just different incomplete random samples.

I guess what I am getting at is this: overfitting happens because we know we are training a model on an incomplete representation of the whole. But that representation is not a proxy, as suggested in the article. It is not slightly different from the goal. It's an incomplete piece of the goal.

2 comments

A gentle note that an incomplete piece of a goal (e.g. a loss function computed on a subset of the data) is a proxy for the full goal (e.g. the loss function on the full dataset).
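
To make the point concrete, here is a minimal sketch of a loss computed on a subset acting as a proxy for the loss on the full dataset. Everything in it is an assumption for illustration and not taken from the article: the sine-plus-noise data, the polynomial model, the sample sizes, and the helper name `subset_vs_full_loss`. It fits by least squares on the subset only, then reports the mean squared error on both the subset and the full data.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def subset_vs_full_loss(degree, n_subset=15, n_full=5000, seed=0):
    """Fit a polynomial on a random subset; return (subset MSE, full-data MSE).

    Hypothetical setup: the "full dataset" stands in for the goal, and the
    subset is the incomplete piece of it that we actually optimize on.
    """
    rng = np.random.default_rng(seed)

    # Full dataset: noisy samples of a smooth function (illustrative choice).
    x_full = rng.uniform(-1.0, 1.0, n_full)
    y_full = np.sin(3.0 * x_full) + rng.normal(0.0, 0.3, n_full)

    # The subset is drawn at random from the same data.
    idx = rng.choice(n_full, size=n_subset, replace=False)
    x_sub, y_sub = x_full[idx], y_full[idx]

    # Least-squares fit that minimizes the loss on the subset only.
    coeffs = P.polyfit(x_sub, y_sub, degree)

    def mse(x, y):
        return float(np.mean((P.polyval(x, coeffs) - y) ** 2))

    return mse(x_sub, y_sub), mse(x_full, y_full)


if __name__ == "__main__":
    for degree in (1, 3, 12):
        sub, full = subset_vs_full_loss(degree)
        print(f"degree={degree:2d}  subset MSE={sub:.4f}  full MSE={full:.4f}")
```

As the degree grows, the subset loss can only go down, because each larger model contains the smaller ones, while the loss on the full data has no such guarantee. In that sense optimizing hard on the incomplete piece is optimizing a proxy.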

Similarly, concept drift can be a source of overfitting -- the objective you care about is the one after the concept drift occurred, but the objective you trained on is the one from before the concept drift. (Here's a scholar search for papers where the two concepts co-occur: https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&as_vis... )

I think this is a difficult concept for many without statistical training: the fact that different outcomes can be "the same" from a practical perspective.