> What would be the advantage to train a model on the whole dataset, you would be running it against later? Wouldn't that overfit the network?

Quite the opposite: it would almost be cheating, since the model would simply have memorized all the correct answers. Normally you'd train on a subset of the data and evaluate against the held-out remainder to guard against overfitting, so on that point I agree with you. But after that you'd normally use the model on real, unseen data in the wild for your actual use case. With GeoGuessr you'd already know what all the "real-world" situations are, so overfitting wouldn't matter as long as you retrain whenever they add new sets.
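To make the memorization point concrete, here's a rough sketch on made-up toy data (nothing to do with actual GeoGuessr imagery). It uses a 1-nearest-neighbour classifier as a stand-in for a model that memorizes its training set; `make_data`, `predict_1nn` and `accuracy` are hypothetical helper names, and the 20% label noise is an arbitrary assumption.

```python
import random

def make_data(n, seed=0):
    """Toy 2D points labelled by which side of x + y = 1 they fall on,
    with roughly 20% of labels flipped as noise (an arbitrary assumption)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x, y = rng.uniform(0, 1), rng.uniform(0, 1)
        label = int(x + y > 1)
        if rng.random() < 0.2:
            label = 1 - label
        data.append(((x, y), label))
    return data

def predict_1nn(train, point):
    """Return the label of the closest training example (pure memorization)."""
    nearest = min(
        train,
        key=lambda ex: (ex[0][0] - point[0]) ** 2 + (ex[0][1] - point[1]) ** 2,
    )
    return nearest[1]

def accuracy(train, test):
    """Fraction of test examples whose label the 1-NN 'model' gets right."""
    correct = sum(predict_1nn(train, p) == label for p, label in test)
    return correct / len(test)

data = make_data(400)
split = int(0.8 * len(data))
train, holdout = data[:split], data[split:]

# "Train" on everything, then evaluate on the same data: every point is its
# own nearest neighbour, so the score is perfect even though labels are noisy.
seen_acc = accuracy(data, data)

# Train on 80%, evaluate on the unseen 20%: the memorized noise now hurts.
holdout_acc = accuracy(train, holdout)

print(f"evaluated on data it has seen: {seen_acc:.2f}")
print(f"evaluated on held-out data:    {holdout_acc:.2f}")
```

The first score is only meaningful if the data you'll see later really is the data you trained on, which is the closed-world situation described above; the second is the one that tells you how the model would do on anything new.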