Hacker News
by luguenth 1908 days ago
What would be the advantage of training a model on the whole dataset that you would be running it against later? Wouldn't that overfit the network?

And as for the cheating part: GeoGuessr actually exposes the right answer in their API. You could just use that to pinpoint the exact location automatically.

1 comment

> What would be the advantage of training a model on the whole dataset that you would be running it against later? Wouldn't that overfit the network?

Quite the opposite: it would almost be cheating, since the model would simply have memorized all the correct solutions. You'd usually train on a subset of the data and evaluate against the remainder to detect overfitting, so on that point I agree with you. But after that you'd normally use the model on real data in the wild for your actual use case. With GeoGuessr you'd already know what all the "real-world" situations are, so overfitting wouldn't matter as long as you retrain whenever they add new sets.
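
To make the memorization point concrete, here is a minimal sketch in plain Python. Everything in it is made up for illustration: the `Memorizer` class is a toy lookup table standing in for a badly overfit network, and the image ids and country labels are hypothetical, not real GeoGuessr data or anything from its API.

```python
import random


def train_test_split(items, test_fraction=0.2, seed=0):
    """Shuffle a copy of items and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


class Memorizer:
    """Toy 'model' that only remembers the exact answers it was shown."""

    def fit(self, examples):
        self.answers = dict(examples)
        return self

    def predict(self, image_id):
        # Returns None for anything it has never seen before.
        return self.answers.get(image_id)


def accuracy(model, examples):
    """Fraction of examples whose prediction matches the label."""
    return sum(model.predict(x) == y for x, y in examples) / len(examples)


# Hypothetical data: 100 unique image ids, each with a made-up label.
data = [(f"img{i}", f"country{i % 7}") for i in range(100)]
train, test = train_test_split(data)

held_out_model = Memorizer().fit(train)  # the usual practice
full_model = Memorizer().fit(data)       # the "cheating" setup

acc_train = accuracy(held_out_model, train)  # perfect on what it saw
acc_test = accuracy(held_out_model, test)    # fails on unseen images
acc_full = accuracy(full_model, test)        # perfect, by memorization
```

The held-out score is what exposes the memorization. If the "test" images are guaranteed to come from the same fixed pool you trained on, that score stops mattering, which is the situation described above.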