by JASchilz 4617 days ago

Validation is a way to control for overfitting, but overfitting isn't a danger in every project. Suppose we know that our data are iid normally distributed with known sigma. Using all available data to estimate the mean doesn't put us in danger of overfitting. And if you would like a posterior over the true value of the mean, there are ways to produce that.
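As a rough illustration of one way to get that posterior, here is a minimal sketch of the standard conjugate update for a normal mean with known sigma. The function name `posterior_for_mean`, the normal prior on the mean, and its default parameters (`prior_mean=0.0`, `prior_sd=10.0`) are assumptions made for the example, as is the simulated data.

```python
import math
import random

def posterior_for_mean(data, sigma, prior_mean=0.0, prior_sd=10.0):
    """Posterior (mean, sd) for mu, given iid N(mu, sigma^2) data with known sigma.

    Assumes a N(prior_mean, prior_sd^2) prior on mu (hypothetical defaults).
    """
    n = len(data)
    sample_mean = sum(data) / n
    prior_precision = 1.0 / prior_sd ** 2   # precision = 1 / variance
    data_precision = n / sigma ** 2         # precision contributed by n observations
    post_precision = prior_precision + data_precision
    # Posterior mean is a precision-weighted average of prior mean and sample mean.
    post_mean = (prior_precision * prior_mean + data_precision * sample_mean) / post_precision
    post_sd = math.sqrt(1.0 / post_precision)
    return post_mean, post_sd

# Simulated data, for illustration only: true mean 5.0, known sigma 2.0.
rng = random.Random(0)
sigma = 2.0
data = [rng.gauss(5.0, sigma) for _ in range(200)]

post_mean, post_sd = posterior_for_mean(data, sigma)
print(f"posterior for the mean: N({post_mean:.3f}, {post_sd:.3f}^2)")
```

All 200 points go into the estimate and nothing is held out. There is a single free parameter, so there is little for the model to overfit with.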

Generally we're in danger of overfitting when the number of data points is comparable to or less than the number of parameters (including meta-parameters such as which model to select).
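To make that concrete, here is a small sketch comparing a one-parameter model (the sample mean) with a model that has as many parameters as data points (the degree n-1 polynomial through all n points). The setup is an assumption chosen for illustration: 10 noisy observations of a constant, indexed by position, with held-out points at the midpoints. The helper names `lagrange_predict`, `mse`, and `run_trial` are hypothetical.

```python
import random

def lagrange_predict(xs, ys, x):
    """Evaluate at x the unique degree-(n-1) polynomial through the n points (xs, ys)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        basis = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        total += yi * basis
    return total

def mse(preds, targets):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)

def run_trial(rng, n=10, true_mean=5.0, sigma=1.0):
    """Fit both models on n training points and score them on n-1 held-out points."""
    train_x = [float(i) for i in range(n)]
    train_y = [rng.gauss(true_mean, sigma) for _ in train_x]
    test_x = [i + 0.5 for i in range(n - 1)]
    test_y = [rng.gauss(true_mean, sigma) for _ in test_x]

    mean_fit = sum(train_y) / n  # one parameter
    return {
        "mean_train": mse([mean_fit] * n, train_y),
        "mean_test": mse([mean_fit] * len(test_x), test_y),
        # n parameters: the polynomial reproduces every training point.
        "poly_train": mse([lagrange_predict(train_x, train_y, x) for x in train_x], train_y),
        "poly_test": mse([lagrange_predict(train_x, train_y, x) for x in test_x], test_y),
    }

rng = random.Random(0)
trials = [run_trial(rng) for _ in range(200)]
avg = {key: sum(t[key] for t in trials) / len(trials) for key in trials[0]}
for name, value in avg.items():
    print(f"{name}: {value:.3f}")
```

The polynomial's training error is zero by construction, so training error alone says nothing about how it will do on new data. The held-out error is where the two models separate, and that is the case validation is meant to catch.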

What I just described is a perspective derived from Bayesian model selection. But Bayesian model selection encompasses other types of model selection; it need not be considered a separate path.