|
|
|
|
|
by baron_harkonnen
1729 days ago
|
|
> In practice, these huge models are, in laymans terms, fucking awesome and work really well A similarly surprising result from an adjacent community, Bayesian Statistics, is that in the case of hierarchical models, increasing your number of parameters can paradoxically reduce overfitting. The scale of parameters in Bayesian model's is no where near that of these deep neural nets, but nonetheless this is a similarly shocking result since typically adding parameters is penalized when model building. It's a bit more explainable in Bayesian stats since what you're using some parameters for is limiting the impact of more granular parameters (i.e. you're learning a prior probability distribution for the other parameters, which prevents extreme overfitting in cases with less information). I wouldn't be too surprised if eventually we realized there was a similar cause preventing overfitting is overparameterized ml models. |
|