by z2210558 1547 days ago
L1 regularisation is the usual way (see e.g. https://en.wikipedia.org/wiki/Lasso_(statistics))
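To make it concrete how the L1 penalty ends up doing the selecting, here is a minimal sketch in plain Python of Lasso fitted by coordinate descent with soft-thresholding. The toy data, the function names and the penalty `lam=0.1` are all made up for illustration; in practice you would reach for a library implementation.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within [-threshold, threshold] becomes 0.0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iters=200):
    """Minimise (1/2n) * ||y - Xw||^2 + lam * ||w||_1 one coordinate at a time."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iters):
        for j in range(d):
            # Correlation of feature j with the residual that leaves feature j out.
            rho = sum(
                X[i][j] * (y[i] - sum(w[k] * X[i][k] for k in range(d) if k != j))
                for i in range(n)
            ) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            w[j] = soft_threshold(rho, lam) / z if z > 0 else 0.0
    return w


# Hypothetical toy data: three mutually orthogonal features.
# The target depends strongly on feature 0, barely on feature 1, not at all on feature 2.
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [3 * a + 0.05 * b for a, b, _ in X]

weights = lasso_coordinate_descent(X, y, lam=0.1)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print(weights, selected)
```

Features whose correlation with the residual stays below `lam` get a weight of exactly zero, which is what makes the penalty usable as a selector. Here only feature 0 survives.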

There are also some iterative methods like grafting (using feature gradients):

https://www.jmlr.org/papers/volume3/perkins03a/perkins03a.pd...

and gain-based selection (using the improvement of the objective), see the appendix of:

https://aclanthology.org/J96-1002.pdf

We used grafting for parser feature selection, for which it worked quite well:

https://danieldk.eu/Research/Publications/ucnlg2011.pdf
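Roughly, the grafting loop looks like the sketch below: start with no features, repeatedly pick the inactive feature with the largest loss gradient, stop once no gradient exceeds the L1 penalty, and re-optimise the active weights after each addition. This is a simplified illustration, not the code from the papers above. It assumes a squared-error loss instead of a log-linear model, uses coordinate descent for the inner optimisation, and the data and names are invented.

```python
def soft_threshold(value, threshold):
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def residuals(X, y, w):
    return [y[i] - sum(w[j] * X[i][j] for j in range(len(w))) for i in range(len(X))]


def grafting(X, y, lam, n_inner=100):
    """Greedy feature selection driven by gradients of (1/2n) * ||y - Xw||^2."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    active = []
    while len(active) < d:
        r = residuals(X, y, w)
        # Gradient of the loss with respect to each weight that is still zero.
        grads = {
            j: -sum(X[i][j] * r[i] for i in range(n)) / n
            for j in range(d) if j not in active
        }
        best = max(grads, key=lambda j: abs(grads[j]))
        # If the gradient cannot beat the L1 penalty, adding the feature will not help.
        if abs(grads[best]) <= lam:
            break
        active.append(best)
        # Re-optimise the L1-penalised objective over the active weights only.
        for _ in range(n_inner):
            for j in active:
                r = residuals(X, y, w)
                rho = sum(X[i][j] * (r[i] + w[j] * X[i][j]) for i in range(n)) / n
                z = sum(X[i][j] ** 2 for i in range(n)) / n
                w[j] = soft_threshold(rho, lam) / z
    return active, w


# Hypothetical toy data: orthogonal features with strong, medium and negligible effects.
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [3 * a - b + 0.05 * c for a, b, c in X]

active, weights = grafting(X, y, lam=0.1)
print(active, weights)
```

On this data, features are added in order of gradient magnitude (0, then 1), and feature 2 is never touched because its gradient stays below `lam`. The appeal for large feature sets is that you only ever optimise over the features selected so far.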

Feature selection ought to be model-specific. Just because a feature wasn't selected by Lasso (in a linear model) doesn't mean it can't be useful in a non-linear model.
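A tiny made-up example of this, using an XOR-style target: each feature on its own has zero covariance with the target, so a linear model with an L1 penalty has no reason to keep either one, yet a model that can form their interaction predicts the target exactly. The data and names here are purely illustrative.

```python
from itertools import product

# All four combinations of two features coded as -1/+1; the target is their product.
X = [list(p) for p in product([-1, 1], repeat=2)]
y = [a * b for a, b in X]


def mean_product(u, v):
    """Covariance of u and v, valid here because both have zero mean."""
    return sum(a * b for a, b in zip(u, v)) / len(u)


# What a linear model sees: each feature alone is uncorrelated with the target.
linear_scores = [mean_product([row[j] for row in X], y) for j in range(2)]

# What a non-linear model can use: the interaction of the two features.
interaction = [a * b for a, b in X]
interaction_score = mean_product(interaction, y)

print(linear_scores, interaction_score)
```

Both linear scores come out as zero while the interaction score is one, so dropping these features based on the linear fit would throw away everything a non-linear model needs.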
Psychologists use factor analysis.