https://www.jmlr.org/papers/volume3/perkins03a/perkins03a.pd...
and gain-based selection (using the improvement of the objective); see the appendix of:
https://aclanthology.org/J96-1002.pdf
We used grafting for parser feature selection, for which it worked quite well:
https://danieldk.eu/Research/Publications/ucnlg2011.pdf
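
For readers who have not seen grafting before, here is a minimal sketch of its selection step, assuming a mean logistic loss with an L1 penalty. At each round, grafting looks at the gradient of the loss with respect to the weights of features that are not yet in the model and picks the one with the largest magnitude, adding it only if that magnitude exceeds the penalty. The function names, the toy data layout, and the penalty parameter `lam` are illustrative assumptions, not code from any of the linked papers.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_gradient(X, y, w):
    """Gradient of the mean logistic loss with respect to every weight."""
    n = len(X)
    grad = [0.0] * len(w)
    for xi, yi in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
        for j, xj in enumerate(xi):
            grad[j] += (p - yi) * xj / n
    return grad


def grafting_candidate(X, y, w, active, lam):
    """Pick the inactive feature with the largest |gradient|.

    Returns its index, or None when no inactive feature has a gradient
    magnitude above the L1 penalty lam (hypothetical parameter name).
    """
    grad = loss_gradient(X, y, w)
    inactive = [j for j in range(len(w)) if j not in active]
    if not inactive:
        return None
    best = max(inactive, key=lambda j: abs(grad[j]))
    return best if abs(grad[best]) > lam else None
```

In full grafting this step alternates with re-optimizing the weights of the active features, and the loop stops once no candidate passes the threshold. Gain-based selection differs in that it ranks candidates by the improvement of the objective rather than by gradient magnitude.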