by parrt 1730 days ago
Thanks! Took me a year to discover the key nut there. L1 vs. L2 regularization is not well described, I found, so I went nuts trying to nail it down.
1 comment

If you're interested, in my thesis I induced L1-regularized decision trees through a boosting-style approach. Adding an L1 term and maximizing the gradient led to sparse trees.
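
For anyone wondering why an L1 term tends to give sparsity at all, here is a toy sketch. It is not the thesis method and has nothing to do with trees or boosting; it only assumes a one-dimensional quadratic loss around a hypothetical unregularized weight `a`, where both penalties have closed-form minimizers. The names `l1_shrink`, `l2_shrink`, `weights`, and `lam` are made up for illustration.

```python
def l1_shrink(a, lam):
    """Minimizer of 0.5*(w - a)**2 + lam*abs(w), i.e. soft-thresholding."""
    if a > lam:
        return a - lam
    if a < -lam:
        return a + lam
    return 0.0  # anything within [-lam, lam] is snapped to exactly zero


def l2_shrink(a, lam):
    """Minimizer of 0.5*(w - a)**2 + 0.5*lam*w**2, i.e. proportional shrinkage."""
    return a / (1.0 + lam)


# Hypothetical unregularized weights and penalty strength (illustration only).
weights = [3.0, -0.4, 0.2, -2.5, 0.05]
lam = 0.5

l1 = [l1_shrink(a, lam) for a in weights]
l2 = [l2_shrink(a, lam) for a in weights]

print("L1:", l1)  # small weights become exactly 0.0
print("L2:", l2)  # every weight shrinks, none reaches zero
```

With L1 the three small weights land on exactly zero, while L2 only scales everything down, so nothing drops out.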