| HN Mirror


by jasonjmcghee 1260 days ago

I haven't seen this to be the case, fwiw. There was a paper in 2016 that did this and most were in the ~40% range.

But "any ml algorithm" isn't the point. It's a new optimization technique, and it should be applied to models/architectures that make sense for the problems they are being used on.

For example, they could have used a pretrained featurizer and trained the two layer model on top of it, with both back prop and FF and compared.
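A rough sketch of what that comparison could look like, reading "FF" as the Forward-Forward algorithm. Everything here is an assumption for illustration: the toy two-blob data, the fixed random projection standing in for a pretrained featurizer, the layer sizes, and the hyperparameters (none of them tuned). The Forward-Forward head follows the usual recipe of embedding a label in the input, using the sum of squared activations as per-layer "goodness", and updating each layer locally; it is not taken from any particular codebase.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_in, d_feat, d_hid, n_cls = 400, 20, 16, 32, 2

# Toy data (assumption): two shifted Gaussian blobs standing in for a real dataset.
y = rng.integers(0, n_cls, size=n)
x = rng.normal(size=(n, d_in)) + 2.0 * y[:, None]

# Stand-in for a pretrained featurizer: a fixed random projection, never updated.
W_feat = rng.normal(size=(d_in, d_feat)) / np.sqrt(d_in)

def unit(v):
    """Scale each row to unit length."""
    return v / (np.linalg.norm(v, axis=1, keepdims=True) + 1e-8)

def featurize(x):
    return unit(np.maximum(x @ W_feat, 0.0))

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-np.clip(a, -30.0, 30.0)))

def train_backprop(f, y, epochs=300, lr=0.5):
    """Two-layer head trained end to end with softmax cross-entropy."""
    W1 = rng.normal(size=(f.shape[1], d_hid)) * 0.1
    W2 = rng.normal(size=(d_hid, n_cls)) * 0.1
    onehot = np.eye(n_cls)[y]
    for _ in range(epochs):
        h = np.maximum(f @ W1, 0.0)
        logits = h @ W2
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        g_logits = (p - onehot) / len(f)
        g_W2 = h.T @ g_logits
        g_W1 = f.T @ ((g_logits @ W2.T) * (h > 0))  # error flows back through W2
        W1 -= lr * g_W1
        W2 -= lr * g_W2
    return lambda f_new: (np.maximum(f_new @ W1, 0.0) @ W2).argmax(axis=1)

def with_label(f, labels):
    """Prepend a one-hot label to the features, as Forward-Forward inputs need."""
    return np.hstack([np.eye(n_cls)[labels], f])

def ff_forward(Ws, v):
    """Yield (weights, normalized input, activations) layer by layer."""
    for W in Ws:
        v_in = unit(v)
        h = np.maximum(v_in @ W, 0.0)
        yield W, v_in, h
        v = h  # next layer sees these activations as a plain input

def train_ff(f, y, epochs=300, lr=0.1, theta=2.0):
    """Two-layer head where each layer is updated only from its own goodness."""
    Ws = [rng.normal(size=(n_cls + f.shape[1], d_hid)) * 0.1,
          rng.normal(size=(d_hid, d_hid)) * 0.1]
    wrong = (y + rng.integers(1, n_cls, size=len(y))) % n_cls  # always != y
    batches = ((with_label(f, y), 1.0), (with_label(f, wrong), -1.0))
    for _ in range(epochs):
        for inp, sign in batches:
            for W, v_in, h in ff_forward(Ws, inp):
                g = (h ** 2).sum(axis=1)  # goodness
                # Gradient of softplus(-sign * (g - theta)) with respect to g.
                dg = -sign * sigmoid(-sign * (g - theta))
                W -= lr * v_in.T @ (dg[:, None] * 2.0 * h) / len(f)

    def predict(f_new):
        # Try every label and pick the one with the highest total goodness.
        scores = [sum((h ** 2).sum(axis=1) for _, _, h in
                      ff_forward(Ws, with_label(f_new, np.full(len(f_new), c))))
                  for c in range(n_cls)]
        return np.stack(scores, axis=1).argmax(axis=1)
    return predict

f = featurize(x)  # same frozen features for both heads
tr, te = slice(0, 300), slice(300, n)
acc_bp = (train_backprop(f[tr], y[tr])(f[te]) == y[te]).mean()
acc_ff = (train_ff(f[tr], y[tr])(f[te]) == y[te]).mean()
print(f"backprop head: {acc_bp:.2f}  forward-forward head: {acc_ff:.2f}")
```

The point of the setup is that both heads see identical frozen features, so any gap in accuracy comes from the training rule rather than from the featurizer.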


whimsicalism 1260 days ago

> For example, they could have used a pretrained featurizer and trained the two layer model on top of it, with both back prop and FF and compared.

That assumes weights/embeddings produced by a backprop-trained network are equally intelligible to a network also trained by backprop and to one trained by this alternative method.


jasonjmcghee 1260 days ago

I have personally seen pretrained embeddings used successfully with all kinds of classic ML algorithms (elastic nets, tree-based models, etc.) that have nothing to do with backprop.
