by whimsicalism 1260 days ago
> For example, they could have used a pretrained featurizer and trained the two layer model on top of it, with both back prop and FF and compared.

That assumes the weights/embeddings produced by a backprop-trained network are just as intelligible to a network trained by this alternative method as they are to one trained by backprop.


I have personally seen pretrained embeddings used successfully with all kinds of classic ML algorithms (elastic nets, tree-based models, etc.) that have nothing to do with backprop.