| HN Mirror

by sushirain 3973 days ago

They are effective because:

- They use more parameters (and fewer computations per parameter).

- They are hierarchical (convolutions are apparently useful at different levels of abstraction of data).

- They are distributed (word2vec, thought vectors): their representations are not restricted to a small set of artificial classes such as parts of speech or parts of visual objects.

- They are recurrent (RNN).

etc.

1 comment

kylebgorman 3973 days ago

word2vec isn't "deep" in the relevant sense. Both the skip-gram and CBOW forms have a single hidden layer.
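
To make the "single hidden layer" point concrete, here is a rough sketch of a skip-gram forward pass in plain Python. The function name, the toy sizes, and the random weights are all made up for illustration, and it uses a full softmax for readability, whereas the actual word2vec training typically relies on negative sampling or hierarchical softmax instead.

```python
import math
import random


def skipgram_forward(center_id, W_in, W_out):
    """Predict a distribution over context words for one center word."""
    # Hidden layer: a plain embedding lookup (one row of W_in), no nonlinearity.
    hidden = W_in[center_id]
    # Output layer: one score per vocabulary word (hidden vector times W_out).
    vocab_size = len(W_out[0])
    scores = [
        sum(h * W_out[k][j] for k, h in enumerate(hidden))
        for j in range(vocab_size)
    ]
    # Softmax over the vocabulary, shifted by the max for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    return hidden, probs


# Toy setup (assumed sizes): 10-word vocabulary, 4-dimensional embeddings.
random.seed(0)
VOCAB_SIZE, DIM = 10, 4
W_in = [[random.gauss(0, 0.1) for _ in range(DIM)] for _ in range(VOCAB_SIZE)]
W_out = [[random.gauss(0, 0.1) for _ in range(VOCAB_SIZE)] for _ in range(DIM)]

hidden, probs = skipgram_forward(3, W_in, W_out)
```

The only things between input and output are the two weight matrices: one lookup into `W_in` and one projection through `W_out`. There is no stack of layers to speak of, which is the sense in which the model is shallow.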
