Hacker News new | ask | show | jobs
by make3 3169 days ago
the embedding layer is the layer that converts the one hot word feature in to a continuous multi dimensional vector that the deep net can learn with. they used to pretrain that layer separately with word2vec. now as it's just a neural net layer, they let the translation model train it with backprop on the main (translation /dialog / qa, etc) task as a regular layer