by yvdriess
2291 days ago
The reverse is true: embeddings are both the performance and the memory-footprint bottleneck of modern NN models. Figure 6 of https://arxiv.org/pdf/1906.00091.pdf shows this. Embeddings are used to look up sparse features, so you have those pesky data-dependent lookups.
They may be a bottleneck, but the alternative is worse -- you can't fit complex models with large vocabularies into GPU memory using sparse one-hot encodings.
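
To make the trade-off concrete, here is a rough pure-Python sketch (not taken from the linked paper) of why an embedding lookup and a one-hot matrix multiply give the same result while costing very different amounts of memory. The sizes, the names `lookup` and `one_hot_matmul`, and the random table are all made up for illustration.

```python
import random

random.seed(0)
vocab_size, dim = 1000, 4  # hypothetical sizes, kept small on purpose
table = [[random.random() for _ in range(dim)] for _ in range(vocab_size)]
ids = [3, 999, 3, 42]      # sparse feature ids for one batch


def lookup(table, ids):
    # Embedding lookup: a data-dependent gather, one table row per id.
    return [table[i] for i in ids]


def one_hot_matmul(table, ids):
    # Same result via a dense one-hot row multiplied by the whole table.
    vocab, width = len(table), len(table[0])
    out = []
    for i in ids:
        row = [1.0 if j == i else 0.0 for j in range(vocab)]
        out.append([sum(row[j] * table[j][k] for j in range(vocab))
                    for k in range(width)])
    return out


# Input size per batch: one integer per id vs. vocab_size floats per id.
index_entries = len(ids)
dense_entries = len(ids) * vocab_size
```

The gather touches only the rows it needs, which is where the data-dependent memory access comes from; the one-hot form has a regular access pattern but its input grows linearly with the vocabulary.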