by fizx
2864 days ago
There are a lot of places in practical neural nets with attention where you want softmax(queryvector · memorymatrix), where the memory can be quite large. If you have a decent ANN (approximate nearest neighbor) implementation, you can approximate this by only calculating the dot product for the memory vectors that neighbor the query. There are currently a ton of mediocre ways to do this because nothing really works very well in high dimensions, and calculating this can easily be the bottleneck in training and evaluation.
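
To make the idea concrete, here is a rough sketch in plain Python rather than any particular library's API. `nearest_by_dot` is a hypothetical stand-in for the ANN lookup: it brute-forces the top k rows by dot product, so it saves no work itself, and a real index would be expected to return roughly the same candidates without scanning everything. The sizes and the random data are made up for illustration.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def exact_attention(query, memory):
    """Softmax over the dot product with every row of memory."""
    return softmax([dot(query, row) for row in memory])


def nearest_by_dot(query, memory, k):
    """Hypothetical stand-in for an ANN index lookup (brute force here)."""
    order = sorted(range(len(memory)),
                   key=lambda i: dot(query, memory[i]),
                   reverse=True)
    return order[:k]


def approx_attention(query, memory, k):
    """Softmax restricted to k candidate rows; every other weight is 0."""
    idx = nearest_by_dot(query, memory, k)
    weights = softmax([dot(query, memory[i]) for i in idx])
    out = [0.0] * len(memory)
    for i, w in zip(idx, weights):
        out[i] = w
    return out


random.seed(0)
dim, n_rows = 16, 1000  # illustrative sizes, not from any real model
memory = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_rows)]
query = [random.gauss(0, 1) for _ in range(dim)]

exact = exact_attention(query, memory)
approx = approx_attention(query, memory, k=50)

# Mass the exact softmax places on rows that the approximation dropped.
dropped_mass = sum(e for e, a in zip(exact, approx) if a == 0.0)
print(f"exact softmax mass outside the top 50 rows: {dropped_mass:.4f}")
```

The printed number is the softmax mass the approximation throws away. How small it is depends on how peaked the scores are, which is part of why this gets harder in high dimensions.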