Hacker News
by fizx 2864 days ago
There are a lot of places in practical neural nets with attention where you want softmax(queryvector · memorymatrix), where memory can be quite large. If you have a decent approximate nearest neighbor (ANN) implementation, you can approximate the result by only calculating the dot product for the memory vectors that neighbor the query.
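
To make that concrete, here is a rough NumPy sketch of the idea, not anyone's production code. The function names are made up for illustration, and `brute_force_neighbors` is only a stand-in for a real ANN index: it scores every row exactly, which is precisely the cost you would be trying to avoid. It also assumes the neighbors you want are the rows with the largest dot product against the query.

```python
import numpy as np

def full_attention(query, memory):
    """Exact softmax(memory @ query) over every row of memory."""
    scores = memory @ query
    weights = np.exp(scores - scores.max())  # subtract max for numerical stability
    return weights / weights.sum()

def neighbor_attention(query, memory, neighbor_ids):
    """Softmax restricted to the given rows; every other row gets weight 0."""
    scores = memory[neighbor_ids] @ query    # only len(neighbor_ids) dot products
    weights = np.exp(scores - scores.max())
    out = np.zeros(memory.shape[0])
    out[neighbor_ids] = weights / weights.sum()
    return out

def brute_force_neighbors(query, memory, k):
    """Hypothetical stand-in for an ANN lookup: exact top-k rows by dot product.
    This touches every row, so it saves nothing; a real index would replace it."""
    scores = memory @ query
    return np.argpartition(-scores, k - 1)[:k]

rng = np.random.default_rng(0)
memory = rng.normal(size=(10_000, 64))       # 10k memory slots, 64 dimensions
query = rng.normal(size=64)

ids = brute_force_neighbors(query, memory, k=32)
approx = neighbor_attention(query, memory, ids)
exact = full_attention(query, memory)
print("softmax mass covered by the 32 neighbors:", exact[ids].sum())
```

The approximation is only as good as the share of softmax mass the retrieved neighbors carry, so it works best when attention is sharply peaked on a few memory rows.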

There are currently a ton of mediocre ways to do this, because nothing really works very well in high dimensions, and computing the full softmax over memory can easily be the bottleneck in both training and evaluation.