Hacker News
Comparing 5 ways to implement Multihead Attention in PyTorch (github.com)
3 points by rasbt 841 days ago