Hacker News
by kasmura 678 days ago
Yes, it is just a way of computing self-attention in a distributed manner.
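
To illustrate what "distributed" can mean here, below is a rough sketch, not the specific method under discussion. It assumes the keys and values are split into shards, each shard produces a partial result for a query, and the partials are merged with a softmax rescaling so the answer matches ordinary attention. All function names (`attention`, `partial_attention`, `merge`, `distributed_attention`) are made up for this example, the usual 1/sqrt(d) scaling is left out for brevity, and everything runs in one process with no real communication.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention(q, keys, values):
    """Reference: ordinary softmax attention for a single query vector."""
    scores = [dot(q, k) for k in keys]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]

def partial_attention(q, keys, values):
    """What one hypothetical worker computes from its own key/value shard:
    the local max score, local softmax denominator, and unnormalized output."""
    scores = [dot(q, k) for k in keys]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    dim = len(values[0])
    out = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return m, sum(weights), out

def merge(parts):
    """Combine per-shard partials into the exact global attention output."""
    m_global = max(m for m, _, _ in parts)
    dim = len(parts[0][2])
    denom = 0.0
    out = [0.0] * dim
    for m, s, o in parts:
        scale = math.exp(m - m_global)  # rescale to the shared max
        denom += s * scale
        for i in range(dim):
            out[i] += o[i] * scale
    return [x / denom for x in out]

def distributed_attention(q, keys, values, num_shards):
    """Simulate sharding in a single process (assumption: no real devices)."""
    size = math.ceil(len(keys) / num_shards)
    parts = [partial_attention(q, keys[i:i + size], values[i:i + size])
             for i in range(0, len(keys), size)]
    return merge(parts)
```

The point of the sketch is that each shard only needs to report three small quantities, so the full score matrix never has to live in one place, and the merged result is the same as the non-distributed computation up to floating-point error.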