Y
Hacker News
new
|
ask
|
show
|
jobs
by
igorkraw
1670 days ago
Your last point is true only with positional encodings though, attention itself is a permutation equivariant function