Y
Hacker News
new
|
ask
|
show
|
jobs
by
montebicyclelo
396 days ago
It wasn't obvious that a transformer could do this, and learn to produce conv via attention
1 comments
artemisart
396 days ago
But it is, as long as the positional embedding are sufficient, i.e. use relative positional embeddings here.
link