Hacker News
by robbedpeter 1642 days ago
The Transformer in Transformer (TNT) model looks promising: you can set up multiple overlapping domains of attention at arbitrary scales over the input.
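
To make the nesting idea concrete, here is a rough sketch of two-level attention in plain Python. It is not the TNT implementation: it assumes no learned projections (Q = K = V), a single head, non-overlapping patches, and a mean over each patch as the summary passed to the outer level. The names `self_attention` and `nested_attention` are made up for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]


def self_attention(tokens):
    """Scaled dot-product self-attention with Q = K = V = tokens.

    Assumption: no learned weight matrices, so each output is just a
    softmax-weighted average of the input vectors.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out


def nested_attention(tokens, patch_size):
    """Inner attention inside each patch, then outer attention across patches.

    Assumption: each patch is summarized by the mean of its inner outputs;
    a real model would use a learned projection instead.
    """
    d = len(tokens[0])
    patches = [tokens[i:i + patch_size] for i in range(0, len(tokens), patch_size)]
    inner = [self_attention(p) for p in patches]  # fine scale
    summaries = [[sum(t[j] for t in p) / len(p) for j in range(d)] for p in inner]
    outer = self_attention(summaries)  # coarse scale
    return inner, outer


if __name__ == "__main__":
    toy_tokens = [[float(i), float(i % 3)] for i in range(8)]
    inner, outer = nested_attention(toy_tokens, patch_size=4)
    print(len(inner), "patches,", len(outer), "patch-level vectors")
```

Changing `patch_size`, or calling `nested_attention` again on the outer vectors, gives attention at a different scale. Overlapping domains would need a strided patch split, which this sketch leaves out.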