Hacker News
by srush 968 days ago
The model targets the decoder part of the system, which is the speed bottleneck, so it is unlikely to help with tasks like classification. However, a similar method could be used for that use case. (Coauthor)