Hacker News
by davidatbu, 1457 days ago
I wouldn't rule out the possibility that transformers being very amenable to parallel computation is the reason