Hacker News
by stri8ed 805 days ago
Isn't that how previous models worked, before the "Attention Is All You Need" paper?