Hacker News new | ask | show | jobs
by twobitshifter 619 days ago
Agreed, Ilya Sutskever himself has spent a long time with lstm and published papers like this one while working at Google. http://proceedings.mlr.press/v37/jozefowicz15.pdf

Recent comments from him have said that any architecture can achieve transformer accuracy and recall, but we have devoted energy to refining transformers, due to the early successes.