by opprobium 741 days ago
It's not just that they can't solve it efficiently; they can't solve it at all.
1 comment

They can solve it if you keep adding layers to the transformer; it's just not efficient. You'd need exponentially more layers than a similarly sized RNN.