Hacker News new | ask | show | jobs
by joewferrara 880 days ago
They show that a decoder only transformer (which gpts are) are rnns with infinite hidden state size. Infinite hidden state size is a pretty strong thing! Sounds interesting to me.
1 comments

not infinite, just scaling linearly with length