by aghilmort 185 days ago
there’s decent work on the computational reasoning power of transformers, SSMs, etc.

some approximate snippets that come to mind are that decoder-only transformers recognize AC^0 and think in TC^0, that encoder-decoders are strictly more powerful than decoder-only, etc.

person with last name Miller iirc, if you poke around on arXiv, plus a few others. it's been a while since this was top of mind for me, so ymmv on the exact correctness of the above snippets

1 comment

You are probably thinking of Merrill (whose work is referenced towards the end of the article).
ah yes Merrill thx!