by sva_ 739 days ago
It's an LLM architecture competing with transformers: https://arxiv.org/abs/2312.00752

Its proponents usually highlight its inference performance, in particular that compute scales linearly with the number of input tokens.
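To make the linear-scaling point concrete, here is a minimal sketch of a toy state space recurrence. It is not Mamba itself: the parameters `a`, `b`, `c` are fixed rather than input-dependent, and the function name `ssm_scan` is made up for illustration. The point is only that each token costs one fixed-size state update, so total work grows linearly with sequence length.

```python
def ssm_scan(xs, a, b, c):
    """Toy diagonal linear state space recurrence over a scalar sequence.

    h_t = a * h_{t-1} + b * x_t   (elementwise over the state dimension)
    y_t = sum(c * h_t)

    Assumption: a, b, c are fixed lists of equal length. Mamba makes its
    parameters depend on the input; this sketch does not.
    """
    state = [0.0] * len(a)  # fixed-size state, independent of sequence length
    ys = []
    steps = 0
    for x in xs:
        # One constant-cost update per token: no look-back over earlier tokens.
        state = [a_i * h_i + b_i * x for a_i, h_i, b_i in zip(a, state, b)]
        ys.append(sum(c_i * h_i for c_i, h_i in zip(c, state)))
        steps += 1
    return ys, steps


ys, steps = ssm_scan([1.0, 1.0, 1.0], a=[0.5], b=[1.0], c=[1.0])
print(ys, steps)
```

Doubling the number of tokens doubles `steps`, whereas plain self-attention compares every token with every earlier one, so its cost grows quadratically.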

1 comment

I really disagree with pigeonholing it as an LLM architecture! It is much more general than that, as I mentioned in another comment on this post [1] (and, of course, as the original paper you linked makes clear).

[1] https://news.ycombinator.com/item?id=40616181