| HN Mirror



	by biofox 533 days ago
	Isn't that all of modern AI?


immibis 532 days ago

Transformers are completely unlike RNNs.


tripplyons 532 days ago

There are some interesting connections between them. If you remove the softmax from the attention formula, you end up with linear attention, which has a recurrent form.

I haven't read it, but the Mamba 2 paper claims to establish a stronger connection.
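To make the recurrent form concrete, here is a minimal NumPy sketch. It assumes causal (autoregressive) attention and simply drops the softmax; the function names and toy shapes are made up for illustration, and it leaves out the feature map and normalizer that published linear-attention variants typically add. With those assumptions, output t is `Q[t] @ S_t`, where `S_t = S_{t-1} + outer(K[t], V[t])` is a fixed-size running state, much like an RNN hidden state.

```python
import numpy as np

def masked_attention_no_softmax(Q, K, V):
    """Parallel form: causal-masked scores applied to V, with no softmax."""
    scores = Q @ K.T                       # (T, T) pairwise query-key dot products
    causal = np.tril(np.ones_like(scores)) # position t only sees positions s <= t
    return (scores * causal) @ V           # (T, d_v)

def recurrent_attention_no_softmax(Q, K, V):
    """Recurrent form: carry a (d_k, d_v) state instead of a (T, T) score matrix."""
    T, d_k = K.shape
    d_v = V.shape[1]
    S = np.zeros((d_k, d_v))               # running sum of outer(K[s], V[s]) for s <= t
    out = np.empty((T, d_v))
    for t in range(T):
        S += np.outer(K[t], V[t])          # state update
        out[t] = Q[t] @ S                  # read out with the current query
    return out

# Toy sizes, chosen arbitrarily for the demo.
rng = np.random.default_rng(0)
Q = rng.standard_normal((6, 4))
K = rng.standard_normal((6, 4))
V = rng.standard_normal((6, 3))

parallel = masked_attention_no_softmax(Q, K, V)
recurrent = recurrent_attention_no_softmax(Q, K, V)
print(np.allclose(parallel, recurrent))
```

The two agree because `(Q[t] · K[s]) * V[s]` summed over `s <= t` can be regrouped as `Q[t] @ (sum of outer(K[s], V[s]))`. With the softmax in place that regrouping is not available, since the weights are no longer a plain product of a query term and a key term.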


kadushka 532 days ago

*If you remove the softmax from the attention formula, you end up with linear attention*

Sorry, what?


tripplyons 532 days ago

Here is a paper explaining it: https://arxiv.org/abs/2006.16236
