


	by Greenpants 965 days ago
	I still haven't found "that one visualisation" that makes the attention concept in Transformers as easily understood as these CNNs. If someone here on HN has a link to a page that has helped them get to the Eureka-point of fully grasping attention layers, feel free to share!

1 comment

juliangoldsmith 965 days ago

I found this video helpful for understanding transformers in general, but it covers attention too: https://www.youtube.com/watch?v=kWLed8o5M2Y

The short version (as I understand it) is that you use a neural network to weight pairs of inputs by their importance to each other. That lets you get rid of unimportant information while keeping what actually is important.
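
To make the "weight pairs of inputs by their importance to each other" idea concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is only an illustration under simplifying assumptions: the function names and the toy `tokens` are made up for this example, and the input vectors are used directly as queries, keys, and values, whereas a real Transformer layer would first pass them through learned projection matrices.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def self_attention(inputs):
    """Toy self-attention over a list of equal-length vectors.

    Assumption: no learned projections, so each input vector acts as
    its own query, key, and value.
    """
    d = len(inputs[0])
    outputs, all_weights = [], []
    for q in inputs:
        # Score this input against every input (including itself).
        scores = [dot(q, k) / math.sqrt(d) for k in inputs]
        # Turn scores into weights that are positive and sum to 1.
        weights = softmax(scores)
        # The output is a weighted average of all the input vectors.
        out = [sum(w * v[i] for w, v in zip(weights, inputs)) for i in range(d)]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical toy data: the first two vectors are similar, the third is not.
tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = self_attention(tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row shows how much one input "pays attention" to every other input. In this toy data the first vector gives more weight to the similar second vector than to the dissimilar third one, which is the "keep what is important, downweight the rest" behaviour described above.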
