Hacker News new | ask | show | jobs
by Mithriil 232 days ago
My opinion on the "Attention is all you need" paper is that its most important idea is the Positional Encoding. The transformer head itself... is just another NN block among many.