by SneakyTornado29 1878 days ago
The title is a reference to the famous machine learning paper "Attention Is All You Need", which introduced the Transformer architecture. Transformers have revolutionized how we process sequential data (e.g. in natural language processing).
3 comments

And recently, a paper titled Attention Is Not All You Need has made the rounds arguing that some of the claims made in the AIAYN paper may have been overstated. https://arxiv.org/abs/2103.03404
If you read the full title, the paper only covers the multi-head attention part of BERT, excluding the feed-forward layers and skip connections, hence the term "Pure Attention".

> Attention is Not All You Need: Pure Attention Loses Rank Doubly Exponentially with Depth

This does not prove the original title wrong. The paper is not a rebuttal but an analysis of a submodule, which helps in better understanding transformers.
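
For anyone who wants to poke at the "pure attention" idea, here is a rough NumPy sketch, not the paper's code. It stacks single-head self-attention layers with random, untrained weights and tracks how far the token matrix is from having all rows identical (i.e. from rank 1). Everything here is an assumption made for illustration: the function names, the weight scale, the sizes, and the Frobenius-norm residual, which is a simplification of the norm the paper actually uses. Toggling `skip` lets you compare against the same stack with skip connections.

```python
import numpy as np


def softmax(z):
    # Row-wise softmax, shifted by the row max for numerical stability.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def attention_layer(x, wq, wk, wv):
    # Single-head self-attention with no feed-forward block and no skip connection.
    d = wq.shape[1]
    scores = (x @ wq) @ (x @ wk).T / np.sqrt(d)
    return softmax(scores) @ (x @ wv)


def relative_residual(x):
    # Frobenius distance from x to the closest matrix whose rows are all equal
    # (that matrix repeats the mean row), divided by the norm of x.
    res = x - x.mean(axis=0, keepdims=True)
    return np.linalg.norm(res) / np.linalg.norm(x)


def run(depth, skip, seed=0, n_tokens=8, d_model=16):
    # Hypothetical experiment: random Gaussian weights, scaled by 1/sqrt(d_model).
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_tokens, d_model))
    history = [relative_residual(x)]
    for _ in range(depth):
        wq, wk, wv = (
            rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
            for _ in range(3)
        )
        out = attention_layer(x, wq, wk, wv)
        x = x + out if skip else out  # skip=False is the "pure attention" case
        history.append(relative_residual(x))
    return history


if __name__ == "__main__":
    for skip in (False, True):
        print("skip" if skip else "pure", ["%.1e" % r for r in run(8, skip)])
```

With these settings the pure stack's residual should fall off very quickly with depth, while the version with skip connections should not collapse the same way. A toy run like this only illustrates the paper's claim; it says nothing about trained models.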

> famous machine learning paper "Attention Is All You Need"

1. It's a paper from 2017. Unless you follow academic ML research, you will not have heard of it.

2. That paper's title is also inscrutable unless you've gone and read at least the abstract.

Which itself is a reference to the 1967 Beatles song "All You Need Is Love" (which also includes the line "Love is all you need").