Hacker News
by anima-core 188 days ago
As a follow-up, just to refresh your memory:

1. “Attention Is All You Need” (Vaswani et al., 2017)

Length: 11 pages of main content, 5 pages of references and appendix

2. The first GPT paper (Radford et al., 2018)

Length: 12 pages

3. BERT (Devlin et al., 2018)

Length: 14 pages

Big ideas don't require big papers. I don't know where you got that idea from.

1 comment

Your paper is 10 pages of fluff without even an architecture diagram or a single equation, bro. It's not real.