by anima-core
188 days ago
As a follow-up, just to refresh your memory:

1. “Attention Is All You Need” (Vaswani et al., 2017). Length: 11 pages of main content, 5 pages of references and appendix.
2. The first GPT paper (Radford et al., 2018). Length: 12 pages.
3. BERT (Devlin et al., 2018). Length: 14 pages.

Big ideas don't require big papers. I don't know where you got that idea from.