Hacker News
by maremmano, 459 days ago
Maybe these papers:

"Attention Is All You Need" (Transformer paper, 2017)

"Improving Language Understanding by Generative Pre-training" (GPT-1, 2018)

"Language Models are Unsupervised Multitask Learners" (GPT-2, 2019)