Hacker News
by melony 1602 days ago
The last 3 years were mostly dominated by transformer models. BERT and GPT-3 are both scaled-up variations on the transformer model. The architecture has turned out to be surprisingly good and long-lived.