Hacker News
by YeGoblynQueenne 781 days ago
Addendum:

>> Do you realize how much more data and compute it would take to train a Vanilla RNN to say GPT-3 level performance?

Oh, good point. And what would GPT-3 do with the typical amount of data used to train an LSTM? Rhetorical.