by scriptsmith 935 days ago
Seems like he actually disagrees here:

"If you train a bigger model on more text, we have a lot of confidence that the next-word prediction task will improve. So algorithmic progress is not necessary, it's a very nice bonus, but we can sort of get more powerful models for free, because we can just get a bigger computer, which we can say with some confidence we're going to get, and just train a bigger model for longer, and we are very confident we are going to get a better result."

https://youtu.be/zjkBMFhNj_g?t=1543 (25:43)

1 comment

And then at 35 minutes he spends a few minutes talking about ideas for algorithmic improvements.