by zone411
1169 days ago
You're making incorrect assumptions. This project wasn't about scaling any published approach. It was original neural net research that produced excellent results with a new architecture without self-attention, using a new optimizer, new regularization and augmentation ideas, sparsity, and some NLP feature engineering. Scaled up to GPT-2 size, it matched GPT-2's performance for English (my project was English-only and bidirectional, unlike GPT, so it's not a perfect comparison), and scaling it up to GPT-3 size would very likely have matched GPT-3 as well, since GPT-3 wasn't much of an improvement over GPT-2 besides scale. It's unclear for GPT-4, since very little is known about it. Of course, in the meantime, most of these ideas are no longer SOTA, and there has been a ton of progress in GPU hardware and in frameworks like PyTorch/TF.

You can check out my melodies project from a year ago as a current example. There is nothing matching it yet: https://www.youtube.com/playlist?list=PLoCzMRqh5SkFPG0-RIAR8.... And that's just my personal project.

What you're saying about companies recognizing the commercial potential is clearly wrong. It's six years later, and Siri, Alexa, and Google Home are still nearly as dumb as they were back then. Microsoft is only now working on adding a writing assistant to Word, and that's thanks to OpenAI. Why do you think Google had to declare a "code red" if they saw the potential? Low-budget startups are also very slow: they should've had their products out when the GPT-3 API was published, not now.

One thing I didn't expect is how well this same approach would work for code. I never even tried it.
|
And I'm sorry, but you're completely wrong about companies recognizing commercial potential. I worked on Alexa for five years; it is a far harder problem than you think. It is nowhere near as simple as "we just weren't looking at the right NN architecture or optimizer!" You're acting as if, in 2017, it was a novel idea that LMs would be extremely useful if their performance were better. I'm just trying to tell you that isn't the case.