Hacker News
by derivativethrow 2189 days ago
Have you read the GPT-3 paper? The authors are pretty forthright about how the GPT techniques likely won't scale much beyond 175B parameters.