by sirk390 1528 days ago
Is outperforming GPT-3 still a good reference? It seems there are many models outperforming GPT-3 on the SuperGLUE benchmark: https://super.gluebenchmark.com/leaderboard/ GPT-3 is in position #21, with a score of 71.8%. The best model is at 91.2%. Note the human baseline at #6 with 89.8%.

Note that this isn't an apples-to-apples comparison. The GPT-3 entry is for a few-shot setting, where the model has not been trained for this particular task. When fine-tuned, GPT-3 would be expected to perform a lot better. Lastly, GPT-3 is currently operating on the text-002 models, and the 3rd version of GPT-3 is generally the one considered current. These benchmarks are for the original GPT-3 model.
It's a good reference because people are familiar with GPT-3. The paper mostly compares Chinchilla to LaMDA, Jurassic, Gopher, MT-NLG, and GPT-3. In the broader tech industry and even to a certain extent within the AI field, GPT-3 is the only one that most people know by name.
Aren’t most of the models at the top unsuitable for text generation? That’s what makes GPT different from BERT.
What are the models at the top used for? Excuse my ignorance.
Mostly mask fill, but Transformers can be fine-tuned for downstream tasks relatively easily (T5 was built for translation but is used for autocomplete in many cases).
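To make the mask-fill versus text-generation distinction concrete, here is a toy sketch in plain Python. It is not BERT or GPT: the tiny corpus, the bigram counts, and the function names `fill_mask` and `generate` are all made up for illustration. The only point is that a mask-fill model scores a blank using context on both sides, while a generative model extends text using only what is to its left.

```python
from collections import Counter

# Hypothetical toy corpus, standing in for real training data.
CORPUS = "the cat sat on the mat . the dog sat on the rug .".split()


def pair_counts(corpus):
    """Count how often each (left_word, right_word) pair occurs."""
    return Counter(zip(corpus, corpus[1:]))


def fill_mask(left, right, corpus):
    """Mask-fill style: pick the word that best fits BOTH neighbours."""
    pairs = pair_counts(corpus)
    vocab = sorted(set(corpus))
    # Score = (times seen after `left`) * (times seen before `right`).
    return max(vocab, key=lambda w: pairs[(left, w)] * pairs[(w, right)])


def generate(prompt, steps, corpus):
    """Generation style: append the likeliest next word, left context only."""
    pairs = pair_counts(corpus)
    vocab = sorted(set(corpus))
    words = prompt.split()
    for _ in range(steps):
        words.append(max(vocab, key=lambda w: pairs[(words[-1], w)]))
    return " ".join(words)


if __name__ == "__main__":
    print(fill_mask("cat", "on", CORPUS))    # fills "cat [MASK] on"
    print(generate("the dog", 3, CORPUS))    # continues "the dog ..."
```

Real models replace the bigram counts with a trained Transformer, but the split is the same: `fill_mask` can look at the word after the blank, `generate` cannot.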
Would you mind sharing some references (or even just googleable terms) for this process of fine-tuning?
> Is outperforming GPT-3 still a good reference?

It is if you outperform it with far fewer parameters.