by msamwald 2079 days ago
A shorter reply would be: It would be great to compare PET not only to GPT-3, but also to other models, especially ones geared towards few-shot learning.

Do you know of any other models that should be used for such a comparison, or are there already any relevant results on SuperGLUE that should be mentioned?

2 comments

This appears to be SOTA on SuperGLUE with few-shot learning.

PET (well, a version called iPET from the same author) is at #9 on the SuperGLUE leaderboard [1], and none of the models above it mention being evaluated in a few-shot setting.

1: https://super.gluebenchmark.com/leaderboard/

The results reported there are what most people would call ‘semi-supervised learning’, not ‘few-shot’. The true few-shot results appear in a few places in the paper (https://arxiv.org/abs/2009.07118), labeled ‘- dist’.
There are many BERT-based models that would have made for a good numeric comparison had they been tested in a few-shot setting, but I'm not aware of any that have.
Well, in Table 1 they compare to RoBERTa trained in a vanilla supervised fashion?