Hacker News
by YeGoblynQueenne 2430 days ago
To clarify, I meant this comment as an expression of skepticism: I don't believe that the SuperGLUE benchmark really evaluates language understanding, or that BERT and friends are within a few percent of human language understanding. I think SuperGLUE is just another benchmark that measures something other than what it's supposed to measure (as machine learning benchmarks usually do).

It seems that the teams trying to beat such benchmarks are aware of their weaknesses, though, so that's encouraging.