by pmoriarty
2432 days ago
There was an article[1] posted to HN recently about these benchmarks, and it was pretty skeptical. Regarding SuperGLUE specifically, it asked: "Indeed, Bowman and his collaborators recently introduced a test called SuperGLUE that's specifically designed to be hard for BERT-based systems. So far, no neural network can beat human performance on it. But even if (or when) it happens, does it mean that machines can really understand language any better than before? Or does it just mean that science has gotten better at teaching machines to the test?"
[1] - https://www.quantamagazine.org/machines-beat-humans-on-a-rea...
I don't think anyone in the field thinks that once we match human performance on benchmark X, we're officially done. It just means it's time for more interesting benchmarks.
Over time, if it becomes difficult to design benchmarks on which humans can outperform machines, that will prompt interesting conceptual work about what exactly the difference between human and machine language competency is. That in turn will lead either to more sophisticated benchmarks or, alternatively, to gradually more sophisticated and persuasive arguments that machines really have surpassed us in language competence.
I don't think we're yet at a point where we don't know how to make harder benchmarks, and if and when we do hit such a point, I'd definitely bet the result will be a conceptual advance in benchmark design rather than a declaration of machine superiority once and for all. At least for the first few rounds of this cycle.