by 6gvONxR4sf7o
2435 days ago
One thing to always point out in these cases is that the human baseline isn't "how well people do at this task," as it's often hyped to be. It's "how well does a person do, on average, when doing this quickly and repetitively." The 'quickly and repetitively' part is important because we all make more boneheaded errors in that scenario. The 'on average' part is important because the errors the algo makes aren't just fewer than people's, they're different, and an average score hides that. The algos often still get certain things wrong that humans almost never would. This is really, really great, let's be clear. It's just not up to the hype that "omg super human" usually gets.
I have no idea where the real human baseline is, or how to find it.
Also, consider this discussion. GLUE winners may be able to make informed parsing guesses about single text blocks, but they're years away from being able to make a useful contribution to a discussion like this one.