Hacker News
by fouc 2428 days ago
I agree. AI should realistically be able to detect rare, uncommon, or ambiguous usage as well, and be rated for that.

I suppose in some cases it could score better than humans on the SuperGLUE benchmark, but eventually it will have to come back down to a near-human score as it gets more accurate.

1 comment

Why? In many of those benchmarks the average human score is not 100, and the AI progression doesn't really have a ceiling or a slowdown at the human number. It should go through it and settle somewhere above. Plus, we create these tests with our own limitations. There may be a world of more complexity or subtlety that we all fail to grasp but the AI will.

I think humans are already behind at face recognition, for example.