by jakeinspace 1327 days ago
It’s very difficult to come up with an objective metric or benchmark for general AI that can’t be gamed or that won’t turn out to be disappointingly easy. It makes sense that most research would go toward tasks that are easily quantifiable. Hazier benchmarks like the Turing test are possibly better, but that one in particular isn’t much good unless it’s enhanced (I’d say a one-hour conversation with an AI posing as someone with a graduate-student-level understanding of a field of science or art I know something about would be adequate proof, but maybe I’m being naive).
1 comment

1. If it's too hard to come up with an objective benchmark for AGI, then we should rethink what we're doing when we talk about progress toward that goal.

2. You could also rework this list and say that an AI which can fulfill three tasks, even badly, is one step forward. An AI which can fulfill three tasks from three separate categories is another step, etc. But I think it's a categorical mistake to count progress in specific AI as progress in general AI; there's no very good reason to believe the two lie on a continuum.