| It depends on 1) the domain and 2) your comparison group. On 2), many software engineers and computer scientists compare these language models' logic and creative problem-solving abilities with their own and those of their peer group. But they are usually 1-2+ SD above the average human at these things. (Note: someone gave GPT-4 an IQ test and the result was 96, slightly below the reference human group's average of 100. The SD of an IQ test is 15 or 16.) For language-focused domains, there is evidence that GPT-4 is already better than most humans, e.g. it scores in the 99th percentile on GRE Verbal and beats humans at a fairly novel puzzle like Twofer Goofer, which is not in its training set. Ref: GPT-4 Beats Humans at Hard Rhyme-based Riddles https://twofergoofer.com/blog/gpt-4 Yes, GPT-4 is not an AGI yet, but the research paper (OP) has a point. |
How did you go from "human-level IQ with some super-human abilities" to "not an AGI"?