by Bjorkbat
245 days ago
One of the more frustrating things for me about the possible AI bubble was seeing a very smart researcher on Twitter being incredibly bullish on AI because, if you extrapolate graphs measuring AI's ability to complete long-duration tasks (https://metr.org/blog/2025-03-19-measuring-ai-ability-to-com...) or other benchmarks, then by 2026 or 2027 you've basically invented AGI.

I'm going to take his statements at face value and assume that he really does have faith in his own predictions and isn't trying to fleece us. My gripe is that the prediction rests on proxies for capability that aren't particularly reliable. To elaborate, the latest frontier models score something like 65% on SWE-bench, but I don't think they're as capable as a human who also scored 65%. That isn't to say they're incapable, just that they aren't as capable as an equivalent human. I think there's a very real chance that a model absolutely crushes SWE-bench but still isn't quite ready to function as an independent software engineering agent.

So a lot of this bullishness basically hinges on the idea that if you extrapolate some line on a graph into the future, then by next year or the year after all white-collar work can be automated. Terrifying as that is, it all depends on these graphs, these benchmarks, being good proxies. And if they aren't, oh wow.