by wavemode 589 days ago
> dramatically higher performance this year than last year, and were dramatically better last year than the year before

Yeah, but, better at _what_?

Cars are dramatically faster today than 100 years ago. But they still can't fly.

Similarly, LLMs performing better on synthetic benchmarks does not demonstrate that they will eventually become superintelligent beings that will replace humanity.

If you want to actually measure that, then these benchmarks need to start asking questions that demonstrate superintelligence: "Here is a corpus of all current research on nuclear physics, now engineer a hydrogen bomb." My guess is, we will not see much progress.

Humans could engineer a hydrogen bomb in the 1960s from publicly available research, and multiple AI models from unrelated firms could do it right this moment if you unlocked its censors.

Turning that into an agent that builds its own hydrogen bomb using what amounts to seized resources, covertly and faster than human agencies can notice, is a different sort of thing. But the middleware for that sort of agent-directed project is rapidly developing as well, and there is strong economic incentive for self-interested actors to pursue it. For a very brief moment in time, a huge amount of shareholder value will be created, and then suddenly destroyed.

A large-scale nuclear exchange is no longer the worst case scenario, in point of fact.

Assuming we don't hit those information-theoretic barriers, and we don't develop a host of new safeguards which nobody at the present time seems interested in developing.

> multiple AI models from unrelated firms could do it right this moment if you unlocked its censors

ok buddy

You believe I'm overestimating current AI. While my estimations are probably a bit higher than yours, mostly I think you're overestimating hydrogen bombs. They're not that complicated, and not that secret in 2024. These AI models have every unclassified scientific paper and every nonfiction book ever published on the subject at their disposal.

https://en.wikipedia.org/wiki/Thermonuclear_weapon

It's a mechanistic process documented in well-trodden, factual, verbose discourse: scientists reasoning about facts and presenting what they found. "Tell me a joke about elephant mating habits in the form of a rap song" is a dramatically more complex task of synthesis.